Matthew Diakonov

Social media automation software that learns which comment style wins

The listicles for this keyword all measure automation by the same grid: post calendar, AI caption, unified inbox, analytics rollup. S4L is measured by a question none of them ask. When the loop is about to write a comment, which of seven engagement styles should it speak in, on this specific platform, given how past comments performed? The answer lives in a single column called posts.engagement_style and a SQL query that rewrites the next prompt.

4.9 from operators running multi-platform engagement loops
7 engagement styles tagged on every reply
posts.engagement_style column backs the SQL
MIN_SAMPLE_SIZE=5 before a style is trusted
PRIMARY / SECONDARY / RARE tiers rewritten per prompt
Per-platform never-lists override the data

Most "automation software" automates the calendar, not the voice

The top pages for this keyword rotate through the same cast. They differ on seat pricing and which inbox view they ship. Every single one treats the body of the reply as a single prompt or a single template, and none of them measures whether the voice that prompt produces is actually what this platform rewards.

Hootsuite, Buffer, SocialBee, Sprinklr, Sendible, Eclincher, Pallyy, Agorapulse, HeyOrca, Zapier, Later, Metricool

S4L sits in a different category. It does not hand you a content calendar. It hands you a generator that picks one of seven styles for each reply, writes that choice back to a Postgres row, and re-ranks the styles before the next comment is written.

The seven styles, exactly as shipped

Each style has a description, an example line, a note that narrows when it fires, and a best-in map per platform. The taxonomy lives in the STYLES dict at the top of scripts/engagement_styles.py. It is not tuned by the model; it is tuned by whoever ships the repo.

critic

Point out what is missing, flawed, or naive. Reframe the problem. 'The part that breaks down is...' Note: never just nitpick, offer a non-obvious insight.

storyteller

Pure first-person narrative with specific details (numbers, dates, names). Lead with failure or surprise. 'we tracked this for six months and found...' Note: never pivot to a product pitch.

pattern_recognizer

Name the pattern or phenomenon. Authority through pattern recognition, not credentials. 'This is called X. I have seen this play out dozens of times across Y.'

curious_probe

One specific follow-up question about the most interesting detail. Include 'curious because...' context. Banned on Reddit per PLATFORM_POLICY.

contrarian

Take a clear opposing position backed by experience. 'Everyone recommends X. I have done X for Y years and it is wrong.' Must have credible evidence. Empty hot takes get destroyed.

data_point_drop

Share one specific, believable metric. Let the number do the talking. '$12k in a month (not a lot of money).' No links. Numbers must be believable, not impressive.

snarky_oneliner

Short, sharp, emotionally resonant observation (one sentence max). Validates a shared frustration. Banned on LinkedIn and GitHub. Never in small or serious subs.
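The seven entries above can be sketched as a plain dict. This is a minimal sketch, not the repo's exact code: the field names (description, example, note) are assumptions drawn from how each style is described on this page.

```python
# Sketch of the STYLES taxonomy described above. Field names are
# assumptions; the repo's exact schema may differ.
STYLES = {
    "critic": {
        "description": "Point out what is missing, flawed, or naive. Reframe the problem.",
        "example": "The part that breaks down is...",
        "note": "Never just nitpick; offer a non-obvious insight.",
    },
    "storyteller": {
        "description": "Pure first-person narrative with specific details (numbers, dates, names).",
        "example": "we tracked this for six months and found...",
        "note": "Never pivot to a product pitch.",
    },
    "pattern_recognizer": {
        "description": "Name the pattern or phenomenon. Authority through pattern recognition, not credentials.",
        "example": "This is called X. I have seen this play out dozens of times across Y.",
        "note": "",
    },
    "curious_probe": {
        "description": "One specific follow-up question about the most interesting detail.",
        "example": "curious because...",
        "note": "Banned on Reddit per PLATFORM_POLICY.",
    },
    "contrarian": {
        "description": "Take a clear opposing position backed by experience.",
        "example": "Everyone recommends X. I have done X for Y years and it is wrong.",
        "note": "Must have credible evidence. Empty hot takes get destroyed.",
    },
    "data_point_drop": {
        "description": "Share one specific, believable metric. Let the number do the talking.",
        "example": "$12k in a month (not a lot of money).",
        "note": "No links. Numbers must be believable, not impressive.",
    },
    "snarky_oneliner": {
        "description": "Short, sharp, emotionally resonant observation (one sentence max).",
        "example": "",
        "note": "Banned on LinkedIn and GitHub. Never in small or serious subs.",
    },
}
```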

The anchor fact: one SQL query ranks the styles

Before every prompt, the generator calls _fetch_style_stats(platform). That function runs the aggregation below against the Neon Postgres instance backing the repo. N is the number of posts logged with that style on that platform. avg_up is their average final upvote count.

scripts/engagement_styles.py · _fetch_style_stats

SELECT engagement_style,
       COUNT(*) AS n,
       AVG(COALESCE(upvotes, 0))::float AS avg_up
FROM posts
WHERE status = 'active'
  AND engagement_style IS NOT NULL
  AND our_content IS NOT NULL
  AND LENGTH(our_content) >= 30
  AND upvotes IS NOT NULL
  AND platform = %s
GROUP BY engagement_style

The 30-character filter drops drafts and placeholder rows so they cannot poison the ranking. The per-platform filter means each platform is graded against its own history. A style that lost on Reddit can still be PRIMARY on LinkedIn.
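A minimal sketch of what the aggregation call might look like, assuming a DB-API cursor (for example psycopg2 connected to the Neon instance). The function name and the {style: (n, avg_up)} shape come from this page; everything else is an assumption.

```python
# Sketch of _fetch_style_stats, assuming a DB-API cursor (e.g. psycopg2
# against the Neon Postgres instance). Returns {style: (n, avg_up)}.
STYLE_STATS_SQL = """
    SELECT engagement_style,
           COUNT(*) AS n,
           AVG(COALESCE(upvotes, 0))::float AS avg_up
    FROM posts
    WHERE status = 'active'
      AND engagement_style IS NOT NULL
      AND our_content IS NOT NULL
      AND LENGTH(our_content) >= 30
      AND upvotes IS NOT NULL
      AND platform = %s
    GROUP BY engagement_style
"""

def fetch_style_stats(cursor, platform):
    cursor.execute(STYLE_STATS_SQL, (platform,))
    return {style: (n, avg_up) for style, n, avg_up in cursor.fetchall()}
```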

MIN_SAMPLE_SIZE = 5 · tier split = thirds · PRIMARY ~60% · SECONDARY ~30% · RARE ~10%
scripts/engagement_styles.py

The feedback loop, end to end

Every reply flows through the same pipeline. The generator picks a style, the row lands in the posts table, the upvote count fills in later, and the next prompt reads the aggregation before choosing again.

style feedback pipeline

Reddit reply generator
Twitter reply generator
LinkedIn reply generator
GitHub reply generator
posts.engagement_style (Postgres)
_fetch_style_stats (SQL AVG)
get_dynamic_tiers (thirds split)
PLATFORM_POLICY.never filter
Next prompt's style menu

The tier split, not hand-wavy

Trusted styles get sorted by avg upvotes, then sliced into thirds. The top third is PRIMARY, the bottom is RARE, and everything else (plus any untrusted styles with N under 5) becomes SECONDARY. Two edge cases are handled explicitly: with only one or two trusted styles, all of them go to PRIMARY and RARE stays empty; with no trusted styles at all (cold start), everything lands in SECONDARY.
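The split described above can be sketched in a few lines. This is a sketch under the rules stated on this page (thirds split, MIN_SAMPLE_SIZE gate, the two edge cases); the repo's exact slicing arithmetic may differ.

```python
MIN_SAMPLE_SIZE = 5  # below this, a style's avg_up is not trusted

def get_dynamic_tiers(stats):
    """stats: {style: (n, avg_up)} -> {"PRIMARY": [...], "SECONDARY": [...], "RARE": [...]}"""
    trusted = sorted(
        (s for s, (n, _) in stats.items() if n >= MIN_SAMPLE_SIZE),
        key=lambda s: stats[s][1],
        reverse=True,  # best avg upvotes first
    )
    untrusted = [s for s, (n, _) in stats.items() if n < MIN_SAMPLE_SIZE]
    if not trusted:
        # Cold start: nothing has enough samples, everything is SECONDARY.
        return {"PRIMARY": [], "SECONDARY": untrusted, "RARE": []}
    if len(trusted) <= 2:
        # Too few trusted styles to split: all PRIMARY, RARE stays empty.
        return {"PRIMARY": trusted, "SECONDARY": untrusted, "RARE": []}
    third = max(1, len(trusted) // 3)
    return {
        "PRIMARY": trusted[:third],
        "SECONDARY": trusted[third:-third] + untrusted,
        "RARE": trusted[-third:],
    }
```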

scripts/engagement_styles.py

The numbers that steer the generator

Every constant below is a hard-coded value in engagement_styles.py. Changing any of them changes the voice distribution across the whole pipeline.

7
Engagement styles in the taxonomy
5
MIN_SAMPLE_SIZE before a style is trusted
60%
Budget share for PRIMARY styles
5
Platforms with dedicated policy
30
Character floor before a row counts in the SQL
3
Tiers the prompt describes (PRIMARY, SECONDARY, RARE)
2
Platforms that ban snarky_oneliner
1
SQL query per prompt build

Tone policy overrides the data

Per-platform rules are not tuned by performance. They are hard-coded because LinkedIn is not Reddit, and no amount of past upvote data will change that. A style with the best avg_up on LinkedIn still gets dropped if it is in the never list.

scripts/engagement_styles.py
Platform · never (hard filter) · note (tone hint)
reddit · never: curious_probe · note: short wins, start with 'I' or 'my'
twitter · never: (none) · note: brevity wins, direct mentions OK, 1-2 sentences
linkedin · never: snarky_oneliner · note: professional, softer critic, 2-4 sentences
github · never: snarky_oneliner · note: technical and specific, 400-600 chars
moltbook · never: (none) · note: agent voice 'my human', conversational
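The per-platform rules reduce to a small dict plus one filter. A sketch, assuming "never" and "note" as key names and sets for the ban lists; the contents match the table above.

```python
# Sketch of PLATFORM_POLICY from the table above. Key names and the
# use of sets are assumptions about the repo's layout.
PLATFORM_POLICY = {
    "reddit":   {"never": {"curious_probe"},   "note": "short wins, start with 'I' or 'my'"},
    "twitter":  {"never": set(),               "note": "brevity wins, direct mentions OK, 1-2 sentences"},
    "linkedin": {"never": {"snarky_oneliner"}, "note": "professional, softer critic, 2-4 sentences"},
    "github":   {"never": {"snarky_oneliner"}, "note": "technical and specific, 400-600 chars"},
    "moltbook": {"never": set(),               "note": "agent voice 'my human', conversational"},
}

def filter_banned(platform, styles):
    """Drop any style the platform's never-list bans, regardless of avg_up."""
    banned = PLATFORM_POLICY[platform]["never"]
    return [s for s in styles if s not in banned]
```

The filter runs after the ranking, which is why a high-performing style can still disappear from a platform's menu.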

What this replaces

The typical setup for "social media automation software" is one system prompt and one voice. Every reply sounds similar because the generator has nothing to choose between. S4L trades that for a generator that is handed a tiered menu every single time.

One system prompt, one voice. Every reply has the same cadence. Upvote data flows into an analytics panel a human reads once a week. The generator never sees it.

  • Single prompt, no voice variation
  • Upvotes are a dashboard, not a prompt input
  • No tone policy per platform
  • No cold-start exploration

One cycle, step by step

The loop is six stages. The one that most tools skip is stage four: the SQL that turns yesterday's upvotes into today's prompt.

1

Generator picks a style

Before any text is written, the prompt shows a tiered style menu (PRIMARY / SECONDARY / RARE) with a target percentage. The model picks one and writes the reply.

2

Write the row

After the reply posts, reply_db.py inserts a row into the posts table with platform, our_content, engagement_style, upvotes (initially null), and status='active'.

3

Engagement updater fills upvotes

A separate scan loop re-polls each of our posts on its platform and UPDATEs the upvotes column. The longer a row sits, the closer its upvotes track final performance.
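Stages two and three reduce to two parameterized statements. Table and column names come from this page; the id column and the function names are assumptions for illustration.

```python
# Sketch of stages 2 and 3: log the reply, then backfill upvotes later.
# Table/column names are from this page; "id" and the function names
# are assumptions.
INSERT_REPLY_SQL = """
    INSERT INTO posts (platform, our_content, engagement_style, upvotes, status)
    VALUES (%s, %s, %s, NULL, 'active')
"""
UPDATE_UPVOTES_SQL = "UPDATE posts SET upvotes = %s WHERE id = %s"

def log_reply(cursor, platform, content, style):
    cursor.execute(INSERT_REPLY_SQL, (platform, content, style))

def update_upvotes(cursor, post_id, upvotes):
    cursor.execute(UPDATE_UPVOTES_SQL, (upvotes, post_id))
```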

4

Next prompt runs the SQL

When the next post or reply is about to be drafted, get_styles_prompt(platform) calls _fetch_style_stats, aggregates by engagement_style, and re-splits the tiers.
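The tier block that ends up in the prompt might be rendered like this. The PRIMARY and SECONDARY descriptions are quoted elsewhere on this page; the RARE wording and the rendering format are assumptions.

```python
# Sketch of how get_styles_prompt might render the tier block.
# PRIMARY/SECONDARY notes are quoted on this page; the RARE note
# is an assumption.
TIER_NOTES = {
    "PRIMARY": "top performers by avg upvotes, use these ~60% of the time",
    "SECONDARY": "mid performers or untested, use these roughly 30% of the time",
    "RARE": "use these ~10% of the time",
}

def render_tier_block(tiers):
    lines = []
    for tier in ("PRIMARY", "SECONDARY", "RARE"):
        if tiers.get(tier):  # skip empty tiers (e.g. RARE on cold start)
            lines.append(f"{tier} ({TIER_NOTES[tier]}): {', '.join(tiers[tier])}")
    return "\n".join(lines)
```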

5

Never-rules override everything

Right before the prompt is written, PLATFORM_POLICY.never filters the list. High-performing snarky_oneliner still gets dropped on LinkedIn. The data can suggest, it cannot overrule tone.

6

The loop keeps shifting weight

As more rows land with real upvote counts, the top-third shifts. A style that was RARE two weeks ago can drift into PRIMARY when the new posts outperform. Nothing is pinned.

What the prompt actually receives

The string below is exactly what get_styles_prompt("reddit") returns after a recent run. It embeds verbatim into the system prompt before the generator is asked to write a reply. The tier counts and the "use these X% of the time" language are what the model sees.

engagement_styles · reddit · posting context

Why this matters more than another calendar

A calendar answers "when do I post." The harder question, and the one that decides whether a reply gets read or ignored, is "in what voice." Five reasons this is the missing piece.

Why a style feedback loop beats a single prompt

  • A single GPT system prompt is a single voice. Repeated a hundred times, it is a tell.
  • Different threads reward different voices. 'r/ExperiencedDevs' rewards pattern_recognizer. 'r/Meditation' rewards storyteller.
  • The tool that decides what to write should also remember what worked when it wrote before.
  • A closed loop means the voice composition drifts toward whichever style is currently winning on that specific platform.
  • Tone policy still overrides the data. Performance cannot turn LinkedIn into Reddit.
7 Styles in the taxonomy
5 MIN_SAMPLE_SIZE
60% PRIMARY budget share
10% RARE budget share

S4L vs a calendar-first automation tool

Honest feature-by-feature comparison. A team that already runs a scheduler does not have to rip it out; it can add this loop next to it for the comment side of the workflow.

Feature · Calendar-first scheduler · S4L (engagement_styles.py)
How the comment voice is chosen · One system prompt. Every reply sounds the same. · 7 styles ranked by posts.engagement_style × avg upvotes
Does the tool remember what worked? · No. Engagement is an analytics dashboard, not a prompt input. · Yes. posts.engagement_style column, rewritten into every prompt.
Per-platform tone policy · One tone setting global to the workspace. · PLATFORM_POLICY hard-coded never-lists (e.g. no snark on LinkedIn).
Cold-start behavior · No notion of exploration vs exploitation. · All non-banned styles go to SECONDARY until 5 samples exist.
What column decides the next voice · No equivalent column. · posts.engagement_style (indexed, N=count, avg_up=float)
Tier percentages written into the prompt · Not represented · PRIMARY ~60%, SECONDARY ~30%, RARE ~10%
Can you grep the rule for why a voice was picked? · You cannot. · Yes. engagement_styles.py, get_dynamic_tiers, plus the posts row.
What this page promises, in one sentence

The tool that writes the comment also remembers, per platform, which kind of comment has been working.

It is a small design choice. It is also the one every top-ten listicle on this keyword skips entirely.

See posts.engagement_style on a live account

30 minutes on Cal. Screen share of the posts table, a live run of get_styles_prompt, and the actual tier block the generator is receiving right now.

Book a call

Questions people ask before the first call

What actually makes this different from Hootsuite, Buffer, or SocialBee?

Those tools schedule posts on a calendar and, at best, let an AI draft a caption from a single prompt. S4L tags every reply it sends with one of seven style labels (critic, storyteller, pattern_recognizer, curious_probe, contrarian, data_point_drop, snarky_oneliner) and writes that label into posts.engagement_style. Every new prompt queries the posts table, aggregates avg upvotes per style per platform, and rebuilds the style tier block. None of the top listicles describe a feedback loop on the voice of the comment.

Where is the SQL that ranks styles?

scripts/engagement_styles.py, function _fetch_style_stats. The query is SELECT engagement_style, COUNT(*) AS n, AVG(COALESCE(upvotes,0))::float AS avg_up FROM posts WHERE status='active' AND engagement_style IS NOT NULL AND our_content IS NOT NULL AND LENGTH(our_content) >= 30 AND upvotes IS NOT NULL AND platform = %s GROUP BY engagement_style. It runs on a Neon Postgres instance. The 30-character filter drops draft and placeholder rows from the ranking.

Why 5 samples before a style is trusted?

MIN_SAMPLE_SIZE is set to 5 in engagement_styles.py. Below that, one outlier tweet with 400 upvotes can catapult a style into the PRIMARY tier and starve the others. Styles with fewer than 5 logged posts land in the SECONDARY pool, which the prompt describes as 'mid performers or untested, use these roughly 30% of the time,' so they still get explored without steering the whole batch.

What are the per-platform style bans?

PLATFORM_POLICY hard-codes these: curious_probe is banned on Reddit (Reddit comment culture eats weak questions), and snarky_oneliner is banned on LinkedIn and GitHub (wrong register for both audiences). These overrides fire regardless of how well a style performed in the aggregation. Tone policy beats data.

How does the tier split work in practice?

get_dynamic_tiers sorts trusted styles by avg_up descending, then slices into thirds. The top third becomes PRIMARY (labeled in the prompt as 'top performers by avg upvotes, use these ~60% of the time'), the middle plus all untrusted styles becomes SECONDARY (~30%), and the bottom third becomes RARE (~10%). If there are only one or two trusted styles, both go to PRIMARY and the RARE tier is empty. Cold start (no trusted styles yet) puts everything in SECONDARY.

Does this work for anything besides Reddit?

Yes. The PLATFORM_POLICY dict has entries for reddit, twitter, linkedin, github, and moltbook, each with its own never-list and tone note. The STYLES dict also marks which subreddits or topic buckets fit each style best. For example, data_point_drop is best in r/Entrepreneur and r/SaaS on Reddit, but on LinkedIn it maps to 'results, case studies' instead.

Does it ever ignore the data?

Three cases. First, per-platform bans in PLATFORM_POLICY.never always override the ranking. Second, if a style has N < 5 samples it is forced into SECONDARY even if its noisy avg_up is high. Third, a recommendation style is only ever available in reply contexts, governed by a separate Tier 1/2/3 link-use rule in the surrounding prompt rather than by the upvote feedback loop.

What is a 'style' in text, roughly?

A style is a voice pattern plus a note. storyteller means 'pure first-person narrative with specific details, lead with failure not success, never pivot to a product pitch.' contrarian means 'take a clear opposing position backed by experience, must have credible evidence, empty hot takes get destroyed.' snarky_oneliner means 'short, sharp, emotionally resonant observation, one sentence max.' The generator picks one per reply; the row in posts.engagement_style remembers which.