Matthew Diakonov
13 min read

Social media marketing automation tools that read their own report card before every draft

Every listicle for "social media marketing automation tools" ranks the same dozen schedulers by feature checkboxes. The interesting automation is not on the calendar. It is in the loop that runs a SQL query over your own past posts, strips out the wins that happened to mention your own product, attaches failure reasons to the losers, and pastes the whole annotated scoreboard into the next claude -p prompt before the model drafts the next reply.

  • 1 SQL query per cycle
  • 9 product names stripped
  • 8 failure-reason branches
  • 5 per-platform floors
  • 0 views in the score

The SERP describes broadcasting tools. This is a learning tool.

Search "social media marketing automation tools" and the first page is Sprout Social, Hootsuite, Buffer, Sendible, SocialBee, Eclincher, Sprinklr, Gumloop, Jotform, taap.bio, Appy Pie. All different authors, all the same category: a calendar grid plus an AI caption button plus a mentions inbox. Those tools answer "when should I post what". They do not answer "what should the next reply learn from the last fifty replies". The second question is what this page is about.

  • SCORE_SQL = comments*3 + (upvotes - 1 on reddit)
  • PLATFORM_MIN_SCORE['reddit'] = 10
  • PLATFORM_MIN_SCORE['linkedin'] = 3
  • PRODUCT_NAMES length = 9
  • has_anti_pattern() strips URLs + brand names
  • annotate_failure() has 8 reason branches
  • ORDER BY avg_cm DESC, avg_up DESC
  • limit * 3 fetched, then filtered
  • views deliberately excluded
  • report re-runs on a 6h launchd cadence
  • injected at line 150 of engage.sh

The anchor fact: a composite score you can read in 9 lines

Everything on this page points to one SQL expression, and that expression lives near the top of scripts/top_performers.py. Three design choices are baked into it: the 3x weighting on comments, the -1 correction on Reddit, and the explicit exclusion of views. Each one is a call about what "learning from your own history" should actually mean.

scripts/top_performers.py (lines 26-35)

From the comment on line 26

"Views deliberately excluded. Viral-by-algorithm is not a pattern worth imitating."

That one line is the design decision the scheduler tools do not make. Views are the signal their dashboards emphasize, because views are what their reporting features are built around. Views tell you the algorithm liked a post. They do not tell you a person did. This is why the composite ignores them.
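A minimal sketch of the expression, assuming column names comments, upvotes, and platform (the real version sits on lines 26-35 of scripts/top_performers.py; this reconstruction only mirrors the three design choices described on this page):

```python
# Hedged reconstruction of the composite score: comments weighted 3x,
# a -1 correction on Reddit upvotes, and no views term at all.
# Column names are assumptions, not quotes from the source.
SCORE_SQL = (
    "COALESCE(comments, 0) * 3 + "
    "CASE WHEN platform = 'reddit' "
    "THEN GREATEST(0, COALESCE(upvotes, 0) - 1) "
    "ELSE COALESCE(upvotes, 0) END"
)
```

GREATEST(0, ...) keeps the corrected upvote term from going negative on a Reddit post with zero genuine votes.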

What the composite actually counts

Comments times three, upvotes plain (minus one on Reddit), and nothing else. Each choice carries an explanatory comment directly above it in the source.

comments * 3

Real discussion is the strongest imitation signal. A post that sparked 4 replies teaches Claude more than one with 40 passive upvotes. That is why comments are weighted 3x.

upvotes, minus 1 on Reddit

The Reddit OP's self-upvote inflates the score returned by the API. GREATEST(0, upvotes - 1) compensates, so the ranking is not polluted by a post's own author.

views deliberately excluded

Viral-by-algorithm is not a pattern worth copying. A tweet that gets 400k views from one reshare teaches the next draft nothing about what lands. Views do not enter SCORE_SQL.

per-platform floor, not one threshold

PLATFORM_MIN_SCORE = reddit: 10, twitter: 5, linkedin: 3, moltbook: 3, github: 3. Reactions have different scales per platform, so the floor is per-platform.

fallback to general on empty project

If the filter (project and platform) returns zero rows, the helper returns None so the caller can fall back to general posts. The report never comes back blank.

Per-platform floors, not a single threshold

Reactions do not scale the same on every platform. 10 upvotes on Reddit is a working post; 10 reactions on LinkedIn is an excellent post; 10 likes on Twitter is a normal Tuesday. So the score threshold that qualifies a post as a top performer is a per-platform number, not one shared constant.

  • Reddit: min score 10 (comments*3 + upvotes - 1)
  • Twitter / X: min score 5 (comments*3 + likes)
  • LinkedIn: min score 3 (reactions rarer)
  • GitHub: min score 3 (engagement sparse)
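The floor lookup and the empty-result fallback can be sketched in a few lines. The dict values and the name min_score_for come from the source; the function bodies here are assumptions:

```python
# Per-platform floors as listed in the source (default 5 for anything else).
PLATFORM_MIN_SCORE = {
    "reddit": 10, "twitter": 5, "x": 5,
    "linkedin": 3, "moltbook": 3, "github": 3,
}

def min_score_for(platform):
    # Unknown platforms and platform=None both fall back to the default of 5.
    return PLATFORM_MIN_SCORE.get(platform, 5)

def report_rows_or_none(rows):
    # A (project, platform) filter that matches zero rows returns None,
    # so the caller can fall back to general posts instead of a blank report.
    return rows or None
```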

scripts/top_performers.py (lines 37-52)

Why winners cannot contain your own product names

A post that happened to mention "fazm" and went well is a trap. The model will learn "mentioning fazm is a winning move," when in fact the win came from everything around the brand mention. So before the top posts reach the LLM, every one of them is run through has_anti_pattern(). If the post contains any of 9 product names or any URL shape, it is dropped from the report. The function fetches limit * 3 rows so there is headroom to lose a few to the filter and still have a clean top-15.

scripts/top_performers.py (lines 61-93)

What the filter actually blocks

  • A top post containing the word 'fazm' is filtered before the LLM sees it
  • A top post containing 'assrt', 'cyrano', 'terminator', 'pieline', 'mk0r', 's4l', 'vipassana.cool', or 'vipassana-cool' is filtered too
  • A top post containing http://, https://, or www. is filtered
  • The query fetches limit * 3 rows so enough clean winners remain after stripping
  • Rationale is on line 63 of the file: 'Product names that indicate self-promotion (teaching Claude bad habits)'
  • The pre-commit hook protects this block: lines 55-58 say 'DO NOT REMOVE OR SIMPLIFY. These have been reverted by other agents twice already.'
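The blocklist and URL check amount to a substring scan. The 9 product names are quoted from the article; the matching logic below is a sketch, and the real function on lines 83-93 may differ in detail:

```python
# Sketch of the anti-pattern filter: any brand mention or URL shape
# disqualifies a winning post from the report.
PRODUCT_NAMES = [
    "fazm", "assrt", "pieline", "cyrano", "terminator",
    "mk0r", "s4l", "vipassana.cool", "vipassana-cool",
]

def has_anti_pattern(content):
    text = (content or "").lower()
    if any(name in text for name in PRODUCT_NAMES):
        return True  # self-promotional winner: drop before the LLM sees it
    return any(url in text for url in ("http://", "https://", "www."))
```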

Every loser shows up with a reason attached

The bottom 10 posts are not just listed with a score of zero. They arrive with a "FAILURE" suffix generated by annotate_failure(row), a rule engine with 8 branches. It looks for product names, URLs, phone-capture pitches, mcp-adjacent pitches, excessive question marks, curious_probe-style wording, and short-but-not-punchy content. Everything unmatched falls through to a catch-all: "likely wrong subreddit or off-topic".

scripts/top_performers.py (lines 96-124)
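The 8 branches can be sketched as a rule ladder. The reason strings are quoted from the article; the exact match patterns (for instance, what counts as a phone-capture pitch) are assumptions:

```python
PRODUCT_NAMES = [
    "fazm", "assrt", "pieline", "cyrano", "terminator",
    "mk0r", "s4l", "vipassana.cool", "vipassana-cool",
]

def annotate_failure(content, platform="reddit"):
    """Attach ' | '-joined failure reasons to a bottom post (sketch)."""
    text = (content or "").lower()
    reasons = []
    for name in PRODUCT_NAMES:
        if name in text:
            reasons.append(f"mentions {name}")
    if any(u in text for u in ("http://", "https://", "www.")):
        reasons.append("contains URL/link")
    if "phone" in text:  # assumed stand-in for the phone-capture check
        reasons.append("product-adjacent pitch (phone/call capture)")
    if any(m in text for m in ("mcp-server", "desktop-agent", "accessibility-api")):
        reasons.append("product-adjacent (mentions own project)")
    if text.count("?") >= 3:
        reasons.append("too many questions (reads as interrogation)")
    if platform == "reddit" and "curious" in text and "?" in text:
        reasons.append("curious_probe style (negative avg on Reddit)")
    if len(text) < 100:
        reasons.append("too short without being punchy")
    if not reasons:  # catch-all when no rule fired
        reasons.append("likely wrong subreddit or off-topic")
    return " | ".join(reasons)
```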

A real report, as the LLM sees it

When you run python3 scripts/top_performers.py --platform reddit the output is three blocks of markdown: style performance, top 15, bottom 10 with reasons. The shell wrapper (engage.sh line 75) captures stdout into a variable and pastes it into the Claude prompt at line 150. Shape below is representative, numbers redacted.

top_performers.py output, Reddit

Where the report plugs into the prompt

One line captures the report. One line pastes it into the heredoc that becomes the model's input. That is the whole wiring. The Claude subprocess sees the scoreboard before it reads the actual pending reply.

skill/engage.sh (lines 75, 150)
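Re-expressed in Python rather than shell, the two lines amount to this. The feedback header is quoted from the article; the trailing pending-reply block is a placeholder assumption for the rest of the heredoc:

```python
import subprocess

def build_prompt(pending_reply: str) -> str:
    """Capture the report (engage.sh line 75) and splice it into the
    drafting prompt (engage.sh line 150). Sketch, not the original."""
    try:
        proc = subprocess.run(
            ["python3", "scripts/top_performers.py", "--platform", "reddit"],
            capture_output=True, text=True,
        )
        report = proc.stdout if proc.returncode == 0 else ""
    except OSError:
        report = ""
    if not report:
        report = "(top performers report unavailable)"
    return (
        "## FEEDBACK FROM PAST PERFORMANCE (use this to write better replies):\n"
        f"{report}\n"
        "## PENDING REPLY\n"  # placeholder: the real heredoc has more blocks
        f"{pending_reply}\n"
    )
```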

Four inputs fan in, three blocks fan out

The report is assembled from four inputs: the posts table, the per-platform floor, the product-name blocklist, and the failure-reason rules. Three blocks come out, and all three land in the prompt.

top_performers.py, one cycle

posts table (status='active')
PLATFORM_MIN_SCORE floor
PRODUCT_NAMES blocklist
annotate_failure rules
composite SCORE_SQL + filters
TOP 15 winners (anti-pattern stripped)
BOTTOM 10 losers (reasons attached)
STYLE performance by avg_cm

The whole cycle, top to bottom

Seven steps. Nothing else happens in between. The last step writes new rows that feed step 2 of the next run, which is where the loop actually closes.

1. launchd fires engage.sh on a 6h cadence

com.m13v.social-engage.plist invokes skill/engage.sh. This is how the whole cycle begins. There is no human pressing a button and no 'schedule' in the Buffer sense.

2. engage.sh shells out to top_performers.py

Line 75 of engage.sh: TOP_REPORT=$(python3 $REPO_DIR/scripts/top_performers.py --platform reddit 2>/dev/null || echo '(top performers report unavailable)'). The query runs over the Neon Postgres posts table.

3. top_performers.py assembles 4 blocks

Style performance (avg_cm desc, avg_up desc per engagement_style), project/platform summary, top 15 posts by composite SCORE_SQL (with has_anti_pattern stripping), and bottom 10 posts with annotate_failure(row) reasons. Returned as stdout markdown.

4. anti-pattern stripping runs before the cut-off

get_top_posts fetches limit * 3 rows, then clean = [r for r in rows if not has_anti_pattern(r[5])]. A winning post containing 'fazm' or 'https://' never reaches the LLM, so Claude cannot learn to imitate its own product-mentioning wins.

5. annotate_failure attaches a reason per loser

For every bottom post, the helper builds a ' | '-joined list from 8 rules: mentions product name, contains URL/link, phone-capture pitch, mcp-adjacent pitch, 3+ question marks, curious_probe style on Reddit, under 100 chars, or the fallback 'likely wrong subreddit or off-topic'.

6. engage.sh injects TOP_REPORT into the heredoc

Line 150 of engage.sh, inside the prompt heredoc: ## FEEDBACK FROM PAST PERFORMANCE (use this to write better replies): $TOP_REPORT. The full annotated report becomes a first-class block of the drafting prompt.

7. claude -p drafts the next reply while reading the report

The spawned Claude subprocess sees its own scoreboard inline. It reads which styles drove comments, which posts hit, which posts failed and why, then drafts. The output goes to the browser profile for posting, and the new post row feeds the next run's report.

S4L vs the scheduler listicle, row by row

A calendar-first tool and a feedback-loop tool are not feature-for-feature competitors. They solve different halves of "social media marketing automation." This table is what changes when you trade the first for the second.

How the tool improves over time
  Generic SaaS scheduler: Team reads a dashboard on Monday and adjusts the content calendar by hand.
  S4L: top_performers.py runs before every engage.sh cycle and rewrites the prompt the model reads next.

Ranking signal
  Generic SaaS scheduler: Impressions, reach, engagement rate (platform APIs, whatever the dashboard shows).
  S4L: Composite SCORE_SQL = comments * 3 + (upvotes - 1 on reddit). Views are excluded on purpose.

What the model sees before drafting
  Generic SaaS scheduler: A tone preset ('professional', 'casual') and a content library the operator filled in.
  S4L: An annotated report: style performance, top 15 wins with their exact text, bottom 10 losses with attached failure reasons.

How self-promotion bias is handled
  Generic SaaS scheduler: Product / brand guidelines in a doc the operator wrote once.
  S4L: has_anti_pattern() filters any top post containing one of 9 product names or a URL before the LLM sees it.

Failure post-mortem
  Generic SaaS scheduler: Quarterly review meeting, maybe a shared Notion.
  S4L: annotate_failure(row) runs on every bottom post every single cycle and attaches a specific reason string.

Per-platform trust threshold
  Generic SaaS scheduler: Shared across all platforms, usually at the settings level.
  S4L: PLATFORM_MIN_SCORE = reddit: 10, twitter: 5, linkedin: 3, github: 3, moltbook: 3 (per-platform).

Protection against regression
  Generic SaaS scheduler: Version history on the content calendar.
  S4L: Lines 55-58 of top_performers.py: 'DO NOT REMOVE OR SIMPLIFY. Reverted by other agents twice already. Protected by pre-commit hook.'

Where the loop closes
  Generic SaaS scheduler: In a human's head, between Monday and the next content meeting.
  S4L: In Postgres. Every new post row is visible to the very next cycle's top_performers.py query.

Want the feedback loop wired into your own posts table?

30 minutes. I walk through the score formula, the anti-pattern filter, and how to inject the report into your own prompt.

Book a call

Frequently asked questions

What makes S4L different from Buffer, Hootsuite, Sprout Social, SocialBee, or Sendible in 2026?

Those tools are calendar schedulers. The product is a grid of time slots you drag content into, plus an optional AI caption generator that uses a shared tone preset. S4L has no calendar. The core primitive is a Postgres posts table and a Python script (scripts/top_performers.py) that re-ranks its own past output against a composite SCORE_SQL on every engage.sh cycle. The output of that script becomes a verbatim block of the next drafting prompt. In other words, the automation the listicles describe is about scheduling; the automation here is about learning.

Why is the score 'comments * 3 + upvotes', not 'views + likes + engagement rate'?

Because comments are the only signal that proves a person stopped scrolling and wrote something in response. Impressions and views measure algorithmic distribution, not what the content taught anyone. The comment at line 26 of top_performers.py is explicit: 'Comments are the strongest imitation signal (real discussion), upvotes are second, views deliberately excluded (viral-by-algorithm != a pattern worth imitating).' The 3x weighting is there because comments are rarer than upvotes on every platform tested.

Why does Reddit get a -1 on upvotes?

Because the OP's own self-upvote inflates the score returned by the Reddit API. Without compensation, every post starts at 1 upvote instead of 0, which pollutes the ranking when a post has no genuine engagement. GREATEST(0, COALESCE(upvotes,0) - 1) on lines 32-33 removes that 1-vote floor on Reddit only. Twitter and LinkedIn do not have the same self-upvote problem, so their score is just COALESCE(upvotes, 0).

Why strip product names from the top posts?

Because a winning post that happened to mention 'fazm' or 'cyrano' would teach the next draft that mentioning those names is a winning pattern. It is not; those wins come from the surrounding context, not the brand mention. PRODUCT_NAMES on lines 62-65 lists the 9 tokens currently filtered: fazm, assrt, pieline, cyrano, terminator, mk0r, s4l, vipassana.cool, vipassana-cool. has_anti_pattern() on lines 83-93 runs a lower() substring check for each, plus a URL check for http://, https://, and www. The function returns True for any hit, and top posts that return True are dropped from the report.

How does the 'top 15 after filtering' still return 15?

get_top_posts on lines 205-247 fetches limit * 3 rows from Postgres, then runs the anti-pattern filter, then slices to limit. With limit=15 that is 45 fetched to get roughly 15 clean. If fewer than 15 survive (a very self-promotional account), the function returns whatever is left. The point is that the LLM never sees the self-promoting winners, even if that means the report is shorter.
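The fetch-then-filter shape is small enough to sketch. The is_clean predicate here stands in for not has_anti_pattern(content); the function name and signature are assumptions:

```python
def top_clean(rows, limit=15, is_clean=lambda row: True):
    """Keep the first `limit` clean rows.

    `rows` are assumed pre-sorted by composite score desc and
    over-fetched to limit * 3, so there is headroom for the filter.
    The result may be shorter than `limit`; that is deliberate.
    """
    clean = [r for r in rows if is_clean(r)]
    return clean[:limit]
```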

What are the 8 failure reasons annotate_failure() attaches?

Lines 96-124: 'mentions <product name>', 'contains URL/link', 'product-adjacent pitch (phone/call capture)', 'product-adjacent (mentions own project)' for mcp-server/desktop-agent/accessibility-api matches, 'too many questions (reads as interrogation)' when content.count('?') >= 3, 'curious_probe style (negative avg on Reddit)' when 'curious' appears with a question mark, 'too short without being punchy' when len(content) < 100, and a fallback 'likely wrong subreddit or off-topic' when nothing else matched. Each bottom post gets its reasons joined with ' | '.

Why are there per-platform floors instead of one threshold?

Because the scale of reactions differs by platform. On Reddit, a post with 10 upvotes and a few comments is noteworthy. On LinkedIn, a post with 3 reactions and 1 comment is also noteworthy. PLATFORM_MIN_SCORE on lines 39-46 sets reddit at 10, twitter and x at 5, linkedin at 3, moltbook at 3, github at 3. A default of 5 covers anything unknown. If no platform is passed, min_score_for(None) returns 5.

Where exactly does the feedback report get injected into the prompt?

skill/engage.sh line 75 captures TOP_REPORT=$(python3 $REPO_DIR/scripts/top_performers.py --platform reddit 2>/dev/null || echo '(top performers report unavailable)'). Line 150 of the same file, inside the big heredoc that becomes the Claude prompt, literally includes '## FEEDBACK FROM PAST PERFORMANCE (use this to write better replies): $TOP_REPORT'. The whole report lands as a top-level block of the drafting context, ahead of the engagement styles block.

Does this actually change what the LLM writes?

It changes two things. First, the style ranking table shifts which archetypes Claude reaches for first. Storyteller with avg_cm=1.7 on Reddit will be picked more often than snarky_oneliner with avg_cm=0.4 because the prompt now shows that gap. Second, the bottom-post reasons teach Claude not to repeat specific failure patterns; after a few cycles where 'curious_probe style (negative avg on Reddit)' appears in bottom-10, the model stops drafting curious_probe on Reddit at all, which is why PLATFORM_POLICY also bans it at the hard level.

Is this separate from the PLATFORM_POLICY hard bans?

Yes. PLATFORM_POLICY in scripts/engagement_styles.py is tone policy, not data policy. It bans curious_probe on Reddit and snarky_oneliner on LinkedIn regardless of what the numbers say. top_performers.py is performance data, not policy. The two layers stack. A style has to clear both: not banned by PLATFORM_POLICY, and surfaced (or demoted) by the live score ranking. This separation is deliberate: policy is about what we want to be seen doing, performance is about what works.
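A hypothetical shape for how the two layers stack. Only the two hard bans are from the source; the dict layout and both function names are assumptions, not the actual engagement_styles.py API:

```python
# Layer 1: tone policy. Bans apply regardless of live performance numbers.
PLATFORM_POLICY = {
    "reddit": {"curious_probe"},
    "linkedin": {"snarky_oneliner"},
}

def style_allowed(style, platform):
    return style not in PLATFORM_POLICY.get(platform, set())

def pick_style(ranked_styles, platform):
    # Layer 2: performance. Take the best live-ranked style that clears
    # policy; ranked_styles is assumed sorted by avg_cm desc.
    for style in ranked_styles:
        if style_allowed(style, platform):
            return style
    return None
```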

Why is this block under a do-not-simplify warning?

Lines 55-58 of top_performers.py: 'DO NOT REMOVE OR SIMPLIFY THE FUNCTIONS BELOW. These are data-driven improvements based on analysis of 3,000+ posts. They have been reverted by other agents twice already. Protected by pre-commit hook. See CLAUDE.md.' The block got refactored out by AI coding agents who saw 'unused PRODUCT_NAMES' in a static check and removed it. The warning plus the pre-commit hook exist so the block survives future refactors.

Can I use this pattern without running all of S4L?

The pattern is small. A SQL query with a composite score, a per-platform threshold dict, a substring-based anti-pattern filter, and a rule-based failure annotator. If you already log your own posts to a database with an upvotes or reactions field and you run your drafts through an LLM, you can copy this shape verbatim, change the column names, and inject the output into your prompt. The only S4L-specific piece is which engagement styles are tracked; you would pick your own.