The best social media automation tools on the RecurPost listicles all ship the same thing. S4L ships the one piece they don't.

Every alternative on a RecurPost roundup is a scheduler. They queue content you wrote, rotate it on a cadence, and call that automation. None of them read their own engagement back before writing the next post. S4L does, and the whole thing lives in one file: scripts/top_performers.py.

Matthew Diakonov
9 min read
4.8 from operators running S4L on live accounts

  • Feedback report generated from Neon Postgres before every reply draft
  • Comments weighted 3x more than upvotes in the scoring query
  • Every bottom performer annotated with a specific FAILURE REASON label

What RecurPost-style tools automate, in one sentence

You sit down once, fill a library with posts you want circulating, assign them to queue categories, pick a cadence per category, and the tool rotates them into the calendar forever. The best ones add AI captions, an RSS feed importer, and bulk scheduling from a CSV. The unit of work is the post you wrote. The job of the tool is the rotation.

That mental model is fine when the content is yours and evergreen. It breaks down the moment the content is a reply to a stranger. Nobody can prewrite a library of replies to threads that don't exist yet. And nothing in the RecurPost category scores the replies it has already written to decide what to do differently on the next one.

The piece S4L ships that none of them do

The piece is one Python file, scripts/top_performers.py, and the spine of it is a SQL expression defined at lines 30 to 35.

scripts/top_performers.py

Comments are tripled because a reply takes effort. Upvotes get counted at face value except on Reddit, where the OP's own self-upvote gets subtracted so a post sitting at score=1 with zero engagement correctly reads as zero. Views are nowhere in the formula, because viral-by-algorithm is not a pattern worth copying. Every downstream query (get_top_posts, get_bottom_posts, get_style_performance) reuses this expression as a single source of truth.
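The expression itself can be sketched from the article's own description. This is a reconstruction, not the verbatim lines 30-35 of scripts/top_performers.py, and the column names are assumptions; the pure-Python mirror is just for sanity-checking the weighting.

```python
# Reconstruction of the composite score described above; column names assumed.
SCORE_SQL = """
COALESCE(comments_count, 0) * 3
+ CASE
    WHEN platform = 'reddit' THEN GREATEST(0, COALESCE(upvotes, 0) - 1)
    ELSE COALESCE(upvotes, 0)
  END
"""

def composite_score(comments, upvotes, platform):
    """Pure-Python mirror of the SQL above, handy for testing the weighting."""
    c = comments or 0
    u = upvotes or 0
    # Reddit API scores include the OP's own self-upvote; strip it.
    adjusted = max(0, u - 1) if platform == "reddit" else u
    return c * 3 + adjusted
```

A Reddit post at upvotes=1 with zero comments scores 0 under this mirror, which is the behavior the article calls out.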

Floors: reddit=10 · twitter=5 · linkedin=3 · moltbook=3 · github=3 — comments x 3 — upvotes -1 (Reddit) — views excluded — SCORE_SQL = 0 -> bottom — has_anti_pattern filter

What makes this feedback loop uncopyable

Composite score, not raw upvotes

Comments weight 3x because a reply is real discussion, an upvote is a click. Views excluded entirely; viral-by-algorithm is not a pattern worth copying.

Reddit upvotes -1

GREATEST(0, upvotes - 1) subtracts the OP's own self-upvote so a post sitting at score=1 with zero replies correctly reads as zero engagement.

Per-platform floors

reddit=10, twitter=5, linkedin=3, moltbook=3, github=3. LinkedIn reactions are scarce, Reddit upvotes inflate, so the same composite gets graded differently.
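As a sketch, the floors amount to a small lookup (the article puts the real dict at lines 39-46; the variable name and gate function here are assumptions):

```python
# Per-platform minimum composite score for a post to count as a top performer.
# Values are from the article; the dict/function names are assumptions.
PLATFORM_MIN_SCORE = {
    "reddit": 10,
    "twitter": 5,
    "linkedin": 3,
    "moltbook": 3,
    "github": 3,
}

def qualifies_as_top(score, platform):
    # A post only becomes a teachable example if it clears its platform's floor.
    return score >= PLATFORM_MIN_SCORE.get(platform, 5)
```

A LinkedIn comment with 3 reactions and 1 reply (composite 6) clears its floor; the same raw numbers on Reddit do not.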

FAILURE REASON on every bottom post

annotate_failure() tags the exact pattern that killed each post: 'contains URL/link', 'curious_probe on Reddit', 'mentions own project', 'too many questions'. The LLM reads the tag and avoids the pattern.

3,000 posts of history

The protected comment block at lines 54-59 cites the sample size: 'data-driven improvements based on analysis of 3,000+ posts'. The revert count, also documented, is 2.

Feedback report pipes into the draft

top_performers.py is called by every platform's run-*.sh before the browser MCP even spins up. The output becomes the system prompt for the Claude subprocess that writes the next comment.

Where the report ends up (hint: the LLM's system prompt)

The scoring query is only useful because it feeds a specific process. Every per-platform cron calls top_performers.py before a browser even opens, then glues the output into the context of a child Claude subprocess that writes the reply.

Where SCORE_SQL ends up on every cron tick

posts table + engagement_style stats + project weights -> top_performers.py -> Reddit reply / Twitter reply / LinkedIn reply

What happens on every single cron tick

1. launchd fires the per-platform cron

Example: com.m13v.social-twitter-cycle wakes up every six hours. It runs skill/run-twitter-cycle.sh.

2. run-*.sh builds context

The shell wrapper sources .env, acquires a per-platform lock, picks a project with the weight-deficit algorithm in pick_project.py, then calls scripts/top_performers.py --platform twitter --project <picked>.

3. top_performers.py runs SCORE_SQL

The composite score query pulls the top 5 posts above the platform floor and the bottom 5 posts sitting at zero engagement. annotate_failure() stamps each bottom post with the likely failure pattern.

4. The report becomes the system prompt

run-*.sh spawns a child `claude -p --strict-mcp-config --mcp-config twitter-agent-mcp.json` with the report and the style tier report glued into the instructions. No scheduler in the RecurPost category does this step.

5. Claude drafts, posts, logs back to Postgres

The comment is written knowing which patterns earned comments and which ones got stamped with a FAILURE REASON last cycle. Twelve hours later, scan_*.py reads engagement back into the same posts table and the loop closes.
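Steps 2 through 4 can be rendered in Python for illustration. The real wrapper is shell, and the prompt wording below is an assumption; only the claude flags come from the article.

```python
import subprocess
import sys

def build_cycle_command(platform, project):
    """Illustrative sketch of the run-*.sh glue; not the real wrapper."""
    # Step 3: generate the feedback report before any browser starts.
    report = subprocess.run(
        [sys.executable, "scripts/top_performers.py",
         "--platform", platform, "--project", project],
        capture_output=True, text=True,
    ).stdout
    # Step 4: glue the report into the child Claude invocation.
    prompt = (f"Feedback report from SCORE_SQL:\n{report}\n"
              f"Draft the next {platform} reply.")
    return ["claude", "-p", prompt,
            "--strict-mcp-config",
            "--mcp-config", f"{platform}-agent-mcp.json"]
```

The point of the sketch is the ordering: the report is generated and baked into the prompt before the MCP browser config is even referenced.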

The bottom-post annotator (the part nobody copies because it's opinionated)

Most scoring systems stop at ranking the winners. This one spends as much code on the losers. Every bottom post runs through annotate_failure which tags the specific pattern that likely killed it, so the LLM reads a do-not-copy set with reasons, not just text.

scripts/top_performers.py
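A minimal sketch of the annotator, reconstructed from the rules the article lists (the real function at lines 96-124 has more rules, including phone-pitch and macOS-adjacency phrases; the ordering and exact strings here are assumptions):

```python
import re

# Product names the article cites; matching one marks a post as self-promo.
PRODUCT_NAMES = ["fazm", "assrt", "pieline", "cyrano",
                 "terminator", "mk0r", "s4l", "vipassana.cool"]

def annotate_failure(text, style=None, platform=None):
    """Tag a bottom post with the pattern that likely killed it. Subset
    of the rules described in the article; labels match its examples."""
    low = text.lower()
    if any(name in low for name in PRODUCT_NAMES):
        return "mentions own project"
    if re.search(r"https?://|www\.", low):
        return "contains URL/link"
    if text.count("?") >= 3:
        return "too many questions"
    if style == "curious" and "?" in text and platform == "reddit":
        return "curious_probe on Reddit"
    if len(text) < 100:
        return "too short"
    return "likely wrong subreddit or off-topic"
```

Each bottom post carries exactly one label, so the LLM's do-not-copy set reads as pattern-plus-reason rather than a pile of bad text.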

The failure-gate you might miss on first read

A subtle but load-bearing detail lives in get_bottom_posts. The failure threshold is SCORE_SQL = 0, not upvotes < 1. The older, naive filter would have missed every Reddit post sitting at upvotes=1 with no comments, because the OP self-upvote clears that check. Switching to the composite closes the loop.

scripts/top_performers.py
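The gate can be sketched as a query that filters on the composite expression itself (table and column names are assumptions; the composite is reconstructed from the article's description):

```python
# Reconstruction of the composite expression; column names assumed.
SCORE_SQL = (
    "COALESCE(comments_count, 0) * 3 + CASE WHEN platform = 'reddit' "
    "THEN GREATEST(0, COALESCE(upvotes, 0) - 1) ELSE COALESCE(upvotes, 0) END"
)

# The load-bearing detail: the WHERE clause tests the composite, not upvotes.
BOTTOM_POSTS_QUERY = f"""
SELECT content, platform, upvotes, comments_count
FROM posts
WHERE ({SCORE_SQL}) = 0  -- composite zero, not 'upvotes < 1'
ORDER BY posted_at DESC
LIMIT 5
"""
```

With `upvotes < 1`, a Reddit post at upvotes=1 and no comments would slip past the filter; with the composite, it lands in the bottom set where it belongs.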

What the report actually looks like by the time it hits Claude

Here is a redacted sample of the report that top_performers.py --platform reddit --project S4L produces right before a reply draft. Style stats come from the live posts table, the top posts are ranked by SCORE_SQL, and the bottom posts are stamped with their FAILURE REASON.

feedback_report.txt
3x — Comments weight over upvotes
10 — Reddit min-score floor
3 — LinkedIn min-score floor
3,000+ — Posts the heuristics were tuned against
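Since the sample itself is redacted here, this sketch shows the shape such a report could take when assembled (section headers, field names, and layout are all assumptions, not the script's real output format):

```python
def format_report(top_posts, bottom_posts, style_stats):
    """Illustrative report assembly: style stats, then top performers,
    then bottom performers stamped with their FAILURE REASON."""
    lines = ["=== STYLE PERFORMANCE ==="]
    for style, avg in style_stats:
        lines.append(f"{style}: avg composite {avg:.1f}")
    lines.append("=== TOP PERFORMERS (copy these patterns) ===")
    for p in top_posts:
        lines.append(f"[{p['platform']} score={p['score']}] {p['text'][:120]}")
    lines.append("=== BOTTOM PERFORMERS (do NOT copy) ===")
    for p in bottom_posts:
        lines.append(f"[FAILURE REASON: {p['reason']}] {p['text'][:120]}")
    return "\n".join(lines)
```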

S4L vs a RecurPost-style scheduler on the parts that matter

What does 'best post' mean?
  RecurPost-style scheduler: Whatever you manually marked to recycle in an evergreen library.
  S4L: Whatever ranked highest on SCORE_SQL last time top_performers.py ran.

Score formula
  RecurPost-style scheduler: None. Posts rotate on a fixed schedule regardless of performance.
  S4L: comments x 3 + upvotes, with Reddit upvotes -1 to strip the OP self-upvote.

Platform-aware thresholds
  RecurPost-style scheduler: Same rules for Twitter, LinkedIn, Reddit. The library does not care.
  S4L: reddit=10, twitter=5, linkedin=3. Different reaction floors per platform.

Bottom performers
  RecurPost-style scheduler: Marked 'inactive' in the library and forgotten. No reason recorded.
  S4L: Fed to the LLM with an annotate_failure label so the next draft steers around the same failure mode.

Self-promotion filter
  RecurPost-style scheduler: None. The scheduler will keep recycling a promo-heavy caption forever.
  S4L: has_anti_pattern() strips top posts containing product names or URLs before they become examples.

Revert protection
  RecurPost-style scheduler: Not applicable.
  S4L: Pre-commit hook plus a DO NOT REMOVE comment block because two agents already tried to simplify it.
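The self-promotion filter row is worth a sketch of its own. This is a guess at what a function like has_anti_pattern() could look like, built only from the behavior the article describes; the real implementation may differ.

```python
import re

# Product names the article cites as triggering the self-promo filter.
PRODUCT_NAMES = ["fazm", "assrt", "pieline", "cyrano",
                 "terminator", "mk0r", "s4l", "vipassana.cool"]

def has_anti_pattern(text):
    """True when a post should be excluded from the 'good example' set,
    even if it scored well: product mentions and links are self-promotion."""
    low = text.lower()
    has_link = bool(re.search(r"https?://|www\.", low))
    return has_link or any(name in low for name in PRODUCT_NAMES)
```

The asymmetry is the point: a viral post that names a product gets dropped from the example set, while a quiet post that earned two genuine replies stays in.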

The revert history is in the source file

One odd thing you notice reading scripts/top_performers.py is that the composite-score block is wrapped in a six-line do-not-touch comment. The reason is simple: both the SCORE_SQL expression and the annotate_failure rules have already been reverted by two different agents who decided they looked over-engineered. The current file carries the revert count, the tuning sample size, and a pointer to the pre-commit hook that now blocks further simplification.

scripts/top_performers.py
2x — Reverted by other agents twice already. Protected by pre-commit hook.
(scripts/top_performers.py line 58)

Try it yourself, with a real posts table

If you already have social-autoposter installed and at least a week of posts in Neon, run the script against your own data. The top-five table by composite score tends to look nothing like the top-five table by raw upvotes.

A real run on a populated posts table

What a feedback loop buys you over a scheduler

What shifts when the tool reads its own output

  • The style distribution drifts toward what actually earns replies, not what you personally like writing.
  • Low-floor platforms (LinkedIn, GitHub) stop starving the feedback loop because their min-score threshold is set lower.
  • A post that mentions a product name or a link gets filtered out of the 'good example' set even if it went viral, so the next draft does not copy the bad habit.
  • The bottom quartile becomes a labeled training set, not a spreadsheet of shame.
  • Style tiers re-rank on every run, so a style that cools off gets demoted without you editing a config file.
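The last bullet, the per-style re-ranking, can be sketched as one more query over the same composite. The article names get_style_performance but does not show it, so this SQL is an assumption built on the reconstructed expression:

```python
# Reconstruction of the composite expression; column names assumed.
SCORE_SQL = (
    "COALESCE(comments_count, 0) * 3 + CASE WHEN platform = 'reddit' "
    "THEN GREATEST(0, COALESCE(upvotes, 0) - 1) ELSE COALESCE(upvotes, 0) END"
)

# Hypothetical per-style ranking: every run re-grades each engagement_style
# by its average composite, so a cooling style demotes itself automatically.
STYLE_PERFORMANCE_QUERY = f"""
SELECT engagement_style,
       COUNT(*) AS n,
       AVG({SCORE_SQL}) AS avg_score
FROM posts
WHERE platform = %(platform)s
GROUP BY engagement_style
ORDER BY avg_score DESC
"""
```

Because the ranking is recomputed from the posts table on every cron tick, no config file ever has to be edited to demote a style.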

If you keep scoring by upvotes, here is what you miss

The simpler filter ORDER BY upvotes DESC correlates well with dopamine and poorly with discussion. Over 3,000 posts, the style that won on raw upvotes was not the style that won on comments, and comments are the signal that maps to actual people choosing to engage back. If your reply pipeline tunes on upvotes alone, you drift toward emotionally resonant one-liners and away from the longer, specific posts that drag people into conversation. Chase the wrong metric long enough and the tool quietly becomes a karma farm for you and a notification machine for the people who see it.

Want to see your own posts table graded by SCORE_SQL?

Hop on a call. I'll run top_performers.py against your Neon instance and walk through what the report is telling you to change in the next draft.

Book a call

Frequently asked questions

What exactly is the scoring formula S4L uses?

SCORE_SQL in scripts/top_performers.py (lines 30-35) evaluates to COALESCE(comments_count,0) * 3 + (for Reddit) GREATEST(0, upvotes - 1) or (for every other platform) COALESCE(upvotes,0). Comments are weighted three times higher than upvotes because a reply takes real effort and is the strongest signal that a post earned attention, while an upvote is one click. Reddit gets upvotes minus 1 because the Reddit API returns a post score that includes the OP's own self-upvote, so a comment that drew zero engagement still reads as score=1 unless you subtract it. Views are deliberately excluded from the formula; a comment that goes viral because of an algorithm bump is not a pattern worth imitating.

Why are the per-platform min-score thresholds different?

PLATFORM_MIN_SCORE at lines 39-46 sets Reddit=10, Twitter/X=5, LinkedIn=3, Moltbook=3, GitHub=3. LinkedIn reactions are rarer than Reddit upvotes, so holding LinkedIn to the Reddit bar would mean almost no LinkedIn post ever qualifies as a 'top performer' and the feedback loop starves. Tuning the floor per platform is what lets a LinkedIn comment that pulled 3 reactions and 1 reply still count as a teachable top performer while a Reddit comment at the same raw number gets correctly filtered as noise.

How is this actually different from RecurPost or a RecurPost alternative?

RecurPost and every tool on its listicles schedules content you wrote. The unit is the post in the library, and the product's job is to rotate it onto the calendar. S4L's unit is the reply to somebody else's thread. Before writing each new reply it runs SCORE_SQL over the last month of posts in a Neon Postgres table and feeds the top performers plus the annotated failures to Claude as system context. A scheduler does not need to read its own engagement data because its job is not to decide what to say next; S4L's job is exactly that.

What does the FAILURE REASON annotator actually tag?

annotate_failure() at lines 96-124 applies seven rules to every bottom-quartile post: mentions of product names in PRODUCT_NAMES (fazm, assrt, pieline, cyrano, terminator, mk0r, s4l, vipassana.cool), presence of http:// or https:// or www., phone-call pitch phrases like 'missed call' or 'answering service', macOS-app adjacency phrases like 'accessibility api' or 'mcp server', having three or more question marks, any comment tagged 'curious' with a question mark (the curious_probe style has a negative Reddit average), and raw length under 100 characters. If none match, the reason is stamped as 'likely wrong subreddit or off-topic'. The LLM then reads these labels and the corresponding text as a do-not-copy set.

Why is the function block marked 'DO NOT REMOVE OR SIMPLIFY'?

Lines 54-59 carry a six-line comment block warning agents not to touch the SCORE_SQL logic or the annotate_failure rules, with the note 'reverted by other agents twice already. Protected by pre-commit hook. See CLAUDE.md.' Multiple agents, including earlier Claude sessions, have tried to simplify the composite score back to raw upvotes or rewrite annotate_failure to be less opinionated. The pre-commit hook now blocks the diff when those specific lines change. The warning comment is the in-band explanation for the hook.

Does the feedback loop actually reach the draft, or does it just sit in a log?

It reaches the draft. Each per-platform shell wrapper (skill/run-reddit-search.sh, skill/run-twitter-cycle.sh, skill/run-linkedin.sh) calls top_performers.py and interpolates the result into the prompt that launches a child `claude -p --strict-mcp-config --mcp-config <platform>-agent-mcp.json` process. The child sees the feedback report, the style tier report, and the llms.txt of the target product in its initial context. The Playwright MCP browser does not even start until after that prompt is assembled.

Can I use this with a tool from a RecurPost alternatives list like Buffer, Later, or Sprout Social?

Only as two separate layers. Those tools still solve scheduling: queue an evergreen post, recycle it on a cadence. S4L lives one layer above: for conversational replies, where 'what to say' is the hard part, not 'when to post'. The Postgres posts table and the feedback report do not need Buffer or RecurPost to run; they just need a script that posts and a script that scans engagement 12 hours later and writes it back. The two categories do not compete so much as occupy different rungs of the automation stack.