Marketing automation for social media that scores threads before you reply, then measures the lift after.
The default story for marketing automation and social media is outbound. Queue 30 posts in a scheduler, fan them out across four channels, look at the dashboard on Friday. S4L runs the opposite loop. It treats the inbound feed as the work queue, scores every thread with a six-term virality formula, and persists a paired t0/t1 snapshot per row so you can read the delta your reply caused. Same product surface, inverted direction.
Same scoring loop, run per platform
The inversion in one sentence
A scheduler turns a calendar slot into a post. S4L turns an inbound thread into a scored row, and only the top-ranked rows ever become posts. Everything in this page sits on top of that inversion.
Outbound scheduler vs inbound scoring
You decide what to say first. The tool decides when. The tool's job is calendar plumbing and channel fan-out.
- Calendar slot is the trigger
- Same content blasted across channels
- Measurement = aggregate weekly numbers
- No per-thread context
How the loop is wired
Five inbound feeds funnel into one score function. The function decides which row is worth the next reply, which then becomes a posted comment and, hours later, a measured delta.
Scoring is the chokepoint
The actual formula
This is not an abstract description; it is the function that runs against every scraped tweet on the way into the candidates table. Six terms, each with an explicit cap or decay, all multiplicative.
Six signals, six different shapes
Each term is calibrated for a different failure mode. Velocity catches the breakout, the bonuses prevent over-rewarding raw size, and the decay deletes anything stale before it pollutes the queue.
Engagement velocity
Total engagements divided by age in hours. The strongest single predictor in the model. A tweet at 200 likes after 30 minutes outranks 600 likes after 8 hours.
Reply bonus, capped at 4x
min(replies / 15, 4.0). 15 replies = +1x, 30 = +2x, 60+ = +4x. Active discussions surface your reply higher.
Discussion ratio, capped at 1x
min((replies / likes) * 10, 1.0). A 0.1 reply-to-like ratio means real argument, not one-way broadcast.
Author reach multiplier
5K to 50K followers gets 1.0x, 50K to 200K gets 1.4x. Mega accounts cap at 1.1x because the comment competition gets brutal above 500K.
Retweet ratio bonus, up to 2x
1.0 + min(rt_ratio * 2, 1.0). When the audience is resharing, the post is still spreading. Your reply rides that distribution.
6-hour half-life decay
exp(-0.1155 * age_hours). 3h keeps 71% of the score, 6h keeps 50%, 12h falls to 25%, 18h to 12.5%. Anything older than 18h is dropped from the queue entirely.
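The six terms above can be assembled into a sketch of `calculate_virality_score`. The caps, the decay constant, and the 18-hour cutoff are taken from this page; the `rt_ratio` definition (retweets divided by likes) and the reach multipliers outside the 5K-200K and 500K+ bands are assumptions filled in for illustration.

```python
import math

def calculate_virality_score(likes, retweets, replies, followers, age_hours):
    """Sketch of the six-term multiplicative score described above."""
    if age_hours > 18:
        return None  # hard cutoff: stale rows never enter the table

    engagements = likes + retweets + replies
    velocity = engagements / max(age_hours, 0.1)            # engagement velocity
    reply_bonus = min(replies / 15, 4.0)                    # capped at 4x
    discussion_bonus = min((replies / max(likes, 1)) * 10, 1.0)  # capped at 1x
    rt_ratio = retweets / max(likes, 1)                     # assumed definition
    rt_bonus = 1.0 + min(rt_ratio * 2, 1.0)                 # up to 2x

    if followers >= 500_000:
        reach_mult = 1.1    # mega accounts: brutal comment competition
    elif 50_000 <= followers < 200_000:
        reach_mult = 1.4
    elif 5_000 <= followers < 50_000:
        reach_mult = 1.0
    else:
        reach_mult = 0.9    # assumed: bands the page does not specify

    age_decay = math.exp(-0.1155 * age_hours)               # ln(2)/6: 6h half-life

    score = (velocity * reach_mult * age_decay * rt_bonus
             * (1 + reply_bonus) * (1 + discussion_bonus))
    return round(score, 2)
```

With these numbers, 200 likes at 30 minutes beats 600 likes at 8 hours, exactly the breakout-versus-stale ordering the velocity term is meant to produce.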
Anchor fact
The age decay is math.exp(-0.1155 * age_hours), a six-hour half-life arrived at through production tuning rather than picked by hand.
The previous version used a 3-hour half-life, which deleted slow-burn threads before their second wave. The current value keeps a tweet at 71% after 3 hours, 50% after 6, 25% after 12, and 12.5% after 18. The hard cutoff at 18 hours is a separate filter: if age_hours > 18: skipped += 1; continue. That single line is why the candidates table never accumulates dead rows.
The decay table, in numbers
Every score the function returns is rounded to two decimals. These four numbers are what the decay term contributes at each age threshold; multiply them by the rest of the formula and you have the actual queue ranking.

| Age | Decay multiplier |
|---|---|
| 3h | 0.71 |
| 6h | 0.50 |
| 12h | 0.25 |
| 18h | 0.125 |
The 18-hour cutoff is enforced before the score is even computed. Anything older never enters the table, so a slow-burn thread that spikes at hour 17 still has one chance to be the top row.
How the system knows your reply did anything
The candidates table carries paired snapshot columns. likes_t0 is written at discovery and never touched again. likes_t1 is written by a separate sweep that runs hours later. delta_score is the difference. None of this is a derived analytics view; it is a row in the same table the scoring function writes to.
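A minimal sketch of the paired-snapshot pattern, reduced to the likes columns for brevity. The real table carries all five metric pairs, and the page does not give the exact delta_score formula, so the plain likes difference below is an assumption.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE twitter_candidates (
        thread_url  TEXT PRIMARY KEY,
        likes_t0    INTEGER,            -- written at discovery, never updated
        likes_t1    INTEGER,            -- written by the later sweep
        delta_score INTEGER,
        status      TEXT DEFAULT 'pending'
    )
""")

# t0: the discovery insert freezes the baseline
con.execute(
    "INSERT INTO twitter_candidates (thread_url, likes_t0) VALUES (?, ?)",
    ("https://x.com/i/status/1", 200),
)

# t1: hours later, the sweep re-fetches the tweet and closes the loop
con.execute("""
    UPDATE twitter_candidates
    SET likes_t1 = ?, delta_score = ? - likes_t0, status = 'measured'
    WHERE thread_url = ?
""", (260, 260, "https://x.com/i/status/1"))

row = con.execute(
    "SELECT delta_score, status FROM twitter_candidates"
).fetchone()
print(row)  # → (60, 'measured')
```

The point of the pattern is that the baseline and the outcome live on the same row, so the delta never needs a join against an analytics store.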
One candidate, end to end
Trace one tweet from the moment a scraper sees it to the moment its delta_score lands. Seven stops, none of them in a calendar.
Scrape returns raw thread JSON
scan_twitter_mentions_browser.py and find_threads.py emit candidate JSON from a logged-in CDP session. No scoring yet, just raw fields.
Score and dedup
score_twitter_candidates.py runs calculate_virality_score on each candidate, drops anything > 18h old, and skips URLs already present in the posts table.
Persist with t0 snapshot
Surviving rows are upserted into twitter_candidates with status='pending'. likes_t0 through bookmarks_t0 are written in the same insert and never updated again.
Pick a target by virality_score
pick_thread_target.py orders by virality_score DESC, applies project-fit filters, and emits one row per cycle. The agent picks up that row.
Engagement style is selected
engagement_styles.py chooses one of 7 styles based on platform + matched_project + topic. The chosen style is the prompt's spine.
Reply is composed and posted
twitter_browser.py drafts in the open chrome session, posts, and writes the thread URL into posts so the dedup set is updated for future scoring runs.
Hours later, a T1 sweep
A scheduled re-poll updates likes_t1 through bookmarks_t1 on the same row, computes delta_score, and the candidate is finally marked 'measured'.
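The seven stops above can be wired together as one cycle. Everything below is a stand-in: each function names the script that does the real work, and the canned data exists only so the control flow runs end to end.

```python
def scrape_inbound():
    # scan_twitter_mentions_browser.py / find_threads.py: raw thread JSON
    return [{"thread_url": "https://x.com/i/status/1", "virality_score": None}]

def dedup(candidates, posted_urls):
    # drop anything whose URL is already in the posts table
    return [c for c in candidates if c["thread_url"] not in posted_urls]

def score(candidate):
    # score_twitter_candidates.py would call calculate_virality_score here
    candidate["virality_score"] = 42.0
    return candidate

def run_cycle(posted_urls):
    candidates = [score(c) for c in dedup(scrape_inbound(), posted_urls)]
    if not candidates:
        return None  # nothing fresh this cycle; no agent action
    # pick_thread_target.py: ORDER BY virality_score DESC, one row per cycle
    target = max(candidates, key=lambda c: c["virality_score"])
    # twitter_browser.py replies here; a later sweep writes t1 and delta_score
    return target
```

Note where the expensive steps sit: everything up to `max()` is pure data plumbing, and only the returned target ever costs tokens or browser time.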
What the score gates
Drafting and posting are the expensive steps: tokens, browser time, and a finite reputation budget per platform. The score sits in front of all of them as a cheap pre-filter that runs on JSON the scraper already returned.
Score-first reply pipeline
Inbound scrape
Logged-in CDP returns thread JSON for the configured queries.
Dedup against posts
Any thread_url already in posts is dropped before scoring.
calculate_virality_score()
Six multiplicative terms; sub-threshold rows are insert-only with no agent action.
pick_thread_target.py
Top-ranked row matches a project, then becomes the next reply target.
Reply via browser MCP
twitter_browser.py drafts in the live session, posts, and writes thread_url into posts.
Re-poll for t1, compute delta
Same row gets likes_t1...bookmarks_t1 and delta_score updated. Loop closes.
Versus a normal scheduler
The contrast is not feature-by-feature; it is loop-by-loop. A scheduler optimizes the calendar. S4L optimizes which inbound row gets the next reply.
| Feature | Outbound scheduler | S4L score-first loop |
|---|---|---|
| Where the work starts | Empty calendar slot at 9am | Inbound thread that crossed a virality threshold in the last 6 hours |
| What gets scored | Past performance of your own posts | Other people's threads, ranked by velocity * reach * decay * bonuses |
| What the model rejects | Posts outside the brand voice template | Tweets older than 18h, threads you already replied to, sub-1000-follower authors with low velocity |
| Measurement primitive | Aggregate weekly impressions and CTR | Per-thread t0/t1 paired columns plus a delta_score per row |
| What 'success' means | Posts went out on schedule | delta_score is positive after your comment landed |
| Cadence safety | Rate limits inside the scheduler | Shared cooldown file at /tmp/linkedin_cooldown.json that cron checks before any action |
| Dedup | Manual content calendar review | SELECT thread_url FROM posts WHERE platform='twitter' is consulted before every insert |
Run the same loop on your own feeds
S4L is open source. The score function, the t0/t1 schema, the cooldown file, and the platform-specific browser scripts all live in one repo.
See S4L →
Why this is now possible at all
6h
half-life on the decay
Up from 3h. Slow-burn threads now survive long enough to reach the top of the queue if their numbers compound.
7
engagement styles
critic, storyteller, pattern_recognizer, curious_probe, contrarian, data_point_drop, snarky_oneliner. Picked after the score, never before.
5
paired t0/t1 columns
likes, retweets, replies, views, bookmarks. Same row, snapshot at discovery, snapshot after the reply settles.
“Drafting a thoughtful reply costs LLM tokens, browser session time, and a finite reputation budget per platform. The score is a cheap pre-filter that decides whether to spend any of that.”
from the scoring function's design notes
Questions worth answering
What does marketing automation for social media usually mean, and what does S4L do differently?
The standard meaning is outbound: you queue posts in a scheduler (Hootsuite, Sprinklr, Zoho, Make) and the tool fires them at chosen times across multiple channels. S4L runs the inverse pipeline. It scrapes inbound threads, scores each one with a virality formula in scripts/score_twitter_candidates.py, and only then does a generation step decide what to reply with. The scheduler picks a time; S4L picks a thread.
What is the actual scoring formula?
score = velocity * reach_mult * age_decay * rt_bonus * (1 + reply_bonus) * (1 + discussion_bonus). Velocity is total engagements divided by age in hours. age_decay = math.exp(-0.1155 * age_hours), which is a 6-hour half-life. reply_bonus caps at 4x at 60+ replies. discussion_bonus caps at 1x at a 0.1 reply-to-like ratio. reach_mult tops out at 1.4x for 50K to 200K follower accounts.
Why a 6-hour half-life specifically?
The earlier version used 3h, which deleted slow-burn threads from the queue before their second wave hit. 6h keeps a tweet at 50% of its peak score after 6 hours, 25% at 12h, and 12.5% at 18h. Threads older than 18h are filtered out entirely with `if age_hours > 18: skipped += 1; continue`. The softer decay lets a 'slow banger' beat a 'fresh dud'.
What are the t0 and t1 columns in twitter_candidates?
Paired snapshots. likes_t0 / retweets_t0 / replies_t0 / views_t0 / bookmarks_t0 are written when the candidate is first discovered. likes_t1 / retweets_t1 / replies_t1 / views_t1 / bookmarks_t1 are written hours later by a second pass that re-fetches the same tweet. delta_score is computed from the difference. This is how the system knows whether a reply lifted the thread or rode a corpse.
What stops the system from spamming a platform after a rate limit?
A shared cooldown file at /tmp/linkedin_cooldown.json. The cron job runs `python3 scripts/linkedin_cooldown.py check` first; exit 1 means in cooldown and the run aborts. Cooldown reasons are stored verbatim ('429 rate limit', 'account restricted'), and the resume_after timestamp is checked against the current UTC time on every read. The file is removed automatically once the timestamp passes.
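The check step can be sketched as a plain function. The script exposes the result as an exit code; here it is a bool. The file path and the resume_after field come from the page, while the exact JSON layout and the ISO-8601 timestamp format are assumptions.

```python
import json
import os
from datetime import datetime, timezone

def in_cooldown(path="/tmp/linkedin_cooldown.json"):
    """True while the cooldown file holds a future resume_after timestamp."""
    try:
        with open(path) as f:
            state = json.load(f)
    except FileNotFoundError:
        return False  # no file, no cooldown
    resume_after = datetime.fromisoformat(state["resume_after"])
    if datetime.now(timezone.utc) >= resume_after:
        os.remove(path)  # timestamp has passed: clear the file automatically
        return False
    return True  # still cooling down, e.g. reason='429 rate limit'
```

Because the file lives at a shared path, every cron-driven action on the platform sees the same state, which is what makes it a cross-process brake rather than a per-script one.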
How does dedup work, and why is it part of the scoring step?
Before any candidate is scored, the upserter pulls every thread_url from posts WHERE platform='twitter' AND thread_url IS NOT NULL into a Python set. Any candidate whose URL is already in that set is dropped. The scoring loop runs against fresh URLs only. This is why the system can re-scan the same query feed every hour without reposting on the same thread.
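The dedup step described above fits in a few lines. The table and column names are from the page; the connection handling and the candidate dict shape are sketch-level assumptions.

```python
import sqlite3

def fresh_candidates(con, candidates):
    """Drop any candidate whose thread_url is already in posts."""
    seen = {
        url for (url,) in con.execute(
            "SELECT thread_url FROM posts "
            "WHERE platform = 'twitter' AND thread_url IS NOT NULL"
        )
    }
    return [c for c in candidates if c["thread_url"] not in seen]

# usage against an in-memory stand-in for the real database
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE posts (platform TEXT, thread_url TEXT)")
con.execute("INSERT INTO posts VALUES ('twitter', 'https://x.com/i/status/1')")
kept = fresh_candidates(con, [
    {"thread_url": "https://x.com/i/status/1"},   # already replied: dropped
    {"thread_url": "https://x.com/i/status/2"},   # fresh: survives
])
```

Building the set once per run keeps the membership check O(1) per candidate, which is what lets the hourly re-scan stay cheap as the posts table grows.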
What happens to a candidate that scored well but never got engaged?
Two things, on different schedules. The age_decay term keeps shrinking the row's effective score until either (a) the agent's batch loop picks it up while it's still ranked highly, or (b) the 18-hour filter drops it. Meanwhile, an expire pass marks rows older than 12 hours as 'expired' and prunes posted/expired rows older than 7 days, so the candidates table never grows unbounded.
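The expire and prune passes might look like the sketch below. The statuses and the 12-hour/7-day thresholds are from the page; the `discovered_at` epoch column is an assumed schema detail.

```python
import sqlite3
import time

def expire_and_prune(con, now=None):
    """Two cleanup steps on different horizons, run in one pass here."""
    now = now if now is not None else time.time()
    # mark stale pending rows as expired after 12 hours
    con.execute(
        "UPDATE twitter_candidates SET status = 'expired' "
        "WHERE status = 'pending' AND ? - discovered_at > 12 * 3600", (now,))
    # prune posted/expired rows older than 7 days so the table stays bounded
    con.execute(
        "DELETE FROM twitter_candidates "
        "WHERE status IN ('posted', 'expired') "
        "AND ? - discovered_at > 7 * 86400", (now,))
```

Passing `now` explicitly makes the pass testable and idempotent: running it twice with the same clock changes nothing the second time.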
Why score before replying instead of just posting and measuring after?
Cost asymmetry. Drafting a thoughtful reply costs LLM tokens, browser session time, and a finite reputation budget on each platform. Spending those on a thread that has no audience is the opportunity-cost equivalent of broadcasting into an empty room. The score is a cheap pre-filter (pure math on JSON the scrape already returned) that decides whether to spend the expensive resources at all.
How does this interact with engagement styles?
Once a candidate clears the score threshold, a separate module (engagement_styles.py) picks one of seven named styles for the reply: critic, storyteller, pattern_recognizer, curious_probe, contrarian, data_point_drop, snarky_oneliner. Each style has a 'best_in' map per platform, so the matched_project plus the post topic plus the chosen style determine the prompt. Style is selected after scoring, never before.
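A sketch of how a `best_in` lookup could work. The seven style names are from the page; the map contents and the deterministic hash-based pick are invented for illustration.

```python
import hashlib

STYLES = ["critic", "storyteller", "pattern_recognizer", "curious_probe",
          "contrarian", "data_point_drop", "snarky_oneliner"]

BEST_IN = {  # hypothetical per-platform shortlists
    "twitter":  ["snarky_oneliner", "contrarian", "data_point_drop"],
    "linkedin": ["storyteller", "pattern_recognizer", "curious_probe"],
}

def pick_style(platform, matched_project, topic):
    """Hash the post context into the platform's shortlist so the same
    thread always gets the same style across retries."""
    shortlist = BEST_IN.get(platform, STYLES)
    digest = hashlib.sha256(f"{matched_project}:{topic}".encode()).digest()
    return shortlist[digest[0] % len(shortlist)]
```

Selecting the style after scoring means the expensive prompt construction only ever runs for threads that already cleared the threshold.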
Stop scheduling. Start scoring.
Marketing automation and social media stops being calendar plumbing the moment the inbound feed becomes the work queue. S4L is the open implementation of that idea.
Open S4L