
Social media marketing automation where the tone is a live bandit, not a template.

Every guide on the first page of Google covers the same primitives: schedule a post, draft a caption, suggest hashtags, auto-reply to DMs. S4L automates the thing none of them touch. Before every comment, it runs a Postgres query against the posts table, ranks 7 named tones by live average upvotes on that specific platform, and splits the trusted tones into thirds so the top third fires about 60% of the time and the bottom third about 10%.

Matthew Diakonov · 11 min read · rated 4.9 from 47
7 named comment tones, re-tiered every draft
MIN_SAMPLE_SIZE = 5 before any tone is 'trusted'
PLATFORM_POLICY hard-bans curious_probe on Reddit, snarky_oneliner on LinkedIn
Pleaser/validator is an explicit anti-style in the prompt

The 7 tones the bandit chooses from

critic · storyteller · pattern_recognizer · curious_probe · contrarian · data_point_drop · snarky_oneliner · recommendation (reply-only)

Defined in scripts/engagement_styles.py as the STYLES dict. The drafting prompt is conditioned on the chosen tone, not on a generic "write a comment". Reply pipelines get one extra voice, "recommendation", governed by a tier-independent link strategy capped at 20% of replies.
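The page names the file but does not reproduce it, so here is a minimal sketch of the dict shape, assuming the four fields listed on the tone cards further down this page. The two entries shown use illustrative copy, not the shipped text, and the exact key names in the real STYLES dict are assumptions:

```python
# Sketch of the STYLES dict shape described above. The four fields
# (description, example, best_in, safety) mirror the tone cards on
# this page; entry text here is illustrative, not the shipped copy.
STYLES = {
    "critic": {
        "description": "Point out what is missing, flawed, or naive; reframe the problem.",
        "example": "Everyone here is optimizing onboarding; nobody is measuring week-4 silence.",
        "best_in": {"reddit": ["r/Entrepreneur", "r/smallbusiness", "r/startups"]},
        "safety": "Never just nitpick; offer a non-obvious insight.",
    },
    "snarky_oneliner": {
        "description": "Short, sharp, emotionally resonant observation.",
        "example": "Ah yes, the classic 'we'll fix it in the rewrite' lifecycle.",
        "best_in": {"reddit": ["large subs only"]},
        "safety": "One sentence max. Never on small or serious subs.",
    },
    # ...the other 5 tones follow the same shape
}

VALID_STYLES = list(STYLES)                       # the 7 comment tones
REPLY_STYLES = VALID_STYLES + ["recommendation"]  # replies get one extra voice
```

The drafting prompt is then conditioned on one key of this dict rather than on a generic "write a comment" instruction.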

What every other automation guide misses

Read the top 10 Google results for this keyword and you will get a consistent mental model: automation is about repetition. You have a content calendar, you schedule the posts once, AI helps you write the captions, chatbots cover the DMs, analytics tells you what worked. The automation sits around a human-drafted calendar and stretches it.

The problem that mental model ignores: when you actually engage in comments and threads, the copy you write is the product. Scheduling the right post on the wrong day costs you a slot; writing in the wrong voice costs you the reader. None of the tools on the first SERP page model voice selection as a control loop. They model it as a template library.

S4L treats voice as a bandit arm. Seven named tones, a fresh Postgres read before every draft, a hard minimum sample size before any tone is trusted, and a tone-policy override that can beat performance data. That loop is the angle of this page, and it lives in a single file: scripts/engagement_styles.py.

The anchor fact: get_dynamic_tiers()

This function is the bandit. It reads candidates, filters by platform policy, splits trusted tones into thirds, and hands the three lists (dominant, secondary, rare) to the drafter. Usage guidance is in the prompt itself: dominant ~60%, secondary ~30%, rare ~10%.

scripts/engagement_styles.py

The block that matters is the third-split at the bottom: top third becomes dominant, bottom third becomes rare, middle plus all untrusted tones form secondary. With only 1 or 2 trusted tones the cut degenerates safely so dominant holds everything trusted and rare is empty.
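A minimal sketch of that cut, assuming the trusted tones arrive pre-sorted by avg_upvotes (best first); `split_tiers` is a hypothetical name standing in for the tail of the real get_dynamic_tiers:

```python
from typing import Iterable

MIN_SAMPLE_SIZE = 5  # a tone needs >= 5 scored posts before its average is trusted

def split_tiers(trusted_sorted: list[str], explore: Iterable[str]) -> dict[str, list[str]]:
    """Sketch of the third-split described above. `trusted_sorted` is
    already ordered by avg_upvotes descending; `explore` holds tones
    below MIN_SAMPLE_SIZE. Not the real get_dynamic_tiers, just its cut."""
    n = len(trusted_sorted)
    if n <= 2:
        # Degenerate case: everything trusted goes dominant, rare stays empty.
        return {"dominant": trusted_sorted, "secondary": list(explore), "rare": []}
    third = max(1, n // 3)
    return {
        "dominant": trusted_sorted[:third],                            # target ~60% usage
        "secondary": trusted_sorted[third:n - third] + list(explore),  # ~30%
        "rare": trusted_sorted[n - third:],                            # ~10%
    }
```

With six trusted tones the cut yields a 2/2/2 split, and untrusted tones always land in secondary so they keep getting drafted.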

7 named engagement tones
5 min samples before a tone is trusted
3 tiers: dominant, secondary, rare
5 platforms each with its own tier table

The live Postgres read that feeds the bandit

Competitors treat performance as a reporting artifact (you read the dashboard, you change the templates). S4L treats it as an input to the next draft. _fetch_style_stats runs this SELECT every time get_dynamic_tiers is called, which means every time a comment or reply is drafted.

scripts/engagement_styles.py

The filter is worth reading line by line. status='active' excludes deleted and removed rows. LENGTH(our_content) >= 30 drops test drafts and low-effort replies that would otherwise skew the average. upvotes IS NOT NULL lets posts too new to have been scored skip the aggregation rather than count as zeros.
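A sketch of that query as it might look in Python; the table and column names (posts, platform, status, engagement_style, our_content, upvotes) come from this page, while the psycopg-style `%(name)s` placeholders and the function name are assumptions about the stack:

```python
# Sketch of the per-draft SELECT described above. Not the literal
# _fetch_style_stats; column names come from this page, the driver
# conventions are assumed.
STYLE_STATS_SQL = """
    SELECT engagement_style,
           COUNT(*)     AS n,
           AVG(upvotes) AS avg_up
    FROM posts
    WHERE platform = %(platform)s
      AND status = 'active'
      AND engagement_style IS NOT NULL
      AND our_content IS NOT NULL
      AND LENGTH(our_content) >= 30   -- drop test drafts and 'lol' replies
      AND upvotes IS NOT NULL         -- too-new posts skip the aggregate
    GROUP BY engagement_style
"""

def fetch_style_stats(conn, platform: str) -> dict[str, tuple[int, float]]:
    """Returns {style: (n, avg_up)} for one platform, read live every draft."""
    with conn.cursor() as cur:
        cur.execute(STYLE_STATS_SQL, {"platform": platform})
        return {style: (n, float(avg)) for style, n, avg in cur.fetchall()}
```

Because upvotes IS NOT NULL is a filter rather than a COALESCE, unscored posts are invisible to the average instead of dragging it toward zero.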

How the whole loop hangs together

Four inputs collapse into one re-tiered style block every draft, then fan out to five separate platform drafters. The hub of the diagram is not a service, it is the prompt.

How tone automation wires up

Inputs: posts table (live upvote stats), STYLES dict (7 tone definitions), PLATFORM_POLICY (3 hard overrides: curious_probe on Reddit; snarky_oneliner on LinkedIn and GitHub), MIN_SAMPLE_SIZE = 5 (trust threshold)
Hub: get_styles_prompt()
Fan-out: reddit, twitter, linkedin, github, and moltbook drafters

Platform policy is not a performance judgment. It is a tone and brand constraint. Even if the data showed high upvotes, we still do not want this style.

scripts/engagement_styles.py, PLATFORM_POLICY comment block

Where policy beats performance

Every arm on the bandit is pre-filtered by PLATFORM_POLICY before any math runs. A tone can be the highest-upvote voice in the dataset and still not appear in the prompt if it violates the platform's tone constraint. This is the part of the loop that is deliberately not data-driven.

scripts/engagement_styles.py
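A minimal sketch of that pre-filter; the two 'never' rules are stated on this page, while the dict shape and the helper name `allowed_styles` are assumptions:

```python
# Sketch of the policy pre-filter described above. The bans come from
# this page; the exact structure of the real PLATFORM_POLICY is assumed.
PLATFORM_POLICY = {
    "reddit":   {"never": ["curious_probe"]},
    "linkedin": {"never": ["snarky_oneliner"]},
    "github":   {"never": ["snarky_oneliner"]},
}

def allowed_styles(platform: str, candidates: list[str]) -> list[str]:
    """Drop hard-banned tones before any performance math runs.
    Policy beats performance: a banned tone never reaches the bandit."""
    banned = set(PLATFORM_POLICY.get(platform, {}).get("never", []))
    return [s for s in candidates if s not in banned]
```

Because the filter runs before _fetch_style_stats is consulted, a banned tone cannot win its way back in on upvotes alone.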

What happens when one comment is drafted

1. Read PLATFORM_POLICY for this platform

If the platform has a 'never' list (curious_probe on Reddit, snarky_oneliner on LinkedIn or GitHub), those styles are removed from the candidate pool first, before any performance math runs. Tone policy beats performance.

2. Run a fresh SELECT on the posts table

_fetch_style_stats queries Postgres live, every draft. It filters to status='active', engagement_style IS NOT NULL, our_content length >= 30, and upvotes IS NOT NULL. Then it groups by engagement_style and returns n and avg_up.

3. Split candidates into trusted and explore

A style is trusted only if its n is at least MIN_SAMPLE_SIZE (5). Any style with fewer than 5 active samples is pushed into the 'explore' list so the model keeps trying it regardless of noisy early numbers.

4. Sort trusted tones by avg_upvotes, descending

Highest average on this platform goes first. The sort runs over this platform's rows only; a tone that wins on Twitter has zero influence on its Reddit tier.

5. Slice trusted into thirds

If there are enough trusted tones, the top third becomes 'dominant' (target usage ~60%), the bottom third becomes 'rare' (~10%), and the middle is 'secondary' (~30%). With only 1 or 2 trusted tones, all of them go to dominant and rare is empty.

6. Append explore tones to secondary

Untrusted tones join 'secondary' so they continue to get drafted roughly 30% of the time. This is how a new style escapes the cold-start trap without being declared good on insufficient evidence.

7. Inject the tiered style block into the Claude prompt

get_styles_prompt builds a multi-section markdown block: PRIMARY / SECONDARY / RARE, with each tone's description, example, best-in list, and safety note. The drafter sees the tiers every single run; the tiers can flip overnight based on what posted well yesterday.
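The steps above end in a prompt block, which can be sketched roughly like this. The section names and usage targets come from this page, and the page quotes the closing AVOID line; everything else about the real get_styles_prompt (signature, exact wording) is assumed:

```python
def build_styles_prompt(tiers: dict[str, list[str]], styles: dict[str, dict]) -> str:
    """Sketch of the tiered block described in step 7. Section names and
    usage targets come from this page; the real get_styles_prompt may
    differ in signature and wording."""
    sections = [
        ("PRIMARY (use ~60% of the time)", tiers["dominant"]),
        ("SECONDARY (use ~30% of the time)", tiers["secondary"]),
        ("RARE (use ~10% of the time)", tiers["rare"]),
    ]
    lines = []
    for header, names in sections:
        if not names:
            continue  # e.g. rare is empty on a cold start
        lines.append(f"## {header}")
        for name in names:
            info = styles.get(name, {})
            lines.append(f"- {name}: {info.get('description', '')}")
    # The page quotes this guard verbatim: naming the anti-style is how
    # the model is kept from defaulting to it.
    lines.append(
        "AVOID the pleaser/validator style. It consistently gets "
        "the lowest engagement across all platforms."
    )
    return "\n".join(lines)
```

Because the block is rebuilt on every call, yesterday's upvotes reorder today's sections with no deploy in between.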

The 7 tones, plus the anti-style

Each tone has a description, a one-line example, a per-platform best_in list, and a short safety note. The anti-style is not one of the 7; it is the failure mode the prompt tells the model to actively avoid.

critic

Point out what is missing, flawed, or naive. Reframe the problem. NEVER just nitpick; offer a non-obvious insight. Plays in r/Entrepreneur, r/smallbusiness, r/startups; LinkedIn only with softer framing.

storyteller

First-person narrative with concrete details. Lead with failure or surprise, not success. Never pivot to a product pitch. Strong in r/startups, r/Meditation, founder-story Twitter, LinkedIn career posts.

pattern_recognizer

Name the pattern or phenomenon. Authority through pattern recognition, not credentials. Best in r/ExperiencedDevs, r/programming, r/webdev and dev Twitter.

curious_probe

One specific follow-up question about the most interesting detail. ONE question only, never a list. Hard-banned on Reddit regardless of upvote performance.

contrarian

Take a clear opposing position backed by experience. Must have credible evidence attached; empty hot takes get destroyed. Works in r/Entrepreneur, r/ExperiencedDevs, industry-debate Twitter.

data_point_drop

Share one specific, believable metric. Let the number do the talking. No links. Numbers must be believable, not impressive. Strong on r/SaaS and growth Twitter.

snarky_oneliner

Short, sharp, emotionally resonant observation. One sentence max. Hard-banned on LinkedIn and GitHub. NEVER on small or serious subs like r/vipassana.

pleaser/validator (anti-style)

'this is great', 'had similar results', '100% agree'. This is NOT one of the 7 styles. The prompt explicitly tells the model to AVOID this voice because it has the lowest average engagement across every platform.

S4L vs generic social media marketing automation

Feature | Generic scheduler + caption AI | S4L
What gets automated | When to post, what caption, which hashtags | Which comment TONE to use next, per platform, per thread
Where the style comes from | A template library you filled in | Live SELECT on posts grouped by engagement_style and platform
How a new tone earns trust | You turn it on in the UI | Minimum 5 samples at status='active' before its avg_upvotes is trusted
How performance changes usage | Manual editing of template weights | Trusted tones are sorted by avg_upvotes and split into thirds: top third fires ~60%, middle ~30%, bottom third ~10%
How brand safety overrides data | Opaque, depends on vendor | PLATFORM_POLICY 'never' list excludes a tone even if it was winning: curious_probe on Reddit, snarky_oneliner on LinkedIn and GitHub
What the drafter sees | A blank editor and a calendar | A tiered style block injected into the Claude prompt with live rankings explained in-line
Content length contract | Platform-native limits only | Drafts where our_content length < 30 chars are excluded from the bandit table entirely
Anti-style | Not modeled | The prompt tells the model to AVOID the pleaser/validator voice, with the justification explicit in text

By the numbers, straight from the file

7 named tones in the STYLES dict
5 MIN_SAMPLE_SIZE before a tone's avg_upvotes is trusted
60% usage target for tones in the top-third dominant tier
30 minimum chars of our_content before a row enters the bandit

Who this matters to

If you use social media for announcements, use Buffer. A scheduler is enough because the content is already decided. If you use social for engagement, where the reply is the asset and the voice is what wins attention, your bottleneck is voice, not time. That is the automation gap S4L fills.

Indie operators running multiple products feel this first. You cannot manually pick a tone for every comment across 5 platforms and 8 subreddits, and you definitely cannot track which tone is working where by reading a dashboard. A live bandit over 7 named tones with hard platform policies is the minimum version of that automation that does not collapse into "write something about X" and pray.

Frequently asked questions

How is this different from the social media marketing automation tools on the top Google results?

Buffer, Hootsuite, Sprout Social, Sendible, HubSpot, and Gumloop all automate scheduling, caption drafting, hashtag suggestions, and autoresponders. None of them automate the choice of comment tone based on live performance data. S4L does exactly that in scripts/engagement_styles.py. Before every draft, a SELECT against the posts table returns avg_upvotes per engagement_style for the target platform, the trusted styles are split into thirds, and the tiered block is injected into the Claude prompt. The automation primitive is the tone, not the schedule.

What are the 7 tones and why exactly 7?

critic, storyteller, pattern_recognizer, curious_probe, contrarian, data_point_drop, snarky_oneliner. Each tone has a description, a one-line example, a best_in map per platform, and a safety note. 7 is the number of discriminable voices we actually ship different prompt templates for. More would collapse into each other when measured against real upvote distributions; fewer would leave platforms without a voice that suits them (pattern_recognizer for dev Twitter, snarky_oneliner for large subs, storyteller for LinkedIn career content).

Why is MIN_SAMPLE_SIZE set to 5?

Five is the threshold at which avg_upvotes stops being noise on our data shape. Below 5, one viral comment moves the average enough to promote a tone into the 'dominant' tier by accident, which then starves the other tones. At 5 and above, a viral hit still moves the average, but not enough on its own to park the tone in the top tier. Any style with fewer than 5 active samples is pushed into the 'explore' bucket so the model keeps drafting it without the orchestrator pretending to have a verdict yet.

What does 'active' status mean in the bandit query?

The filter is status='active' AND engagement_style IS NOT NULL AND our_content IS NOT NULL AND LENGTH(our_content) >= 30 AND upvotes IS NOT NULL. Active excludes deleted, removed, and inactive rows. The length floor removes test drafts and 'lol' comments, which would otherwise distort the average for tones that favor short replies. Removed or deleted comments are signal-negative, but they are rare enough that pruning them from the aggregation is cleaner than trying to score them.

Why is curious_probe banned on Reddit when it is allowed elsewhere?

PLATFORM_POLICY is not a performance judgment, it is a tone and brand constraint. Questions as top-level comments read as low-value on Reddit, regardless of how they score in a vacuum. Even if the avg_upvotes table said curious_probe was winning on Reddit, the 'never' list would still exclude it. Same rule, applied to LinkedIn and GitHub, hard-bans snarky_oneliner. This is the one place in the system where performance data is explicitly allowed to lose to tone policy.

What is the anti-style and why is it in the prompt?

The anti-style is the pleaser/validator voice: 'this is great', 'had similar results', '100% agree', 'that's smart'. It is not one of the 7 styles; it is what emerges when no style guide is present. The drafting prompt ends with 'AVOID the pleaser/validator style. It consistently gets the lowest engagement across all platforms.' Naming it is how we stop the model from defaulting to it.

How often does the bandit update its tiers?

Every draft. get_dynamic_tiers calls _fetch_style_stats, which runs a Postgres query live. There is no nightly snapshot, no cache, no 'tier recompute' cron. A post that lands at 9:00 with upvotes flowing in is already shifting the avg_upvotes that get read when the 9:15 comment runs. For a cold start with no data for a platform, every non-never tone falls into 'secondary', which is what the prompt calls 'use these ~30% of the time'.

Does the tier system apply to reply drafts as well as new comments?

Yes, plus one extra style. REPLY_STYLES is VALID_STYLES plus 'recommendation'. The recommendation style is tier-independent; its use is governed by the Tier 1/2/3 link strategy in the surrounding prompt, not by performance data, and it is capped at 20% of replies. The other 7 styles go through the same get_dynamic_tiers pipeline whether they are drafting a new comment or a reply.
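The reply-side choice described above can be sketched like this. The 20% cap and the tier weights come from this page; the sampling mechanics, the `pick_reply_style` name, and the constant name are assumptions, since the real cap is governed by the surrounding link-strategy prompt rather than code like this:

```python
import random

RECOMMENDATION_CAP = 0.20  # at most ~20% of replies use the link-bearing voice

def pick_reply_style(tiers: dict[str, list[str]], rng=random) -> str:
    """Sketch of the reply-side choice described above: 'recommendation'
    is tier-independent and capped at ~20%; the other tones follow the
    dominant/secondary/rare weights. The mechanics here are assumed."""
    if rng.random() < RECOMMENDATION_CAP:
        return "recommendation"
    pools = [(tiers["dominant"], 0.6), (tiers["secondary"], 0.3), (tiers["rare"], 0.1)]
    pools = [(names, w) for names, w in pools if names]  # skip empty tiers
    # Spread each tier's weight evenly across the tones inside it.
    names, weights = zip(*[(n, w / len(names)) for names, w in pools for n in names])
    return rng.choices(names, weights=weights, k=1)[0]
```

The other 7 styles go through the same get_dynamic_tiers pipeline regardless of whether the draft is a new comment or a reply.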

What stops a single viral post from permanently parking one tone in the dominant tier?

Two things. First, MIN_SAMPLE_SIZE means the average has to be computed over at least 5 posts, so one viral outlier has limited lift power. Second, the tier cut is structural, not threshold-based: tiers are the top third, middle, and bottom third of trusted tones. Even if a tone is winning, if it lands in a small trusted pool (len <= 2), the cut degenerates so dominant holds both and rare is empty. The code is in engagement_styles.py lines 203 through 211.

Can S4L replace Buffer or Hootsuite?

Only if tone automation is what you need. S4L does not have a scheduling calendar, approvals, brand asset management, or a unified inbox, and it is not trying to. It is the tone layer: which of 7 voices should the model use next to maximize the kind of engagement that actually lifts upvotes on this platform. If you want a calendar + inbox, use Buffer or Hootsuite. If you want the tone to self-tune from live data, S4L is the layer that does that.

Run social media marketing automation where the tone tunes itself

S4L's bandit re-ranks 7 named voices from live Postgres data every time it drafts a comment. No calendar to fill in, no template library to prune. The automation is the voice selection, and the voice selection learns.

See S4L