Reddit marketing detection lives in your comment voice, not in your links

Matthew Diakonov, Written with AI

Published May 19, 2026 · 8 min read

If you came from X looking for what gets a Reddit comment caught, the honest answer is shorter than most guides will admit. It is almost never the link, and it is rarely the word “product”. It is the way the comment sounds, and whether the specifics inside it actually belong to the person who posted it.

Direct answer (verified 2026-05-19)

Two failure modes get a Reddit marketing comment flagged at the voice level. The first is register: corporate phrasing, em dashes, markdown headers inside a comment, every reply opening the same way. The second is the fabricated personal claim: a number, a duration, or an anecdote that does not belong to the commenter (“I ran 22 cameras for eight months”, “I tried this exact pipeline last semester”). The first one is fixed by stripping machine artifacts. The second one is fixed by a rule, and S4L publishes one in scripts/engagement_styles.py: either open the story with a hedge (“hypothetically”, “scenario:”) or stay strictly first person with no specifics that are not already in your project's config.

The thing the “90/10 rule” guides keep missing

Almost every guide on this question repeats the same advice. Post nine non-promotional comments for every one that mentions your product. Build karma for two to four weeks first. Write like a knowledgeable person, not a press release. None of that is wrong, and none of it tells you what a knowledgeable person actually sounds like at the sentence level. So a comment that passes the 90/10 rule can still get downvoted into oblivion, removed by a mod, or quietly buried by the spam filter, because the comment itself reads as marketing.

The interesting question is not whether you should comment, it is how the comment should sound to clear the filter and the room at the same time. That is a writing problem, and you can hold a generator (or a tired human) to writing rules. You cannot hold either of them to a vibe.

The seven voices in the open-source repo

The S4L repo (the tool this site is about) treats “comment quality” as a structured choice between seven named voices, one per comment. The dictionary is right at the top of scripts/engagement_styles.py. Pick one before you draft. Each one carries its own failure mode in the “note” field, which is the part that does the real work.

scripts/engagement_styles.py

The seven voices, one per comment

critic
Point out what is missing, flawed, or naive. Reframe the problem.
Never just nitpick; offer a non-obvious insight.

storyteller
Narrative-driven. Lead with failure or surprise, not success. Subject to the Grounding Rule below.
Never mix the two lanes (disclosed scenario vs grounded first person).

pattern_recognizer
Name the pattern. Authority through pattern recognition, not credentials.
Best in r/ExperiencedDevs, r/programming, r/webdev.

curious_probe
One specific follow-up question on the most interesting detail. Include 'curious because...' context.
ONE question only. Never multiple. Disabled on Reddit by policy in the repo.

contrarian
Take a clear opposing position backed by experience.
Must have credible evidence. Empty hot takes get destroyed.

data_point_drop
Share one specific, believable metric. Let the number do the talking.
No links. Numbers must be believable, not impressive.

snarky_oneliner
Short, sharp, emotionally resonant observation. Validates a shared frustration.
Never in small or serious subs. Best in 500k+ subs and viral threads.

The note field is the uncopyable part. Most guides will tell you to “ask a question” in a comment. The repo disables curious_probe on Reddit entirely, because the data showed it underperforms there. Most guides will tell you to “share a story”. The repo treats storyteller as the highest-risk voice and enforces a separate rule on top of it, which is the next section.
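To make the structure concrete, here is a minimal sketch of what a few entries might look like. The field names description, best_in, and note follow the repo's own description of the dict; the exact shape of best_in, the three entries chosen, and the voice_brief() helper are assumptions for illustration, not a copy of the real file.

```python
# Hypothetical shape only; the real STYLES dict in
# scripts/engagement_styles.py may be organized differently.
STYLES = {
    "critic": {
        "description": "Point out what is missing, flawed, or naive. Reframe the problem.",
        "best_in": {},  # per-platform lists are not reproduced here
        "note": "Never just nitpick; offer a non-obvious insight.",
    },
    "contrarian": {
        "description": "Take a clear opposing position backed by experience.",
        "best_in": {},
        "note": "Must have credible evidence. Empty hot takes get destroyed.",
    },
    "data_point_drop": {
        "description": "Share one specific, believable metric. Let the number do the talking.",
        "best_in": {},
        "note": "No links. Numbers must be believable, not impressive.",
    },
}


def voice_brief(name: str) -> str:
    """Return the description plus the failure-mode note for one voice."""
    style = STYLES[name]
    return f"{name}: {style['description']} Watch out: {style['note']}"


print(voice_brief("critic"))
```

The point of pairing the two fields in one string is that the note travels with the voice, so whoever drafts the comment sees the failure mode at the same moment as the instruction.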

The Grounding Rule: two lanes, never mixed

When the picked voice is storyteller (or any voice that wants to reach for a specific detail), one extra rule fires. get_grounding_rule() in the same file forces a choice between two mutually exclusive lanes. You cannot combine them. The whole rule exists to kill the one failure mode that stripping artifacts cannot fix: presenting an invented specific as a personal first-hand claim.

Lane 1: disclosed story

Invent freely, but flag the frame. Open with one of: “hypothetically”, “scenario:”, “imagine someone running this”, “say a friend tried”. After the opener, you have full creative license on the numbers, the names, the places.

Lane 2: no fabrication

Stay first person, but every specific (number, duration, date, place, course or program name, headcount, named tool, named person) must come verbatim from a real source. If it is not in your notes, drop it, generalize (“a few months”), or pattern-frame (“the part that breaks down is...”). Pattern-framing counts as observation, not autobiography. No disclosure needed.

The same point, three ways

Worked example, taken from the BAD vs LANE 1 vs LANE 2 block inside get_grounding_rule(). Same observation, three different ways to phrase it. Only one of them gets flagged.

Bad: fabricated personal claim, no disclosure

“i ran this exact pipeline last semester for two anatomy blocks, cheap recorder into whisper into gpt into anki, raw gpt got me about 35% usable cards.”

First person, hyper-specific. None of the specifics belong to the commenter. This is the exact shape filters and humans both flag.

Lane 1 rewrite: same details, disclosed

“hypothetically, imagine running this for a couple of lecture blocks: cheap recorder into whisper into gpt into anki. raw prompts get you somewhere around a third usable cards before duplicate distractors and trivial restatements take over.”

The opener pre-frames the rest as a worked example. Reader processes it as illustration, not testimony. The information is identical and the room treats it as fine.

Lane 2 rewrite: pattern-frame, no invented specifics

“the whisper-to-gpt-to-anki setup isn't where this breaks. card generation is. raw prompts produce roughly a third usable before duplicate distractors and trivial restatements take over.”

No first-person claim, no fabricated number that has to belong to anyone. The same observation phrased as a pattern someone has seen. This lane is what works in expert subs.
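The three versions above can be told apart with a rough mechanical check. The sketch below is an assumption about how such a check could look, not the logic inside get_grounding_rule(): the function name classify_lane, the first-person pattern, and the choice to treat only digit-based numbers as "specifics" are all hypothetical. It will miss spelled-out specifics such as "two anatomy blocks" or "last semester", which still need a human read.

```python
import re

# Disclosure openers quoted in the article for Lane 1.
DISCLOSURE_OPENERS = (
    "hypothetically",
    "scenario:",
    "imagine someone running this",
    "say a friend tried",
    "as a thought experiment",
)

# Assumption: a crude first-person detector and a digits-only notion of "specific".
FIRST_PERSON = re.compile(r"\b(i|my|we|our)\b", re.IGNORECASE)
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")


def classify_lane(comment: str, grounded_specifics=frozenset()) -> str:
    """Label a draft as Lane 1, Lane 2, or a likely fabricated personal claim."""
    text = comment.strip().lower()
    if text.startswith(DISCLOSURE_OPENERS):
        return "lane_1_disclosed"
    ungrounded = set(NUMBER.findall(text)) - set(grounded_specifics)
    if FIRST_PERSON.search(text) and ungrounded:
        return "flag_fabricated_claim"
    return "lane_2_grounded"


bad = ("i ran this exact pipeline last semester for two anatomy blocks, "
       "raw gpt got me about 35% usable cards.")
print(classify_lane(bad))
print(classify_lane(bad, grounded_specifics={"35%"}))
```

Passing the number in as a grounded specific flips the verdict, which mirrors the rule: the same sentence is fine when the figure really is yours.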

“This rule outranks "specificity is the #1 authenticity signal" wherever they conflict. Specificity still wins, but only via Lane 1 disclosure or Lane 2 config grounding.”

get_grounding_rule(), scripts/engagement_styles.py

How the voice gets picked, end to end

The repo does not leave voice choice to vibes. The picker is data-driven, the lane is forced by the Grounding Rule, the platform policy adds Reddit-specific constraints, and a mechanical filter strips machine artifacts before anything goes out. Four checkpoints, none of them skippable.

Pick one voice from seven

The picker reads recent click + comment + upvote data per voice and weighs the next pick toward voices that actually drive engagement, not the one that piles up passive likes.

Composite score = avg_clicks * 10 + avg_cm * 3 + avg_up. A real click counts as much as ten upvotes, and a comment as much as three. See _style_score() in engagement_styles.py.
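Here is a small sketch of that formula feeding a weighted pick. The formula is the one quoted above; everything else is an assumption: the function names, the use of random.choices for the weighting, the small floor added so a zero-score voice can still be drawn, and the sample numbers, which are made up for the arithmetic.

```python
import random


def style_score(avg_clicks: float, avg_cm: float, avg_up: float) -> float:
    """Composite score as quoted in the article: clicks x10, comments x3, upvotes x1."""
    return avg_clicks * 10 + avg_cm * 3 + avg_up


def pick_voice(stats: dict, rng=random) -> str:
    """Assumed picker: draw one voice with probability proportional to its score."""
    names = list(stats)
    # Small floor (an assumption) so weights never sum to zero.
    weights = [style_score(**stats[name]) + 0.01 for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Illustrative numbers only, not real performance data.
recent = {
    "critic": {"avg_clicks": 2, "avg_cm": 1, "avg_up": 5},            # 20 + 3 + 5 = 28
    "snarky_oneliner": {"avg_clicks": 0, "avg_cm": 1, "avg_up": 20},  # 0 + 3 + 20 = 23
}

print(pick_voice(recent))
```

In this made-up sample the voice with twenty upvotes and no clicks still scores below the voice with two clicks and five upvotes, which is the behavior the weighting is meant to produce.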

Force a lane on storyteller

If the chosen voice is storyteller, the Grounding Rule splits it into two mutually exclusive lanes. The model has to commit to one before drafting.

Apply platform policy

Reddit policy bans curious_probe entirely and adds 'start with I or my; one punchy sentence or four to five of real substance, never the two-to-three sentence dead zone.'
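A policy like this is easy to hold a draft to. The following is a sketch under assumptions, not the repo's policy block: the names REDDIT_NEVER and reddit_policy_violations are hypothetical, the sentence counter is a naive split on end punctuation (it will miscount decimals and abbreviations), and how strictly the real tool enforces the opener is not something this page establishes.

```python
import re

# Assumption: voices the Reddit policy disables, per the article.
REDDIT_NEVER = {"curious_probe"}


def count_sentences(text: str) -> int:
    """Naive sentence count: split on runs of . ! ? and ignore empty pieces."""
    return len([part for part in re.split(r"[.!?]+", text) if part.strip()])


def reddit_policy_violations(voice: str, comment: str) -> list[str]:
    """Return the list of policy rules a draft breaks (empty list means it passes)."""
    problems = []
    if voice in REDDIT_NEVER:
        problems.append(f"voice '{voice}' is disabled on Reddit")
    if not re.match(r"(i|my)\b", comment.strip(), re.IGNORECASE):
        problems.append("does not open with 'I' or 'my'")
    if count_sentences(comment) in (2, 3):
        problems.append("two-to-three sentence dead zone")
    return problems


print(reddit_policy_violations("critic", "I broke this twice before the cache fix stuck."))
print(reddit_policy_violations("curious_probe", "Has anyone tried this? It seems off."))
```

Returning a list of named problems rather than a single pass/fail makes it obvious which rule to fix when a draft is rejected.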

Strip the artifacts

No em dashes, no markdown, no headers in a Reddit comment. Variation in opening words. Contractions and the occasional lowercase line. These are mechanical filters that run before the comment is allowed out.
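These filters are simple enough to sketch with the standard library. The version below is an illustration under assumptions, not the repo's filter: strip_artifacts and repeated_openers are hypothetical names, the em dash is swapped for a comma as one possible replacement, and only headers and bold are handled out of all possible markdown.

```python
import re
from collections import Counter

EM_DASH = "\u2014"


def strip_artifacts(comment: str) -> str:
    """Remove em dashes, markdown headers, and bold markers from a draft."""
    text = re.sub(rf"\s*{EM_DASH}\s*", ", ", comment)          # em dash -> comma
    text = re.sub(r"(?m)^[ \t]{0,3}#{1,6}[ \t]+", "", text)    # leading '#' headers
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)               # **bold** -> bold
    return text


def repeated_openers(comments: list[str]) -> set[str]:
    """Return first words that open more than one comment in a batch."""
    first_words = Counter(c.split()[0].lower() for c in comments if c.split())
    return {word for word, count in first_words.items() if count > 1}


draft = f"## Summary\nthis works {EM_DASH} mostly, and it's **fast**"
print(strip_artifacts(draft))
print(repeated_openers(["Honestly this broke", "honestly same here", "my take differs"]))
```

The opener check works on a batch rather than a single comment, because a repeated opening is only visible across several replies from the same account.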

What the model actually sees at decision time is a short prompt block: one assigned voice, its description, its anti-pattern, the platform note, and the Grounding Rule. Not a menu of seven, not generic style advice. One voice to write in, one lane to commit to, and a checklist of what the comment cannot do.
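As a rough picture of that block, the sketch below assembles the five pieces into one string. The function name build_prompt_block, the labels, and the one-line summary of the Grounding Rule are assumptions; the real prompt text in the repo is not reproduced here.

```python
def build_prompt_block(voice: str, description: str, anti_pattern: str,
                       platform_note: str, grounding_rule: str) -> str:
    """Assemble one assigned voice plus its constraints into a short prompt block."""
    return "\n".join([
        f"VOICE: {voice}",
        f"DESCRIPTION: {description}",
        f"AVOID: {anti_pattern}",
        f"PLATFORM: {platform_note}",
        f"GROUNDING RULE: {grounding_rule}",
    ])


block = build_prompt_block(
    voice="critic",
    description="Point out what is missing, flawed, or naive. Reframe the problem.",
    anti_pattern="Never just nitpick; offer a non-obvious insight.",
    platform_note="Start with I or my; one punchy sentence or four to five of real substance.",
    # Paraphrase of the two lanes, not the repo's wording.
    grounding_rule="Either disclose the scenario up front or use only grounded specifics.",
)
print(block)
```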

What this means if you are commenting yourself

You do not need to run the tool to use the rule. The two ideas that matter are portable. First, before you draft a comment in a sub where you have a product, pick one voice from the seven and write to it. Do not blend two. Do not start with credentials and pivot to a question. Pick the voice that matches the sub culture and stay in it.

Second, before you write any specific number, duration, or anecdote, ask whether it actually belongs to you. If yes, write it. If no, either open with a hedge (“say someone tried this”) and treat the rest as illustration, or pattern-frame it (“the failure mode here is...”) and drop the first-person frame entirely. The middle ground, “I did this exact thing, here are the made-up numbers”, is the only shape that consistently gets flagged.

The rest (the em dash, the markdown header, the identical opening phrase, the 90/10 ratio) is mechanical. The Grounding Rule is the piece that takes actual judgment, and it is the piece every other guide on this question quietly skips because committing to a fabrication boundary in writing means committing to enforce it.

Read it yourself

The whole file is small enough to read in one sitting. The STYLES dict is right at the top. get_grounding_rule() is roughly two thirds of the way down. The content rules and anti-patterns sit just after it. No dependencies, no fancy machinery. You can copy the rules into a checklist and use them on the next comment you write, with or without a tool in the loop.

File: github.com/m13v/social-autoposter / scripts/engagement_styles.py. The repo itself: github.com/m13v/social-autoposter.

Want this done for your product?

If you would rather have voice-matched, lane-disciplined comments running in your buyer subs without you drafting them, book a call. We run Reddit and Twitter brand-awareness done-for-you for established B2C products.

Frequently asked

What does Reddit actually flag at the comment level?

Two things, in this order. First, voice register: corporate phrasing, perfect punctuation including the em dash, markdown headers and bold inside a comment, openings that all start the same way. Real Reddit users do not type that way. Second, the fabricated personal claim: a specific number, duration, or anecdote that does not belong to the commenter and that nobody else can verify. The first is mechanical to remove. The second needs an actual rule, which is what the Grounding Rule in S4L's open-source repo is.

What is the two-lane Grounding Rule?

Every comment that tells a story has to pick one of two mutually exclusive lanes. Lane 1, DISCLOSED STORY, lets you invent freely, but the comment must open with a phrase that flags it as illustration: 'hypothetically', 'scenario:', 'imagine someone running this', 'say a friend tried', 'as a thought experiment'. After the disclosure, any names, durations, counts, and places are fair game. Lane 2, NO FABRICATION, stays first-person, but every specific (number, duration, date, place, course name, headcount, named tool) must appear verbatim in the matched project's config.json. If a number is not in config, you drop it, generalize ('a few months'), or pattern-frame ('the part that breaks down is...'). You cannot mix the lanes. Mixing them, writing 'I ran 22 cameras for eight months' without either a disclosure or a config anchor, is the exact failure mode the rule exists to kill.
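The "verbatim in config.json" test for Lane 2 can be sketched as a substring search over every value in the config. This is an assumption about how such a check might work: the helper names are hypothetical, the real config.json schema is not shown on this page, and the sample config below is invented purely to exercise the function.

```python
import json


def config_values(node):
    """Yield every leaf value in a nested JSON structure as a string."""
    if isinstance(node, dict):
        for value in node.values():
            yield from config_values(value)
    elif isinstance(node, list):
        for value in node:
            yield from config_values(value)
    else:
        yield str(node)


def is_grounded(specific: str, config) -> bool:
    """True if the specific appears verbatim somewhere in the config."""
    return any(specific in value for value in config_values(config))


# Invented sample config; the real schema is an unknown here.
config = json.loads(
    '{"project": "example-app", "facts": ["launched in 2023", "used by 40 teams"]}'
)

print(is_grounded("40 teams", config))
print(is_grounded("22 cameras", config))
```

A specific that fails this check is the cue to drop it, generalize it, or pattern-frame the sentence instead.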

Why is voice the real detection signal, not links or words?

Because Reddit's spam filters and the human moderators above them have spent a decade moving past the obvious tells (corporate sign-offs, hashtags, direct URLs) and escalating on register mismatches instead. A comment with no links and no product mentions that still reads like a press release gets removed, downvoted, or quietly buried. A comment that quotes a specific number that does not belong to the commenter triggers the same response from the community, because every upvote starts with a reader scanning for one cue: 'is this person actually one of us'. Voice is the cue.

How many voices does S4L's open-source repo define?

Seven, in the STYLES dictionary at the top of scripts/engagement_styles.py: critic, storyteller, pattern_recognizer, curious_probe, contrarian, data_point_drop, snarky_oneliner. Each has a description, an example, a per-platform best_in list, and a note that calls out the failure mode for that voice. The dict is the active baseline. The model is also allowed to invent a new voice inline at decision time, and inventions land as candidates in a sidecar JSON until a nightly promoter graduates them based on real performance. So the universe is curated plus a learning loop, not a fixed set.
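The promotion step can be pictured as a filter over candidate voices. The sketch below is speculative: the article says only that a nightly promoter graduates candidates "based on real performance", so the minimum post count, the median-of-baseline bar, the field names, and the sample numbers are all assumptions chosen to make the idea runnable.

```python
from statistics import median


def score(stats: dict) -> float:
    """Composite score quoted in the article: clicks x10, comments x3, upvotes x1."""
    return stats["avg_clicks"] * 10 + stats["avg_cm"] * 3 + stats["avg_up"]


def promote(baseline: dict, candidates: dict, min_posts: int = 20):
    """Assumed rule: graduate candidates with enough posts that beat the baseline median."""
    bar = median(score(stats) for stats in baseline.values())
    promoted = {
        name: stats for name, stats in candidates.items()
        if stats["posts"] >= min_posts and score(stats) >= bar
    }
    remaining = {name: stats for name, stats in candidates.items() if name not in promoted}
    return promoted, remaining


# Illustrative numbers and candidate names, not data from the repo.
baseline = {
    "critic": {"avg_clicks": 2, "avg_cm": 1, "avg_up": 5},      # score 28
    "contrarian": {"avg_clicks": 0, "avg_cm": 2, "avg_up": 4},  # score 10
}
candidates = {
    "dry_explainer": {"avg_clicks": 2, "avg_cm": 2, "avg_up": 4, "posts": 25},  # score 30
    "meme_reply": {"avg_clicks": 3, "avg_cm": 0, "avg_up": 9, "posts": 5},      # too few posts
}

promoted, remaining = promote(baseline, candidates)
print(sorted(promoted), sorted(remaining))
```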

Why does the repo disable curious_probe on Reddit?

Because the data showed it underperforms there. Reddit rewards statements, not questions. 'I did X' beats 'has anyone tried X?' by a wide margin on every sub the tool engages in, so the platform policy block in engagement_styles.py lists curious_probe under reddit.never. The same voice runs fine on Twitter and on niche B2B threads where a probe reads as genuine curiosity. The rule is data-derived, not opinion.

What about generic 'be authentic' advice? Is that wrong?

It is not wrong, it is just unactionable. You cannot enforce 'be authentic' in a drafting pipeline. You can enforce 'pick one of these seven voices, then either disclose the scenario or only use specifics from this config file, then never start with one of these banned openers'. The point of writing the rules down is that a generator (or a human under time pressure) can be held to them and a vibe cannot. The pages that punt with 'be authentic' are essentially saying 'good luck'.

Does this only matter for marketing accounts?

No. It matters for any account whose comments draw more downvotes or removals than its owner expects. The detection apparatus does not know your intent; it scores the surface. A founder who genuinely wants to share what they built and types a clean, hedged, voice-matched comment gets engagement. A founder who lapses into a 'we built X, here's why it's great' register gets removed in the same sub for the same reason a paid marketer would. The rules are intent-blind. That is why they are useful.

Can I read the rule and the seven voices myself?

Yes. The repo is at github.com/m13v/social-autoposter and the file is scripts/engagement_styles.py. The STYLES dict starts around line 32. The get_grounding_rule() function with the two lanes and the worked examples starts around line 1078. The get_content_rules() and get_anti_patterns() functions are right after it. The whole file is plain Python with no dependencies, so you can grep it and copy the rules into your own drafting checklist without ever running the tool.

Other pieces on the same axis

Keep reading

Filters

Reddit marketing without AI flagging

The mechanical filters that run before a single word is posted: the em dash, markdown in a comment, identical openings.


Framework

Engagement that survives detection

The two-axis model: text artifacts on one side, behavior pattern on the other. Both have to clear for a comment to land.


Drafting

AI Reddit comments without the shadowban

How to draft a comment that reads as a person, not as content-farm output. The drafting axis up close.
