Reddit comment discovery overhead is a 5-gate cascade, not a search bill

Matthew Diakonov, Written with AI

Published May 22, 2026 · 9 min read

If you arrived from an X reply, the question you are probably asking is: why does finding good Reddit threads to comment on feel so expensive? Most guides answer with a single number: 10 to 15 hours a week, or 79 dollars a month. Both are wrong. The actual cost is spread across five filter stages, each one discarding more candidates than it keeps. The open-source S4L pipeline writes every stage down with named thresholds, which is the only way to see the shape of the overhead. Here it is.

Direct answer (verified 2026-05-22)

Reddit comment discovery overhead is the sum of cost paid on candidates that never become comments. The pipeline has five stages: fetch (up to 50 subreddits per cycle in 10 multi-sub batches, with at most MAX_DISCOVER_SEARCHES=3 fresh searches), ripen (two snapshots 5 to 30 minutes apart; composite = Δup + 4·Δcomments + 5·intent must be at least 1), draft (LLM relevance gate; on omit, the row is marked permanently failed under the one-strike rule), post (CDP browser submit; transient failures retry up to MAX_ATTEMPTS=3 with a RETRY_BACKOFF_MIN=30 minute backoff), and salvage (Phase 0 hard-expires anything older than FRESHNESS_HOURS=24). Each stage's discards are the actual overhead. The published values live in scripts/post_reddit.py and scripts/ripen_reddit_plan.py.

The shape of the cascade

A useful frame: discovery is not a queue of things to do, it is a queue of things to discard. Every stage's job is to throw work away cheaply so the next stage spends its budget on a smaller, better pool. The trap most playbooks fall into is treating “hours spent searching” as the whole bill, when search is the cheapest step in the cascade. Most of the dollars land on candidates that get fetched, snapshotted twice, and then dropped at a gate two or three steps later.

One Reddit thread through the cascade

Gate 1: fetch (the only stage most pages count)

The pipeline hits up to 50 subreddits per cycle by batching them five at a time into multi-sub requests (Reddit's old.reddit JSON API accepts r/a+b+c+d+e), capped at 10 batches per cycle to stay inside the public rate limit. A clean cycle uses 10 to 15 HTTP requests with a 4-second base delay; on a 429, exponential backoff stretches the delay to 30 seconds, and 3 consecutive failures bail the cycle. The number of fresh searches the LLM is allowed to author for the cycle is bounded by MAX_DISCOVER_SEARCHES = 3 (overridable via SAPS_REDDIT_MAX_SEARCHES).
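The batching and backoff described above can be sketched in a few lines. This is a minimal illustration, not the code in scripts/find_threads.py: the function names are hypothetical, and the doubling schedule is an assumption (the text only says the backoff is exponential and tops out at 30 seconds).

```python
from typing import List

BATCH_SIZE = 5                 # subreddits per multi-sub request (from the text)
MAX_BATCHES = 10               # batches per cycle (from the text)
BASE_DELAY_S = 4.0             # base delay between requests (from the text)
MAX_DELAY_S = 30.0             # backoff ceiling after 429s (from the text)
MAX_CONSECUTIVE_FAILURES = 3   # the cycle bails after this many failures in a row


def multi_sub_paths(subreddits: List[str]) -> List[str]:
    """Group subreddits five at a time into 'r/a+b+c+d+e' paths, at most 10 groups."""
    paths: List[str] = []
    for start in range(0, len(subreddits), BATCH_SIZE):
        if len(paths) == MAX_BATCHES:
            break  # anything past 50 subreddits waits for a later cycle
        batch = subreddits[start:start + BATCH_SIZE]
        paths.append("r/" + "+".join(batch))
    return paths


def backoff_delay(consecutive_429s: int) -> float:
    """Delay before the next request. ASSUMPTION: the delay doubles per 429."""
    return min(BASE_DELAY_S * (2 ** consecutive_429s), MAX_DELAY_S)
```

With 60 subreddits as input, only the first 50 are grouped, which is where the "up to 50 per cycle" ceiling comes from.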

This is the line item most “Reddit marketing tool” reviews quote when they say discovery is expensive. In dollars, it is not. Reddit exposes a public JSON endpoint that costs nothing per call, and throughput is limited by Reddit's rate limiter, not by the pipeline. The fetch stage's real cost is downstream: every thread it surfaces is a thread the later stages will spend budget on.

Gate 2: ripen (the gate nobody else writes down)

A freshly-found thread looks alive at fetch time and might be dead 5 minutes later. Posting a careful reply on a thread whose author has already drifted off and whose readers have stopped clicking is the most common form of wasted draft cost. The ripen stage exists to catch exactly that: snapshot the thread at T0, sleep, snapshot at T1, and compute a momentum composite.

ripen_reddit_plan.py defaults

The 4x weight on comments versus upvotes is deliberate: one new comment counts as much as four new upvotes, because a fresh comment is a stronger signal that someone is still reading. The +5 intent boost lets a flat-but-on-theme thread (one upvote, no new comments, but the title literally asks “what do you use for X”) clear a floor it would otherwise miss. The floor sits at 1 instead of the original 5 because, once the intent boost shipped, lowering the raw floor was safe: a thread with no title intent still needs at least one real upvote or new comment to pass.
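As a rough sketch of the composite, assuming the weights and floor quoted in this article: the regex below is a hypothetical stand-in, since the real product-discussion pattern in _intent_boost() is not reproduced here, and the function names are illustrative only.

```python
import re

COMMENT_WEIGHT = 4   # one new comment counts as four upvotes (from the text)
INTENT_BOOST = 5     # added when the title shows tool-seeking intent (from the text)
MOMENTUM_FLOOR = 1   # composite must be at least this to proceed (from the text)

# HYPOTHETICAL pattern for illustration; the pipeline's actual regex may differ.
INTENT_RE = re.compile(
    r"\b(what do you use|looking for|recommend\w*|best tool)\b", re.IGNORECASE
)


def intent_boost(title: str, selftext: str = "") -> int:
    """Title-only match; selftext is accepted but ignored, as the article describes."""
    return INTENT_BOOST if INTENT_RE.search(title) else 0


def composite(t0_up: int, t0_comments: int, t1_up: int, t1_comments: int, title: str) -> int:
    """Momentum composite = upvote delta + 4 * comment delta + intent boost."""
    return (t1_up - t0_up) + COMMENT_WEIGHT * (t1_comments - t0_comments) + intent_boost(title)


def passes_ripen(t0_up: int, t0_comments: int, t1_up: int, t1_comments: int, title: str) -> bool:
    return composite(t0_up, t0_comments, t1_up, t1_comments, title) >= MOMENTUM_FLOOR
```

Under these assumptions a thread with no movement at all still passes when its title matches the intent pattern (composite 5), while an off-theme thread needs at least one new upvote or comment.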

-30%

“Title-only intent matching cut the false-positive rate from selftext-included matching by roughly a third on our internal dataset. The selftext arg stays in the function signature for future use but is ignored today.”

scripts/ripen_reddit_plan.py, _intent_boost() docstring

Gate 3: draft (the LLM relevance check)

Even on a thread that cleared momentum and shows on-theme title intent, there is one more discard step. The draft prompt asks the LLM to decide whether the thread's actual audience has a plausible bridge to the project being represented. Plenty of threads pass the algorithmic gates and still produce token-overlap false positives: an r/programming thread about “best tool for X” that turns out to be about C++ build systems, when the project being represented is a tweet-ghostwriting service. The LLM can read that, and when it does, the candidate is marked status='failed' with reason draft_gate_omit.

The reason this is treated as permanent and not transient: the same dead thread would otherwise keep clearing momentum (engagement on the thread is real, just not for our audience) and the pipeline would re-pay the fetch and gate cost every cycle on a decision that always lands the same way. The one-strike rule went live on 2026-05-07 and dropped the recurring spend on permanently-omitted threads to zero.

Gate 4: post (where the queue earns its keep)

The post stage drives a real browser through CDP to submit the reply. Failures here split into two buckets, and the bucket determines whether the discovery cost is recovered or wholesale lost.

CDP failure classification (scripts/post_reddit.py)

Permanent failures mark status='failed' so they never re-enter the cascade. Transient failures stay status='pending', bump attempt_count, and get salvaged by the next cycle's Phase 0 as long as the row is still inside FRESHNESS_HOURS=24 and below MAX_ATTEMPTS=3. If the salvaged row's draft is younger than DRAFT_TTL_MIN=60, the LLM redraft is skipped entirely. That is where the persistent queue earns its place in the architecture: without it, every transient browser crash would discard the discover+ripen+draft sunk cost.
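The two buckets and the draft TTL can be sketched as follows. This is an illustration under stated assumptions, not the body of _db_mark_candidate_attempt(): apart from draft_gate_omit, the failure-reason labels are hypothetical, and the Candidate class is a simplified stand-in for a queue row.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

DRAFT_TTL_MIN = 60  # drafts younger than this are reused on salvage (from the text)

# Only "draft_gate_omit" is named in the article; the other labels are HYPOTHETICAL.
PERMANENT_REASONS = {"draft_gate_omit", "thread_locked", "thread_deleted"}


@dataclass
class Candidate:
    thread_url: str
    status: str = "pending"
    attempt_count: int = 0
    last_failure_reason: Optional[str] = None
    last_attempt_at: Optional[datetime] = None
    drafted_at: Optional[datetime] = None


def mark_attempt(row: Candidate, reason: str, now: datetime) -> Candidate:
    """Record a failed attempt; permanent reasons leave the cascade for good."""
    row.attempt_count += 1
    row.last_failure_reason = reason
    row.last_attempt_at = now
    if reason in PERMANENT_REASONS:
        row.status = "failed"
    # Transient failures stay 'pending' so a later Phase 0 can salvage them.
    return row


def needs_redraft(row: Candidate, now: datetime) -> bool:
    """Skip the LLM redraft while the existing draft is inside the TTL."""
    if row.drafted_at is None:
        return True
    return now - row.drafted_at >= timedelta(minutes=DRAFT_TTL_MIN)
```

The design point is that a transient failure changes only bookkeeping fields, so the sunk discover, ripen, and draft work stays attached to the row.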

Gate 5: salvage (the cleanup that bounds the bill)

Phase 0 of the next cycle does two things. It hard-expires any pending row older than 24 hours (no audience left to reach), and it re-assigns still-fresh orphan rows to the current batch_id so they get one more retry. The 30-minute retry backoff stops the pipeline from re-attempting a freshly-failed candidate inside the same cycle window; the failure reason needs time to stabilize before we know whether it was transient.
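A minimal in-memory sketch of Phase 0, assuming rows are plain dictionaries rather than Postgres rows. The discovered_at key is a hypothetical name for whatever timestamp the real schema uses to measure age; the constants are the ones quoted in this article.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

FRESHNESS_HOURS = 24     # pending rows older than this are expired (from the text)
MAX_ATTEMPTS = 3         # rows at or above this are not retried (from the text)
RETRY_BACKOFF_MIN = 30   # minimum wait after a failed attempt (from the text)


def phase0(rows: List[Dict[str, Any]], batch_id: str, now: datetime) -> List[Dict[str, Any]]:
    """Expire stale pending rows, then return the rows salvaged into this batch."""
    salvaged: List[Dict[str, Any]] = []
    for row in rows:
        if row["status"] != "pending":
            continue
        # 'discovered_at' is a HYPOTHETICAL column name used for the age check.
        if now - row["discovered_at"] > timedelta(hours=FRESHNESS_HOURS):
            row["status"] = "expired"
            continue
        if row["attempt_count"] >= MAX_ATTEMPTS:
            continue
        last = row.get("last_attempt_at")
        if last is not None and now - last < timedelta(minutes=RETRY_BACKOFF_MIN):
            continue  # failed too recently; let the failure reason settle
        row["batch_id"] = batch_id
        salvaged.append(row)
    return salvaged
```

Running expiry before salvage means a stale row can never be re-assigned to the current batch, which is what bounds the retry bill.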

The queue schema makes the bookkeeping cheap. One row per thread URL (UNIQUE constraint), every state transition recorded with last_failure_reason, and three indexes carefully chosen for the access patterns Phase 0 actually performs (salvage SELECT, per-cycle COUNT, log_post lookup). Full schema is in scripts/migrate_reddit_candidates.sql.

What the cascade tells you about cost per posted comment

Run this math against your own discipline. If you scroll /new for 30 minutes and pick 3 threads to comment on, you have run the same five gates by eye. The fetch was the scroll. The ripen was the moment you skipped a thread with no replies in an hour. The draft gate was the moment you decided the audience would not bridge. The post gate was the moment your reply went up. The salvage gate was the moment you closed the tab and never came back. The pipeline writes these moments down so the cost is measurable; the manual version feels free until you tally the hour.

A useful rule of thumb from running the OSS pipeline in production: the ratio of fetched threads to posted comments lands somewhere between 30:1 and 80:1 on a tight ICP. Most of the discard happens at ripen and the LLM draft gate. The fetch stage's output is wide on purpose: a narrow fetch starves the gates, handing them too few candidates to choose from. Because fetching is nearly free, a wide fetch with strict gates costs roughly the same as a narrow fetch with loose gates, and the wide version ships better comments.
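To turn the ratio into a cost per posted comment, one way to do the arithmetic is to charge every candidate that reaches a stage and divide the total by the comments that actually shipped. Every number in the example below is a made-up placeholder chosen only to land inside the 30:1 to 80:1 range; none of them is a measured value from the pipeline.

```python
from typing import Dict


def cost_per_posted_comment(
    reached: Dict[str, int], unit_cost: Dict[str, float], posted: int
) -> float:
    """Total spend across all candidates at every stage, divided by posted comments."""
    if posted <= 0:
        raise ValueError("posted must be positive")
    total = sum(reached[stage] * unit_cost[stage] for stage in reached)
    return total / posted


# HYPOTHETICAL cycle: 150 fetched, 3 posted (a 50:1 ratio).
example_reached = {"fetch": 150, "ripen": 150, "draft": 20, "post": 4}
# HYPOTHETICAL unit costs in thousandths of a dollar per candidate.
example_unit_cost = {"fetch": 0.0, "ripen": 1.0, "draft": 50.0, "post": 10.0}

example_cost = cost_per_posted_comment(example_reached, example_unit_cost, posted=3)
```

In this invented example the draft stage accounts for 1,000 of the 1,190 units spent, even though only 20 of the 150 fetched threads ever reach it. Plug in your own counts and unit costs to see where your bill lands.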

If you are doing this without the pipeline

The minimum useful version is shorter than the OSS repo. Snapshot a thread's upvotes and comment count when you find it. Wait at least 5 minutes. Re-fetch. If neither number moved and the title is not literally asking for a tool, skip. If the title is asking for a tool, draft anyway. Keep a list of threads you have already decided against. Re-check that list before drafting anything new.

That single discipline (the do-not-revisit list) recovers most of the dollar-equivalent that the persistent queue gives you. Most people do discovery without one and end up reconsidering the same dead thread three times in a week.
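The manual rule fits in one function. This is a sketch of the procedure described above, with hypothetical names; it assumes that a thread skipped for showing no movement also goes on the do-not-revisit list, which is one reading of "threads you have already decided against".

```python
from typing import Set


def should_draft(
    thread_url: str,
    up_t0: int,
    comments_t0: int,
    up_t1: int,
    comments_t1: int,
    title_asks_for_tool: bool,
    rejected: Set[str],
) -> bool:
    """Apply the two-snapshot rule and maintain the do-not-revisit list."""
    if thread_url in rejected:
        return False  # already decided against; do not reconsider
    moved = (up_t1 != up_t0) or (comments_t1 != comments_t0)
    if not moved and not title_asks_for_tool:
        rejected.add(thread_url)  # ASSUMPTION: a skip is recorded as a rejection
        return False
    return True
```

Persist the rejected set between sessions (a text file with one URL per line is enough) so the check survives closing the tab.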

Want this running on your subreddits without you scrolling /new?

We run the cascade as a done-for-you brand-awareness service: priced at $2 per 1,000 impressions and $50 per 1,000 site visits, no retainer.

Frequently asked questions

What does 'discovery overhead' actually mean in a Reddit pipeline?

Discovery overhead is the total work (API calls, snapshot storage, polling cycles, LLM tokens, retry attempts) you pay to surface one thread you will actually comment on. The misleading thing is that 'comment count divided by hours spent' makes it look like a single denominator. In practice the cost is spread across at least five gates, and the dominant share lands on candidates that get fetched, snapshotted, sometimes drafted, and never posted. The pipeline only looks cheap if you only count the threads that survived.

How many subreddits does one S4L discover cycle hit?

Up to 50. The fetch step batches subreddits into groups of 5 (Reddit's old.reddit JSON API accepts r/a+b+c+d+e multi-sub requests) and caps the run at 10 batches per cycle to stay inside the public rate limit. See fetch_reddit_threads() in scripts/find_threads.py. With a 4-second base delay and exponential backoff to 30 seconds on a 429, a clean cycle uses 10 to 15 HTTP requests; a throttled cycle bails at 3 consecutive failures with whatever it collected so far.

What is the momentum gate and why is the floor set at 1?

The momentum gate is a two-snapshot delta check: capture upvotes and comment_count at T0, sleep 5 to 30 minutes, capture again at T1, compute composite = (T1_score - T0_score) + 4 * (T1_comments - T0_comments) + 5 if the title matches a product-discussion regex, else 0. If composite is at least 1, the thread proceeds. Below 1, it is dropped. The floor was 5 originally and dropped to 1 once the intent boost shipped, because perfectly on-theme but quiet threads (one upvote in 30 minutes on a title literally asking for a tool) were getting filtered out for the wrong reason. Code at scripts/ripen_reddit_plan.py.

Why does the intent regex match the title only, not the selftext?

Earlier versions matched both. Reddit selftext can be thirty thousand characters of narrative (camping ghost stories, long product reviews, debugging logs) where the phrases 'looking for' or 'recommend' appear in their everyday sense. False-positive rates above 30 percent collapsed the gate's signal. Titles are short, deliberate, and intent-rich. The selftext argument is still in the function signature for future use, but it is ignored today. See _intent_boost() in scripts/ripen_reddit_plan.py.

What is draft_gate_omit and why does it count as 'permanent'?

After momentum, the candidate goes to an LLM relevance check that asks 'is there a plausible bridge between this thread's audience and this project'. When the LLM answers no, the candidate is marked status='failed' with reason='draft_gate_omit'. It is treated as permanent because the same dead thread would otherwise keep clearing the momentum gate cycle after cycle (engagement on that thread is real, just not for our audience), and we would burn roughly five cents per cycle re-paying the fetch and gate cost on a decision that always lands the same way. The one-strike rule shipped 2026-05-07; see _db_mark_candidate_attempt() in scripts/post_reddit.py.

What is the cheapest line to cut if I want to lower discovery overhead?

The LLM draft step. Fetch and momentum are cheap (HTTP and arithmetic). Draft tokens dominate. Two levers help. First, DRAFT_TTL_MIN=60 means a salvaged candidate whose draft was written under an hour ago re-uses the existing text instead of paying for a redraft (mirrors twitter_candidates). Second, marking draft_gate_omit as permanent stops re-paying the same omit cost on the next cycle. Both are visible in scripts/post_reddit.py. If you are doing this manually, the equivalent is keeping a do-not-revisit list of threads your judgment has already rejected.

How many fresh discovery searches does the pipeline allow per cycle?

Three, by default. The constant is MAX_DISCOVER_SEARCHES in scripts/post_reddit.py, and it can be overridden through the SAPS_REDDIT_MAX_SEARCHES environment variable. It was 2 before May 2026, then bumped to give each cycle a wider top-of-funnel and let the draft-gate-omit feedback report steer rephrasings without starving the next attempt of fresh angles. The cap exists because adding searches without better signal at the downstream gates just multiplies cost without lifting the surviving comment count.
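An environment override of this kind is commonly read as in the sketch below. The helper name and the fallback behavior for malformed or non-positive values are assumptions; the article only states the default of 3 and the variable name.

```python
import os
from typing import Mapping, Optional

DEFAULT_MAX_DISCOVER_SEARCHES = 3  # default quoted in the article


def max_discover_searches(env: Optional[Mapping[str, str]] = None) -> int:
    """Read SAPS_REDDIT_MAX_SEARCHES, falling back to the default.

    ASSUMPTION: malformed or non-positive values fall back to the default;
    the real script may handle them differently.
    """
    source = os.environ if env is None else env
    raw = source.get("SAPS_REDDIT_MAX_SEARCHES", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_DISCOVER_SEARCHES
    return value if value > 0 else DEFAULT_MAX_DISCOVER_SEARCHES
```

Passing a plain dictionary as env makes the helper easy to exercise without touching the real process environment.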

Where is the actual queue stored and what does each row record?

Postgres, table reddit_candidates. One row per discovered thread URL (UNIQUE constraint on thread_url). Each row records the T0/T1 metrics, the composite delta, the draft text and its engagement style with drafted_at for the 60-minute TTL, the queue status (pending, posted, skipped, expired, failed), attempt_count, last_attempt_at, last_failure_reason, batch_id for cycle scoping, and post_id once it links to the posts table. Schema in scripts/migrate_reddit_candidates.sql. The point of persisting all of it is that the next cycle's Phase 0 salvages still-fresh rows instead of treating a CDP timeout as wholesale loss of the discover+ripen+draft sunk cost.

Does manual discovery have the same five gates?

Yes, but you do not see them. When a human scrolls /r/ClaudeAI/new for 20 minutes and picks 'three threads worth replying to', they are silently running the same five filters: fetch (skim), freshness (skip anything older than a day), momentum (skip threads with zero comments in an hour), intent (skip threads not asking for a tool), and relevance (skip threads where your reply would not bridge). The overhead is hidden in the time it takes a brain to do those filters at human read speed. The pipeline writes it down because that is what makes the cost measurable; the manual equivalent feels free until you tally the hours.

What is the FRESHNESS_HOURS=24 cap protecting against?

Stale candidates that survived earlier gates but no longer have an audience. A thread that passed momentum at hour 2, failed CDP posting at hour 6 (browser crashed), and is still sitting pending at hour 28 is unlikely to get any new eyeballs even if we successfully post now. The Phase 0 salvage step hard-expires pending rows past 24 hours. Reddit's own decay is longer than Twitter's (FRESHNESS_HOURS=6 there), so the cap is wider, but past 24 hours the cost-benefit of retrying flips. See the comment at the top of scripts/post_reddit.py.