Claude Opus 4.7 for Reddit and Twitter engagement: same model, two architectures
Run Opus 4.7 against both Reddit and X/Twitter at production volume and you discover the platforms force two different session shapes. Reddit gets one fresh Claude session per comment, $4.73 average, 46.6 seconds wall-clock. Twitter gets one batched MCP browser session over the whole pending queue, $11.18 average, 140.7 seconds. This page is about why the split exists, what the telemetry looks like, and the orchestration boundary that makes both safe to run on real accounts.
Use two separate orchestrators, one per platform, each pinned to its own strict-MCP config and its own logged-in browser profile. On Reddit, spawn a fresh Claude Opus 4.7 session per comment, decide reply-or-skip in JSON, and have the orchestrator (not Claude) post via CDP. On Twitter, spawn one Claude Opus 4.7 session for the whole pending queue and let it drive the same MCP browser through navigate-snapshot-type-click for every reply, because the persistent browser context is what carries cookies and CSRF state between requests.
Verified production telemetry from the claude_sessions table on 2026-05-08, covering every run since the 2026-04-21 cutover to claude-opus-4-7: 4,621 Reddit sessions at $4.73 average and 232 Twitter sessions at $11.18 average. Reference implementation: github.com/m13v/social-autoposter (scripts/engage_reddit.py and skill/engage-twitter.sh).
The thesis
Most guides about Claude for social media engagement assume one workflow: connect a model, drive a browser, post replies. That framing breaks the moment you actually run the loop on more than one platform on real accounts. Reddit and Twitter look similar from a thousand feet, both are public discussion threads with a reply button, but underneath them the orchestration pressures are different enough that they pull the architecture apart.
Reddit lets you fetch the full thread context cheaply over a plain HTTP request. The actual posting of a reply has to go through a logged-in browser, but reading what to reply to does not. So the cheapest correct shape is to fetch the thread in Python, hand the parsed JSON to a fresh Claude session, take its one-JSON-object answer, and have the orchestrator post via CDP. Each comment gets its own session. Each session is short. The model never accumulates context across replies.
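That per-comment shape can be sketched in a few lines of Python. Everything here is illustrative: the prompt string, the `post_via_cdp` helper, and the exact `claude` invocation are stand-ins for the repo's real code; the load-bearing parts are the fresh session per row and the JSON-only contract check.

```python
import json
import subprocess

def parse_decision(raw: str) -> dict:
    """Enforce the JSON-only contract: the model's ENTIRE output must be one
    JSON object with an 'action' of 'reply' or 'skip'."""
    decision = json.loads(raw.strip())   # any prose around the JSON raises here
    if decision.get("action") not in ("reply", "skip"):
        raise ValueError(f"unexpected action: {decision.get('action')!r}")
    if decision["action"] == "reply" and not decision.get("text"):
        raise ValueError("reply decision missing 'text'")
    return decision

def engage_one(row: dict) -> None:
    """One pending comment, one fresh Claude session, one decision, one post."""
    prompt = f"## Reply data\n{json.dumps(row)}\nOutput ONE JSON object."  # stand-in
    raw = subprocess.run(
        ["claude", "-p", prompt],   # fresh session per row; real flags differ
        capture_output=True, text=True, check=True,
    ).stdout
    decision = parse_decision(raw)
    if decision["action"] == "reply":
        post_via_cdp(row["thread_url"], decision["text"])  # hypothetical helper

def post_via_cdp(thread_url: str, text: str) -> None:
    # The orchestrator-owned CDP submit lives in the repo, not here.
    raise NotImplementedError
```

Because the orchestrator owns the post, a malformed model output fails loudly in `parse_decision` instead of landing a half-drafted reply on a live account.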
Twitter does not let you do that. To read the parent tweet, the siblings, the conversational beat being replied to, you have to be inside a logged-in session because the public anonymous read view is throttled and partial. To post the reply you need the same logged-in session. To capture the URL of the just-posted reply you have to navigate to your own profile inside the same session. The whole flow is anchored to a persistent browser context with cookies and CSRF state, and tearing that context down between replies pays a Chromium-bootstrap cost on every row. So the cheapest correct shape on Twitter is the opposite of Reddit: one long-lived MCP session over the whole queue.
What the telemetry says
The numbers below come from a SQL pull on 2026-05-08 against the claude_sessions table, scoped to model = 'claude-opus-4-7' and the two scripts that drive engagement on each platform, since the 2026-04-21 cutover from 4.6.
| Pipeline | Sessions | Avg cost | Avg duration | Avg cache_read | Avg output |
|---|---|---|---|---|---|
| engage_reddit | 4,621 | $4.73 | 46.6 s | 324,388 | 3,922 |
| engage-twitter-phaseB | 232 | $11.18 | 140.7 s | 2,348,969 | 9,627 |
source: claude_sessions where model='claude-opus-4-7' and script in ('engage_reddit', 'engage-twitter-phaseB') and started_at >= '2026-04-21', pulled 2026-05-08
The single number worth pausing on is the cache_read column. Twitter sessions consume 7.2x more cache_read tokens per session than Reddit sessions. That is the persistent MCP browser making itself felt. Every snapshot the model takes inside a Twitter session reads the system prompt, the tool definitions, and the accumulated conversation back from cache. Reddit sessions take one Bash call and one decision; Twitter sessions take a dozen snapshots, multiple element-ref refreshes after typing, and a final navigation to capture the reply URL.
That is also why Twitter sessions average 9,627 output tokens vs Reddit's 3,922. The model is doing more turn-taking inside the Twitter session: each reply requires a small chain of tool calls (navigate, snapshot, click, type, snapshot, click, snapshot) and each tool call counts as another assistant turn that emits tokens.
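The ratios quoted above are easy to sanity-check directly from the table:

```python
# Per-session averages copied from the claude_sessions pull above.
reddit  = {"cost": 4.73,  "duration_s": 46.6,  "cache_read": 324_388,   "output": 3_922}
twitter = {"cost": 11.18, "duration_s": 140.7, "cache_read": 2_348_969, "output": 9_627}

cache_ratio  = twitter["cache_read"] / reddit["cache_read"]   # ~7.2x
output_ratio = twitter["output"] / reddit["output"]           # ~2.5x
cost_ratio   = twitter["cost"] / reddit["cost"]               # ~2.4x
```

Note the spread: cache_read is 7.2x but cost is only about 2.4x, which is consistent with cache reads being priced well below fresh input tokens, so the batched Twitter session's heavy snapshot traffic costs less than the raw token counts suggest.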
What flows where, on each side
The Reddit flow is one row, one session, one decision, one post. The Twitter flow is N rows, one session, N decisions, N posts. Both diagrams below show the same five-actor cast (cron, orchestrator, model, browser, database) but the message pattern is fundamentally different.
Reddit: per-comment session, orchestrator-posts pattern
Twitter: batched MCP session, model-drives-browser pattern
Note where the model lives in each diagram. On Reddit, Opus 4.7 appears once, returns one JSON object, and exits. The orchestrator owns the post. On Twitter, Opus 4.7 stays in the loop for every row and personally calls the MCP tools that navigate, type, and click. The Reddit pattern is closer to a stateless function call; the Twitter pattern is closer to an agent.
The shell traces
The shape of each pipeline is easiest to read off the actual log tail. Here is what one tick of each looks like in production.
engage_reddit.py: one row, one session
engage-twitter.sh: one batch, one session
The two prompts, side by side
The structural difference between the prompts is the giveaway. Reddit's prompt fits in a screen and ends with a literal JSON contract. Twitter's prompt is roughly 3x longer because it has to teach the model how to drive an MCP browser through a flaky, rate-limited UI without wedging on a Loading state.
Reddit prompt: short, decision-only
# build_prompt() in scripts/engage_reddit.py
You are helping draft a reply to a comment on reddit
on behalf of the user's account.
## Reply data
{ "id": 18472, "platform": "reddit", "their_author": "...",
"their_content": "...", "thread_url": "...",
"post_id": "1cqf...", "project_name": "S4L" }
## Recent archetypes (vary away from these)
critic, storyteller, data_point_drop
## Execution steps
1. Fetch thread cheaply via Bash:
python3 scripts/reddit_tools.py fetch <thread_url>
2. Decide reply or skip. Output ONE JSON object:
{"action": "reply", "text": "...",
"engagement_style": "STYLE_NAME"}
CRITICAL: Your ENTIRE output must be ONLY the JSON object.
No prose, no markdown, no preamble.
Twitter prompt: long, browser-orchestration heavy
# PHASE_B_PROMPT in skill/engage-twitter.sh
You are the Social Autoposter Twitter/X engagement bot.
Read SKILL.md for the full workflow.
## Respond to pending Twitter/X replies (12 total)
CRITICAL: Use the SAME mcp__twitter-agent__ browser session
for every row. Do NOT call scripts/twitter_browser.py reply
(it launches a second Chromium against the same profile dir
and wedges x.com on a Loading state).
MANDATORY reply flow for every item:
Step 1: python3 reply_db.py processing ID
Step 2: NAVIGATE to their_comment_url, snapshot, read context.
Resolve parent tweet via lookup_post.py twitter <id>.
Step 3: Draft reply (1-2 sentences, match parent language).
Step 3a: ACTIVE CAMPAIGN SUFFIX. Flip a coin at sample_rate.
On heads, append the literal suffix verbatim.
Step 4: snapshot, click reply textbox ref, type, snapshot,
click [data-testid="tweetButtonInline"], wait 3s,
snapshot to verify post landed.
Step 4g: navigate to /m13v_/with_replies, capture REPLY_URL.
Step 5: python3 reply_db.py replied ID "text" URL STYLE
Step 5a: campaign_bump.py if CAMPAIGN_FIRED=1.
After every 10 replies: python3 reply_db.py status
The Reddit prompt's contract with Opus 4.7 is JSON-only, enforced by "Your ENTIRE output must be ONLY the JSON object." The Twitter prompt's contract is MCP-tool-only, enforced by step-by-step instructions on which tool to call and what to do when a click fails. That second contract is more fragile and exists because there is no cheaper option: the orchestrator cannot drive the Twitter browser itself without duplicating the MCP server, which would create the wedged-on-Loading bug we explicitly warn against in the prompt.
The four design decisions, in order
If you were going to build the same pattern from scratch on a third platform tomorrow, this is the sequence we would walk through. Each step locks in a constraint that the next one inherits.
Building a per-platform Opus 4.7 engagement loop
1. Pick the session lifetime. Reddit: one session per row. Twitter: one session per batch. The platform constraint is browser-profile contention, not model behavior.
2. Wire strict-MCP per platform. reddit-agent-mcp.json declares only the reddit-agent server; twitter-agent-mcp.json declares only the twitter-agent server. The session physically cannot drive the wrong browser.
3. Decide who appends the suffix. Reddit: the orchestrator, in Python (CDP-posted, tool-layer enforced). Twitter: the LLM appends it by hand inside the MCP session (browser_type has no injection point).
4. Log per-session telemetry. Both pipelines write to claude_sessions with model, total_cost_usd, duration_ms, and cache_read_tokens. That table is what makes the Reddit-vs-Twitter cost shape visible at all.
Why the campaign suffix lives in two different places
One concrete consequence of the architecture split is where the A/B campaign suffix gets appended. On Reddit the orchestrator looks up the active campaign in Postgres after the model returns its JSON, flips a coin at sample_rate in Python, and appends the literal suffix to the drafted text before the CDP submit. The model never sees the suffix, never knows there is an A/B test running, and cannot mention either thing on Reddit. The literal text rule is enforced at the tool layer.
On Twitter that injection point does not exist. The MCP browser_type call is what actually puts characters into the textbox, and it is being driven by the model. So we precompute the active campaign suffix in the shell, embed the literal string in the prompt, and tell the model to flip a coin at sample_rate and append on heads. This is structurally less safe. A model that hallucinates or paraphrases the suffix will land paraphrased text on Twitter while the database row records the paraphrase as the actual content. The mitigation is the literal-text rule spelled out in capitals ("NEVER paraphrase or reformat the suffix"), and the post-hoc audit job that compares replied content against the campaign suffix regex-style.
Why the launchd cadences are 18x apart
The Reddit loop fires every 600 seconds. The Twitter loop fires every 3 hours. Two reasons for the gap. First, Reddit sessions finish in 46.6 seconds on average, so a 10-minute cadence has plenty of headroom. Twitter sessions average 140.7 seconds and can stretch past 5 minutes in larger batches, so a 10-minute Twitter cadence would mean overlapping ticks racing for the same Chromium profile dir.
Second, the Twitter Chromium profile is shared with twitter_post_plan.py (which posts originals), the Twitter DM pipeline, and Phase A's notification scan. A 3-hour cadence gives the lock file at /tmp/twitter-browser.lock time to drop and lets the queue accumulate enough rows that the batched-session amortization actually pays off. Reddit has its own profile and its own lock at /tmp/reddit-agent.lock, and a 10-minute cadence means a thread that mentions us at minute 3 gets a reply by minute 13 instead of by hour 4.
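The lock-file discipline those cadences depend on can be sketched with a non-blocking flock. This is an assumption about mechanism, not a copy of the repo's locking code: the point is that an overlapping tick exits immediately instead of racing the previous tick for the same browser profile.

```python
import fcntl
import os

LOCK_PATH = "/tmp/reddit-agent.lock"   # per-platform lock, per the page

def acquire_tick_lock(path: str):
    """Try to take the tick lock without blocking. Returns an open fd to hold
    for the duration of the tick, or None if another tick owns the profile."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None        # another tick holds the profile; skip this run
    return fd              # hold until the tick finishes, then os.close(fd)
```

Because flock releases automatically when the fd closes (including on process death), a crashed tick cannot wedge the cadence the way a stale pid-file would.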
What this looks like in the open-source repo
Every claim on this page traces to a specific file in the public repo at github.com/m13v/social-autoposter:
- Reddit per-comment loop: scripts/engage_reddit.py (one Claude session per pending row).
- Twitter batched loop: skill/engage-twitter.sh (one Claude session per cron tick, BATCH_SIZE=500).
- Strict-MCP configs: ~/.claude/browser-agent-configs/reddit-agent-mcp.json and twitter-agent-mcp.json, each declaring exactly one MCP server.
- Telemetry table: scripts/log_claude_session.py writes one row to claude_sessions per spawned Claude invocation, with model, total_cost_usd, duration_ms, and the cache token columns this page is sourced from.
- Launchd cadences: ~/Library/LaunchAgents/com.m13v.social-engage-reddit.plist (StartInterval 600) and com.m13v.social-engage-twitter.plist (StartCalendarInterval, 3h spacing).
The lesson, if there is one
Picking Opus 4.7 is the easy part. The actual orchestration decisions (where the model lives in the loop, who owns the post, where the campaign suffix gets injected, how long the browser context survives) are not symmetric across platforms. Reddit and Twitter look like the same problem until the cookies and CSRF state push back, and once they do, two pipelines on one model is the cheapest correct answer.
Both pipelines share the same model setting in ~/.claude/settings.json, the same stricter literal instruction following that landed with 4.7, and the same hard split between "model proposes" and "deterministic code disposes." The part that diverges is where the boundary between proposal and disposition lives, and whether the model gets to keep its fingerprints on the browser between rows.
Want this loop running on your accounts?
Twenty minutes on a call. We walk through the platforms you care about, what the per-platform session shape would look like for your account state, and whether the per-comment Reddit pattern or the batched Twitter pattern fits your traffic.
Implementation questions, answered with specifics
Why not run one orchestrator that covers both Reddit and Twitter?
Each platform requires its own logged-in browser profile, and a single Claude session that has both MCP servers attached can drive either browser at any moment. That is what gets accounts cross-contaminated: a tool call that was supposed to fetch a Reddit thread navigates the Twitter tab instead, and a draft meant for r/SaaS lands as a tweet. Two orchestrators with two strict-MCP configs (one wired to reddit-agent, one wired to twitter-agent) make that physically impossible.
Why is the average Twitter session more expensive than the Reddit session, even though Twitter batches?
Cache reads. The Reddit session fetches a thread once via a cheap Bash subprocess, drafts a reply, and exits. The Twitter session lives inside an MCP browser that has to navigate to the notification, snapshot the DOM, navigate to the thread, snapshot again to read context, navigate to the reply textbox, snapshot again to refresh element refs, post, then navigate to /m13v_/with_replies to capture the URL. Each snapshot is loaded from the prompt cache, and on Opus pricing those cache_read tokens add up. Telemetry: Twitter sessions average 2,348,969 cache_read tokens vs 324,388 for Reddit, a 7.2x ratio.
Where does the campaign suffix get appended on each platform?
Reddit: in Python, after the model returns its JSON. The orchestrator owns the post via CDP, so it can append the literal suffix and submit. The model never sees it. Twitter: the LLM appends it by hand inside the same MCP session. The phase-B reply path goes through mcp__twitter-agent__browser_type, which has no tool-level injection point, so the prompt embeds the literal suffix and instructs the model to flip a coin at sample_rate, append on heads, leave on tails. The Reddit pattern is safer (tool layer enforced) but only possible because the orchestrator is the one calling page.click.
How often does each loop fire?
Reddit fires every 600 seconds (10 minutes) via launchd. Twitter fires every 3 hours via a StartCalendarInterval block. The cadence gap is not arbitrary. Reddit sessions are short (46.6-second average) and isolated, so frequent ticks are cheap. Twitter sessions average 140.7 seconds and tie up the same Chromium profile that twitter_post_plan.py and the DM pipeline also share, so a 10-minute Twitter cron would mean overlapping ticks contending for that profile. Three hours is the cadence at which the lock has time to drop and the queue has time to fill.
Why does engage_reddit.py spawn a new UUID per comment instead of reusing one session?
Two reasons, both observed in production. First, context contamination: a session that has drafted 50 replies starts pattern-matching its own prior outputs and the seventh reply visibly echoes the second. Second, blast radius from the Anthropic Usage Policy classifier: once it refuses inside a session, every subsequent reply inherits the refusal and burns $0.05 to $0.30 of fully wasted spend. A fresh UUID per comment forces real variance and isolates refusals to one row.
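The per-row isolation reduces to generating a session identifier that is never reused. A minimal sketch, with the caveat that the `--session-id` flag name is an assumption about the CLI spawn; the repo's actual helper may wire the session differently.

```python
import uuid

def spawn_commands(pending_rows: list) -> list:
    """Build one claude invocation per pending comment, each with its own
    UUID, so refusals and context drift stay confined to a single row."""
    cmds = []
    for row in pending_rows:
        sid = str(uuid.uuid4())   # fresh every time, never recycled
        cmds.append(["claude", "-p", f"reply to comment {row['id']}",
                     "--session-id", sid])
    return cmds
```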
Can the Twitter pattern be retrofitted to per-tweet sessions?
We tried. The cost per session dropped (sessions are shorter when they only handle one reply) but total cost across the queue went up because each session paid the MCP browser bootstrap cost from scratch. The Chromium launch, the cookie load, the first snapshot of x.com all became per-reply overhead instead of amortized across the batch. We reverted. The batched MCP pattern is the right answer for Twitter despite looking less clean than the Reddit per-comment loop.
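The amortization argument is just fixed-versus-marginal cost. The dollar figures below are assumptions for the sketch, not telemetry; working in cents keeps the arithmetic exact.

```python
# Each MCP session pays a fixed browser-bootstrap cost (Chromium launch,
# cookie load, first snapshot of x.com) plus a marginal cost per reply.
BOOTSTRAP_CENTS = 300   # assumed fixed cost per session
PER_REPLY_CENTS = 70    # assumed marginal cost per reply

def batched_cost(n: int) -> int:
    """One long-lived session over the whole queue: bootstrap paid once."""
    return BOOTSTRAP_CENTS + n * PER_REPLY_CENTS

def per_tweet_cost(n: int) -> int:
    """A fresh session per reply: bootstrap paid on every row."""
    return n * (BOOTSTRAP_CENTS + PER_REPLY_CENTS)
```

For any queue longer than one, the batched shape wins by (n - 1) bootstraps, which is why per-tweet sessions looked cheaper per session but more expensive per queue.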
What does the model never get to do, on either platform?
Three things, both platforms. It never writes to the database directly: every status update goes through reply_db.py, which validates state transitions. It never sees the campaign list or the campaign suffix on Reddit; the orchestrator looks up the active campaign in Python after the model decides reply-or-skip. On both platforms it never sees the project-level voice block of OTHER projects when drafting; only the matched project's voice is injected. The pattern is consistent: model proposes, deterministic code disposes.
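The "model proposes, deterministic code disposes" split is what reply_db.py-style validation buys. A minimal sketch of the idea, with the transition set inferred from the statuses this page mentions (the `skipped` state is an assumption):

```python
# Only these status transitions are legal; everything else raises before
# touching the database, no matter what the model asked for.
ALLOWED = {
    ("pending", "processing"),
    ("processing", "replied"),
    ("processing", "skipped"),   # assumed: a processing row may be skipped
}

def transition(row: dict, new_status: str) -> dict:
    old = row["status"]
    if (old, new_status) not in ALLOWED:
        raise ValueError(f"illegal transition {old} -> {new_status}")
    return {**row, "status": new_status}
```

A model that tries to mark a pending row as replied without passing through processing fails the validator, so the database can never record a reply the browser never made.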
Why no LinkedIn coverage in the same orchestrator?
LinkedIn flagged scripted browser activity in the past, so we keep our public marketing surface honest about which platforms this product engages on. Reddit and X/Twitter (plus GitHub for issue triage) are the engagement surfaces. The orchestrator pattern would technically port to other platforms, but we are not running it there.
Other pages on the same engagement stack
Adjacent reading
Claude Opus 4.7 for Reddit comment automation, in production
The Reddit half of the picture, in detail: per-comment sessions, the seven engagement archetypes, the prompt anatomy, and what 4.7 broke that 4.6 used to forgive.
Automate Reddit DM replies from the CLI
The DM half of Reddit: how engage-dm-replies.sh sequences first-touch, second-touch, and third-touch DMs without burning the inbox.
Reddit marketing without getting AI-flagged
What Reddit's anti-AI heuristics actually look like when you watch them fire on real accounts, and the prompt-side patterns that keep replies under the radar.