A social media agent orchestrator is really a lock manager with good taste in what runs when.
Every article about "social media agent orchestrators" talks about one meta-agent delegating to worker agents. That framing is fine until you run it on a real machine. The moment two Claude sessions launch Chrome against the same ~/.claude/browser-profiles/reddit directory, one of them corrupts the user-data-dir and exits. S4L fixes that with roughly 70 lines of shell.
The SERP version is missing the interesting part
Read the top five results for this keyword and you get the same summary. Worker agents write, schedule, monitor. A meta-agent delegates. Productivity goes up 20x. Buffer, Sprout, Optimizely, MindStudio, LangChain: nearly identical framing. None of them name a single real failure mode of running more than one agent at once.
Here is the one that eats a weekend: two agent loops start within the same minute, both decide the Reddit profile is the right place to be, and both try to launch Chrome against the same persistent userDataDir. The second one fails with SingletonLock or corrupts the profile. If it corrupts it, you lose cookies, the next scheduled run logs in as nobody, and Reddit rate-limits the login attempts. Now your orchestrator is ghosting the inbox.
This is an OS-level problem, not an LLM-level problem. You do not solve it with a better prompt. You solve it with a lock.
What the orchestrator actually looks like
Four different Reddit pipelines, all scheduled by launchd, all eventually needing to drive the same Chrome profile. The reddit-browser lock is the chokepoint every one of them has to pass through before it can talk to the profile, the playwright-mcp stdio server, or the database.
Every Reddit pipeline funnels through the reddit-browser lock
Four numbers that describe the orchestrator
The lock stack, in full
This is skill/lock.sh. Every pipeline sources it. Every pipeline calls acquire_lock twice at the top. Nothing else is needed.
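The file itself is not reproduced here. What follows is a minimal sketch reconstructed from the behavior described in this piece: mkdir plus a pid file, a stack of held locks, staleness detection, and skip-on-timeout. The `acquire_lock` and `_SA_LOCK_DIRS` names come from the text; the `/tmp/sa-lock-` prefix and the helper function names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch of the lock stack -- NOT the real skill/lock.sh, just a
# reconstruction of its documented behavior.

_SA_LOCK_DIRS=""

# Release every lock this run holds (newest first), then clear the stack.
_sa_release_all() {
  for d in $_SA_LOCK_DIRS; do
    rm -rf "$d"
  done
  _SA_LOCK_DIRS=""
}
trap _sa_release_all EXIT INT TERM HUP

# Portable mtime: GNU stat (Linux) first, BSD stat (macOS) as fallback.
_sa_mtime() {
  stat -c %Y "$1" 2>/dev/null || stat -f %m "$1" 2>/dev/null
}

# acquire_lock NAME [TIMEOUT] -- poll every 10s up to TIMEOUT (default
# 3600s); if the lock never frees, exit 0 and let launchd retry later.
acquire_lock() {
  _name=$1
  _timeout=${2:-3600}
  _dir="/tmp/sa-lock-$_name"
  _waited=0
  until mkdir "$_dir" 2>/dev/null; do
    _pid=$(cat "$_dir/pid" 2>/dev/null)
    _age=$(( $(date +%s) - $(_sa_mtime "$_dir" || echo 0) ))
    # Stale if the pid file is missing, the holder is dead, or the
    # lock directory is older than 10,800 seconds (3 hours).
    if [ -z "$_pid" ] || ! kill -0 "$_pid" 2>/dev/null || [ "$_age" -gt 10800 ]; then
      rm -rf "$_dir"
      continue
    fi
    if [ "$_waited" -ge "$_timeout" ]; then
      echo "lock '$_name' still held; skipping this tick" >&2
      exit 0    # skip, do not queue
    fi
    sleep 10
    _waited=$((_waited + 10))
  done
  echo $$ > "$_dir/pid"                  # record holder identity
  _SA_LOCK_DIRS="$_dir $_SA_LOCK_DIRS"   # prepend: release is LIFO
}
```

Sourcing this and calling `acquire_lock` twice at the top of a pipeline gives you the whole two-tier discipline; everything else is convention.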
The call site, verbatim
This is how skill/link-edit-reddit.sh opens. Two lines. Order is not negotiable.
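The original two lines are not reproduced here; under the lock names used elsewhere in this piece, the opening of skill/link-edit-reddit.sh plausibly looks like this. The stub at the top only stands in for sourcing skill/lock.sh so the fragment runs on its own.

```shell
# Stub standing in for '. skill/lock.sh' so this fragment is self-contained;
# the real function is the mkdir+pid lock described in this article.
acquire_lock() { SA_ACQUIRED="${SA_ACQUIRED:-} $1"; }

acquire_lock reddit-browser     # tier 1: the shared browser-profile lock, FIRST
acquire_lock link-edit-reddit   # tier 2: this pipeline's own lock, SECOND
```

Swap the stub for a real `. skill/lock.sh` and the two calls are the entire concurrency story a pipeline has to know about.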
Two tiers, one rule
Every lock in the orchestrator falls into one of two categories. The rule is that the shared-resource tier is acquired first.
Platform-browser locks
One lock per browser profile. reddit-browser, twitter-browser, linkedin-browser, moltbook-browser, github. Shared across every pipeline that touches that platform.
Pipeline locks
One lock per launchd job. link-edit-reddit, dm-outreach-reddit, engage-twitter, audit-linkedin, and so on. Prevents a pipeline from overlapping with its own previous run.
Stacking rule
Browser-profile lock FIRST, pipeline lock SECOND. _SA_LOCK_DIRS tracks them in order. A single EXIT trap releases the whole stack, even on SIGINT or SIGTERM.
Staleness detection
If /tmp/.../pid is missing, or kill -0 $pid fails, or the lock dir is older than 10,800 seconds, the contender wipes it and retries. No crashed run can wedge the whole fleet.
Skip, do not queue
If the lock is still held after the timeout (default 3600s), the contender exits 0. launchd will retry on the next scheduled tick. You never end up with a backlog of half-finished Chrome tabs.
No flock, no Redis
The whole thing is mkdir + a pid file. Works on stock macOS and stock Linux. No extra daemon, no Redlock edge cases, no lock server paging you at 2 AM.
Every lock in the fleet
These are the actual acquire_lock call sites, dedup-sorted out of skill/*.sh. Five platform-browser locks at the top, pipeline locks below. Every pipeline picks one from each tier, in that order.
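The list itself is not reproduced here, but an inventory like it can be regenerated from the sources with a one-liner (the skill/ path follows the layout described above):

```shell
# Collect every acquire_lock call site across the pipelines,
# deduplicated and sorted; prints nothing if no scripts exist yet.
grep -h 'acquire_lock ' skill/*.sh 2>/dev/null | sort -u
```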
A collision, played out in real time
launchd does not know about the lock. Two plists fire at the same clock minute and both pipelines start. Here is the log trail from a real collision in skill/logs/, rewritten for readability.
The same story as a sequence diagram
Two pipelines, one browser profile
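The diagram itself is not reproduced here; an illustrative reconstruction of the collision described above, with two of the Reddit pipelines standing in as the contenders:

```
link-edit-reddit              reddit-browser lock           dm-outreach-reddit
       |                              |                             |
       |--- mkdir lock dir --------＞ | held (pid written)          |
       |                              | ＜------ mkdir fails -------|
       |    drives Chrome            |         sleep 10s, retry    |
       |--- rm -rf (EXIT trap) ----＞ | free                        |
       |                              | ＜---- mkdir succeeds ------|
       |                              | held by dm-outreach-reddit  |
```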
What you need to copy, if you are building your own
The whole pattern is eight bullets. Everything else, including launchd plists, per-platform MCP configs, and the pipelines themselves, is plumbing around these.
Orchestrator-safety checklist
- A single run is started by launchd, not by a long-lived daemon
- Each run sources skill/lock.sh and calls acquire_lock at the top
- The first acquire_lock call is always the browser-profile lock
- The second acquire_lock call is the pipeline-specific lock
- A held lock is a directory containing a pid file written with echo $$
- A contending run polls every 10 seconds, up to its timeout, then exits 0
- A stale lock (dead pid, missing pid file, or 3h+ old) is wiped and retried
- Every lock held by this run is released by a single EXIT/INT/TERM/HUP trap
S4L vs. what you usually get
Most agent platforms handle concurrency by scaling horizontally and hoping the blast radius stays small. That works until the workers share state. Browser profiles share state.
| Feature | Typical agent framework | S4L orchestrator |
|---|---|---|
| Multiple agents running concurrently | Usually yes, often without a safety net | Yes, as long as they touch different browser profiles |
| Two agents hit the same Reddit session | Both launch Chrome against the same user-data-dir, one crashes | One holds the reddit-browser lock, the other sleeps 10s and retries |
| Crash recovery | Manual lock file cleanup, or wait for the daemon to restart | PID-based staleness + 3h mtime fallback, next run auto-cleans |
| Backlog behavior | Queue runs, backlog compounds, Chrome tabs pile up | Skip the tick, let launchd fire again later |
| Infrastructure required | Redis, Zookeeper, job queue, scheduler daemon | mkdir + pid file + launchd, zero external services |
| Lines to read to fully understand it | Thousands across runner, broker, worker pool | About 70 (skill/lock.sh) |
Frequently asked questions
Why do you need two levels of locks instead of one?
Two different pipelines can share a browser profile but not a task. On Reddit, engage.sh (replying to inbound comments), link-edit-reddit.sh (editing our top-performing comments to add a link), dm-outreach-reddit.sh (sending DMs), and scan-reddit-replies.sh (discovering new replies) all log in as the same account. They must take turns on the reddit-browser profile. But each pipeline also has to prevent overlap with its own previous run. The browser-profile lock handles cross-pipeline contention, the pipeline lock handles self-contention.
Why acquire the platform-browser lock BEFORE the pipeline lock?
If you acquire pipeline first and browser second, a pipeline commits to its own lock and then sits blocked on the shared browser lock, holding a lock it cannot use; meanwhile its own next scheduled run finds the pipeline lock held and skips for nothing. Acquiring the shared resource first means contention is resolved before anyone commits to a sub-lock. It is the same reason database transactions lock parent rows before child rows: a consistent acquisition order keeps waiters from holding resources while they block.
What happens if a run crashes while holding a lock?
Three fallbacks catch it. First, the shell trap on EXIT/INT/TERM/HUP removes every lock directory in _SA_LOCK_DIRS. Second, the next contender checks the pid file and runs kill -0; a dead pid means the lock is removed. Third, any lock directory older than 10,800 seconds (3 hours) is wiped regardless. Together these mean the orchestrator self-heals; no crashed run can wedge the whole fleet.
Why not just use flock?
flock is not installed by default on macOS, and the orchestrator has to run on both macOS and Linux. mkdir is atomic on every POSIX filesystem (exactly one contender's call succeeds), so it works as a polling lock primitive with no dependencies. The whole implementation is ~70 lines in skill/lock.sh. No extra package, no external service, no per-platform shim.
Why mkdir and not touch or a pid file alone?
mkdir fails atomically if the directory exists, which is exactly the primitive a lock needs. touch would succeed for every contender and race on the pid write. A pid file alone lets two contenders both write their pid and both think they own the lock. mkdir plus write-pid-inside gives you atomicity from the filesystem and a recoverable holder identity.
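The primitive is small enough to demonstrate directly. A sketch, with the lock path chosen purely for illustration:

```shell
dir=/tmp/sa-lock-atomic-demo
rm -rf "$dir"                   # start clean for the demo

# Exactly one contender's mkdir succeeds; every other gets EEXIST.
if mkdir "$dir" 2>/dev/null; then
  echo $$ > "$dir/pid"          # winner records a recoverable holder identity
  echo acquired                 # prints "acquired"
else
  echo "held by pid $(cat "$dir/pid" 2>/dev/null)"
fi
```

A `touch`-based variant has no winner: every contender's `touch` returns success, and the last pid write silently wins, which is exactly the race mkdir closes.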
What is the contention cost in practice?
Contenders sleep 10 seconds between retries. The default timeout is 3600 seconds (1 hour), so a contender will poll up to 360 times. In practice launchd stagger plus a median run length under 15 minutes means most contenders wait zero seconds. When two jobs collide, the later one typically waits one or two polls, then proceeds.
Can I use this pattern outside social media automation?
Yes. The pattern is 'shared resource first, pipeline second, stacked trap-cleaned locks, mkdir plus pid staleness detection.' It applies anywhere you have multiple cron jobs or agents that share a stateful resource: a browser profile, a serial port, an OAuth refresh token, a non-reentrant CLI. The 70 lines in skill/lock.sh are deliberately generic so you can source them as-is.
Want an orchestrator that already has this wired up?
S4L is open about how it runs. The lock stack, the launchd plists, and the per-platform MCP configs are the product. Point it at your projects, let it post, reply, and link-edit on its own cadence.
See S4L →