What This Example Shows
- A `ConcurrentWorkflow` of four specialists (Transcript Analyzer, Guidance Tracker, Q&A Sentiment, Material Disclosure Detector) fanning out in parallel, then merged by a Synthesizer agent
- Per-agent function tools in OpenAI schema (`tools_list_dictionary`): `score_tone`, `compare_guidance`, `detect_material_disclosure`, `extract_qa_red_flags`
- Mixed-provider model routing: Claude Sonnet 4.5 for analysis and synthesis, GPT-4.1 for guidance arithmetic, Claude Haiku 4.5 for cheap sentiment, Claude Opus 4.8 with `reasoning_effort: "high"` for the high-stakes disclosure check
- How to scale a single call across the full earnings-season firehose using `/v1/swarm/batch/completions`
- A cron pattern that fires at 5pm ET each day so the desk wakes up to a structured note on every transcript filed the day before
- Real per-call and per-day economics against a sell-side analyst at peak burn
Batch completions and observability dashboards are Premium-tier features. During the 3-week earnings burst a sell-side desk will push 500+ calls through this pipeline — that’s exactly the workload Premium rate limits, parallel batch execution, and per-agent token tracing exist for. See the Night-Mode Pricing Strategy guide for why running the batch off-peak (after the 4pm ET tape) is the right default for any scheduled earnings workload.
Why This Matters
A sell-side analyst on a 14-hour day during peak earnings season can read, parse, model, and write up maybe six to eight earnings calls before they fall over. A US large-cap sector can publish 125 prints on a single Thursday in late January or late April. The math has never worked. Every call after the eighth is either (a) skimmed from the press release and a Bloomberg headline, (b) covered the next morning when the tape has already moved, or (c) skipped entirely.
The job of this pipeline is not to replace the analyst’s read. It is to put a structured note on every call before the analyst sits down: tone score, guidance delta vs. last quarter, Q&A red flags, and a flag for anything that looks like a material disclosure. The human then spends their finite hours on the names that actually need their judgment.
Shortly after the 5pm ET batch completes, your DB has a structured note on every call (tone, guidance delta, red-flag Q&A) for about $50 a day.
The Architecture
Four specialists run in parallel against the same transcript. A Synthesizer agent merges their outputs into one structured note, which gets persisted to the research DB and pinged to Slack.
Step 1: Setup
Install dependencies and set your API key. The Slack webhook is whatever you’ve wired into `#earnings-firehose`.
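The original setup snippet is not preserved here, so the following is only a minimal sketch of the configuration step. The environment variable names (`SWARMS_API_KEY`, `SWARMS_BASE_URL`, `SLACK_WEBHOOK_URL`) and the `Settings` class are assumptions made for illustration; use whatever names your deployment already has.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container for the pipeline."""
    api_key: str
    base_url: str
    slack_webhook_url: str


# Assumed variable names; they are not taken from the API documentation.
REQUIRED_VARS = ("SWARMS_API_KEY", "SWARMS_BASE_URL", "SLACK_WEBHOOK_URL")


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read required settings and fail fast, naming every missing variable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        api_key=env["SWARMS_API_KEY"],
        # Strip a trailing slash so endpoint paths can be appended directly.
        base_url=env["SWARMS_BASE_URL"].rstrip("/"),
        slack_webhook_url=env["SLACK_WEBHOOK_URL"],
    )
```

Failing at startup with the full list of missing variables is cheaper than discovering a missing webhook after a 125-call batch has already run.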
Step 2: Define the Function Tools
Each specialist gets one tool, scoped per-agent in OpenAI function schema. The tools force structured output: tone scores, guidance deltas, disclosure verdicts, and Q&A red flags all come back as parseable JSON the Synthesizer (and your DB schema) can rely on.
Step 3: Define the Five Agents
Four specialists run in parallel, each on the model that gives the best cost/quality tradeoff for the job. The Material Disclosure Detector is the only one that gets Claude Opus 4.8 with `reasoning_effort: "high"`, because that is the agent whose mistakes are most expensive (a missed Reg FD disclosure on a name your fund holds is a compliance event). The Synthesizer runs Claude Sonnet 4.5 because it has to merge four structured outputs into a clean note without inventing anything.
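The routing described above could be written down roughly as follows. This is a sketch under assumptions: the agent field names (`agent_name`, `model_name`, `system_prompt`, `max_loops`) should be checked against the API reference, the model identifiers are deliberate placeholders rather than real model strings, and the system prompts are abbreviated stand-ins.

```python
from typing import Any

# Placeholder identifiers: substitute the exact model strings your account accepts.
SONNET = "<claude-sonnet-4.5-model-id>"
GPT = "<gpt-4.1-model-id>"
HAIKU = "<claude-haiku-4.5-model-id>"
OPUS = "<claude-opus-4.8-model-id>"


def make_agent(name: str, model: str, prompt: str, **extra: Any) -> dict[str, Any]:
    """Build one agent spec; `extra` carries per-agent options such as reasoning_effort."""
    return {
        "agent_name": name,
        "model_name": model,
        "system_prompt": prompt,
        "max_loops": 1,
        **extra,
    }


def build_agents() -> list[dict[str, Any]]:
    """Four specialists plus the Synthesizer, each routed to its model."""
    return [
        make_agent("Transcript Analyzer", SONNET, "Score management tone in the prepared remarks."),
        make_agent("Guidance Tracker", GPT, "Compare new guidance with last quarter's figures."),
        make_agent("Q&A Sentiment", HAIKU, "Extract red flags from the analyst Q&A."),
        # Only the disclosure check pays for high reasoning effort.
        make_agent(
            "Material Disclosure Detector",
            OPUS,
            "Flag statements that may be material disclosures, with verbatim quotes.",
            reasoning_effort="high",
        ),
        make_agent("Synthesizer", SONNET, "Merge the specialist outputs without adding facts."),
    ]
```

Keeping the model choice in one place makes it easy to re-route a single specialist when prices or quality change, without touching the prompts.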
Function tools force structured outputs at the source, not at the synthesizer. By the time the Synthesizer reads the four specialist outputs, the tone scores, guidance deltas, and disclosure list are already typed JSON — the Synthesizer is just merging, not parsing free-form text. This is what makes the JSON contract reliable enough to feed straight into a Postgres or Snowflake table.
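As an illustration of that contract, here is what one entry in `tools_list_dictionary` might look like in OpenAI function schema, together with a small check applied to the arguments that come back. The parameter names (`tone_score`, `supporting_quotes`) and the helper `parse_tool_arguments` are hypothetical; the original tool definitions are not preserved in this text.

```python
import json
from typing import Any

# Hypothetical schema for the score_tone tool; field names are illustrative only.
SCORE_TONE_TOOL: dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "score_tone",
        "description": "Record a tone score for the prepared remarks of an earnings call.",
        "parameters": {
            "type": "object",
            "properties": {
                "tone_score": {
                    "type": "number",
                    "description": "From -1 (very negative) to 1 (very positive).",
                },
                "supporting_quotes": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Verbatim quotes that justify the score.",
                },
            },
            "required": ["tone_score", "supporting_quotes"],
        },
    },
}


def parse_tool_arguments(tool: dict[str, Any], raw_arguments: str) -> dict[str, Any]:
    """Decode a tool call's JSON arguments and check that required fields are present."""
    args = json.loads(raw_arguments)
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    required = tool["function"]["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        raise ValueError(f"{tool['function']['name']} is missing fields: {missing}")
    return args
```

A check like this is cheap insurance before a row is written to the research DB: a malformed tool call fails loudly instead of landing as a half-empty record.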
Step 4: Process One Call End-to-End
Start with a single transcript. This is the loop you scale.
When a note carries a `REVIEW_NOW` flag on a portfolio name, you can trace it straight back to the verbatim quote the Material Disclosure Detector picked up.
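A single-call path might be sketched as below. Several details are assumptions rather than confirmed API facts: the single-swarm path `/v1/swarm/completions` is inferred from the batch path, the `x-api-key` header and the payload field names should be verified against the API reference, and `build_swarm_payload` and `run_swarm` are hypothetical helper names. The standard library is used for the HTTP call so the sketch has no third-party dependencies.

```python
import json
import urllib.request
from typing import Any


def build_swarm_payload(ticker: str, transcript: str, agents: list[dict[str, Any]]) -> dict[str, Any]:
    """Assemble one ConcurrentWorkflow request for a single transcript."""
    return {
        "name": f"earnings-note-{ticker}",
        "description": "Structured earnings-call note",
        "swarm_type": "ConcurrentWorkflow",
        "agents": agents,
        "max_loops": 1,
        "task": f"Analyze the {ticker} earnings call transcript below.\n\n{transcript}",
    }


def run_swarm(payload: dict[str, Any], base_url: str, api_key: str, timeout: float = 300.0) -> Any:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url=f"{base_url}/v1/swarm/completions",  # assumed path, see lead-in
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating payload construction from the network call keeps the payload testable offline, and the same builder can be reused unchanged when the workload moves to batch mode.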
Step 5: Earnings-Season Batch Mode
The single-call path is the prototype. The actual workload during peak earnings season is 125 calls a day. POST the entire list as one payload to `/v1/swarm/batch/completions`. The API executes the swarms in parallel and returns a list of results in input order.
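Because results come back in input order, the batch body and the result pairing can both be driven by one ordered mapping of ticker to transcript. The sketch below assumes the request body is a plain JSON list of swarm specs, as the text describes; the spec field names and the helpers `build_batch_body` and `pair_results` are illustrative, not part of the API.

```python
from typing import Any


def build_batch_body(transcripts: dict[str, str], agents: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """One swarm spec per transcript, in the insertion order of `transcripts`."""
    return [
        {
            "name": f"earnings-note-{ticker}",
            "swarm_type": "ConcurrentWorkflow",
            "agents": agents,
            "max_loops": 1,
            "task": f"Analyze the {ticker} earnings call transcript below.\n\n{text}",
        }
        for ticker, text in transcripts.items()
    ]


def pair_results(transcripts: dict[str, str], results: list[Any]) -> dict[str, Any]:
    """Map each ticker to its result, relying on results arriving in input order."""
    tickers = list(transcripts)
    if len(results) != len(tickers):
        # A length mismatch means positional pairing can no longer be trusted.
        raise ValueError(f"expected {len(tickers)} results, got {len(results)}")
    return dict(zip(tickers, results))
```

The length check matters: if one swarm is dropped from the response, every later note would otherwise be silently attached to the wrong ticker.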
Most earnings calls during peak season land between 4:30pm and 5:00pm ET, immediately after the tape. Firing at 5:00pm ET catches the tail of late prints and gives the batch a clean window before market open the next morning. Premium-tier rate limits matter here: 125 ConcurrentWorkflows fired in a single POST consume meaningful concurrency, and queue contention on a free tier turns a 6-minute batch into a 90-minute one. The Night-Mode Pricing Strategy guide covers the throughput tradeoffs in depth.
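The original cron entry is not preserved in this text. One way to get the same 5:00pm ET trigger from a long-running Python process is to compute the next run time in the `America/New_York` zone, which keeps the schedule correct across daylight-saving changes. The function name `next_run` is hypothetical, and the sketch fires every calendar day, as the text describes.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")
RUN_AT = time(17, 0)  # 5:00pm ET, after the bulk of the day's prints


def next_run(now: datetime) -> datetime:
    """Return the next 5:00pm ET strictly after `now` (which must be timezone-aware)."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    local = now.astimezone(EASTERN)
    candidate = datetime.combine(local.date(), RUN_AT, tzinfo=EASTERN)
    if candidate <= local:
        # Today's slot has passed, so schedule tomorrow's.
        candidate = datetime.combine(local.date() + timedelta(days=1), RUN_AT, tzinfo=EASTERN)
    return candidate
```

A scheduler loop can sleep until `next_run(datetime.now(EASTERN))` and then submit the batch. If weekend runs are unwanted, skipping Saturday and Sunday is a small extension.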
Real Cost vs. Junior Analyst
Per-call and per-day economics. The per-call number is the median across mixed-provider routing: Haiku on Q&A is the lever that keeps the average down even when Opus 4.8 with reasoning runs on every disclosure check.
| Scenario | Per call | Per day (125 calls) | Per 3-week earnings burst |
|---|---|---|---|
| This swarm (4 specialists + synthesizer, mixed providers) | ~$0.40 | ~$50 | ~$750 |
| One sell-side analyst, fully loaded ($300/hr blended, 1hr per call) | ~$300 | ~$2,400 (8 calls) | ~$36,000 (8/day cap) |
| Three-analyst sector pod | — | ~$7,200 (24 calls) | ~$108,000 (24/day cap) |
| Full coverage by humans | not possible | ~$37,500 of team time | ~$562,500 |
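The swarm, single-analyst, and full-coverage rows follow from simple multiplication, sketched below. The one assumption not stated in the table is that the 3-week burst contains 15 trading days (3 weeks of 5 days), which is the figure that makes the per-day and per-burst columns agree. The function name `burst_cost` is illustrative.

```python
CALLS_PER_DAY = 125
TRADING_DAYS = 15            # assumption: 3 weeks x 5 trading days
SWARM_COST_PER_CALL = 0.40   # dollars, from the table
ANALYST_COST_PER_CALL = 300  # dollars: $300/hr blended, 1 hour per call
ANALYST_DAILY_CAP = 8        # calls one analyst can cover in a day


def burst_cost(cost_per_call: float, calls_per_day: int, days: int = TRADING_DAYS) -> tuple[float, float]:
    """Return (cost per day, cost per earnings burst) in dollars."""
    per_day = cost_per_call * calls_per_day
    return per_day, per_day * days


swarm_day, swarm_burst = burst_cost(SWARM_COST_PER_CALL, CALLS_PER_DAY)
analyst_day, analyst_burst = burst_cost(ANALYST_COST_PER_CALL, ANALYST_DAILY_CAP)
human_day, human_burst = burst_cost(ANALYST_COST_PER_CALL, CALLS_PER_DAY)

print(f"swarm:         ${swarm_day:,.0f}/day, ${swarm_burst:,.0f}/burst")
print(f"one analyst:   ${analyst_day:,.0f}/day, ${analyst_burst:,.0f}/burst")
print(f"full coverage: ${human_day:,.0f}/day, ${human_burst:,.0f}/burst")
```

Changing `SWARM_COST_PER_CALL` or `CALLS_PER_DAY` is a quick way to see how sensitive the comparison is to a heavier model mix or a busier print day.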
Next Steps
- Build an AI Hedge Fund Research Pipeline — the HierarchicalSwarm variant when you need a director synthesizing analyst briefs instead of fanning out in parallel
- Sell-Side Research Pipeline — composing earnings notes, cited research, and reasoning agents into a single end-of-day deliverable (planned)
- Cost Optimization Playbook — model routing, token budgeting, and batch scheduling patterns that keep the per-call number under a dollar at scale