What This Example Shows
- A `SequentialWorkflow` that pipes a single EDGAR accession number through classification, section extraction, diffing, scoring, and memo generation
- Five OpenAI-format function tools wired into the agents: EDGAR fetch, MD&A parser, Risk Factors extractor, section differ, and a materiality scorer
- Real EDGAR ingestion including the SEC-mandated `User-Agent` header (no scraping, no hand-built parsers downstream)
- A diff-against-prior core: the pipeline only surfaces what changed versus the previous comparable filing (10-Q vs. prior 10-Q, 10-K vs. prior 10-K, 8-K vs. nothing)
- A materiality score (0-100) attached to every delta so analysts read top-of-stack, not chronologically
- An overnight batch run of 30-50 filings per portfolio per day against `/v1/swarm/batch/completions`, the volume that forces Pro → Premium
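Two of the ideas in the list above, choosing the prior comparable filing and reading deltas top-of-stack, can be sketched in a few lines. This is an illustrative sketch only: the `Filing` and `Delta` types and both function names are hypothetical and are not part of the original example.

```python
from dataclasses import dataclass
from typing import Optional

# Form types that are diffed against the previous filing of the same type.
# 8-Ks are diffed against nothing, so they are deliberately absent.
DIFFABLE_FORMS = {"10-Q", "10-K"}


@dataclass
class Filing:
    accession: str   # EDGAR accession number
    form_type: str   # e.g. "10-Q", "10-K", "8-K"
    filed: str       # ISO date string, so string order matches date order


@dataclass
class Delta:
    section: str
    summary: str
    materiality: int  # 0-100, as attached by the scoring stage


def prior_comparable(current: Filing, history: list[Filing]) -> Optional[Filing]:
    """Return the most recent earlier filing of the same form type, or None."""
    if current.form_type not in DIFFABLE_FORMS:
        return None
    candidates = [
        f for f in history
        if f.form_type == current.form_type and f.filed < current.filed
    ]
    return max(candidates, key=lambda f: f.filed, default=None)


def top_of_stack(deltas: list[Delta], limit: int = 5) -> list[Delta]:
    """Order deltas by materiality, highest first, and keep the top few."""
    return sorted(deltas, key=lambda d: d.materiality, reverse=True)[:limit]
```

The `limit` default of 5 is arbitrary; pick whatever an analyst can realistically read in the time budget.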
The `/v1/swarm/batch/completions` endpoint used in Step 5 is a premium feature. A live portfolio firehose at 30-50 filings/day per book quickly saturates Pro rate limits. Upgrade to Premium for the parallel execution and observability (per-filing cost, per-agent token counts, structured run logs) you need to actually trust this in production. Manage your plan at https://swarms.world/platform/account.
Why This Matters
Analysts do not read 10-Qs; they skim them for what changed. A 90-page Q3 filing is 87 pages of boilerplate, copy-pasted disclaimers, and last quarter’s text, plus 3 pages of new language buried somewhere in MD&A, Risk Factors, or the footnotes that actually moves the thesis. The job of this pipeline is not to summarize filings; it is to throw away the 87 pages of unchanged text, isolate the new language, score how thesis-relevant the delta is, and put the top items in front of a human in 90 seconds. A four-person credit desk covering 200 issuers cannot read every 10-Q the day it drops. This pipeline can, and it costs less than one analyst-hour per day to run the whole book.
The Architecture
Step 1: Setup
.env file. The SEC requires a descriptive `User-Agent` on every EDGAR request (see SEC EDGAR access rules); set one that identifies your firm and includes a reachable email:
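A minimal sketch of reading that value and refusing to run without it. The variable name `SEC_USER_AGENT` and the helper `edgar_headers` are assumptions made for illustration, since the original listing is not reproduced here; load the `.env` file into the environment however you prefer before calling it.

```python
import os
import re

# Assumed environment variable name (hypothetical, not from the original listing).
USER_AGENT_ENV = "SEC_USER_AGENT"

# Loose check that the string contains something shaped like an email address.
_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def edgar_headers(env=None) -> dict:
    """Build EDGAR request headers, rejecting a missing or anonymous User-Agent."""
    env = os.environ if env is None else env
    user_agent = env.get(USER_AGENT_ENV, "").strip()
    if not user_agent:
        raise RuntimeError(f"{USER_AGENT_ENV} is not set")
    if not _EMAIL.search(user_agent):
        raise RuntimeError(f"{USER_AGENT_ENV} should include a contact email")
    return {"User-Agent": user_agent}
```

For example, a value such as `Example Capital research@example.com` passes the check, while an empty or email-free string fails fast instead of sending an anonymous request.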
Step 2: Define the Function Tools
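To make "OpenAI-format function tool" concrete, here is a sketch of what one of the tools, the section differ, might look like: a JSON schema describing the call plus a plain Python function behind it. The tool name `diff_sections`, its parameters, and the paragraph-level diffing via the standard library's `difflib` are all assumptions for illustration, not the original implementation.

```python
import difflib

# OpenAI-format function tool schema. Name and parameters are hypothetical.
SECTION_DIFF_TOOL = {
    "type": "function",
    "function": {
        "name": "diff_sections",
        "description": "Return paragraphs that are new or changed versus the prior filing.",
        "parameters": {
            "type": "object",
            "properties": {
                "current_text": {"type": "string", "description": "Section text from the new filing."},
                "prior_text": {"type": "string", "description": "Same section from the prior comparable filing."},
            },
            "required": ["current_text", "prior_text"],
        },
    },
}


def _paragraphs(text: str) -> list[str]:
    """Split on blank lines and collapse whitespace inside each paragraph."""
    return [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]


def diff_sections(current_text: str, prior_text: str) -> list[str]:
    """Return current-filing paragraphs that were inserted or replaced."""
    prior, current = _paragraphs(prior_text), _paragraphs(current_text)
    matcher = difflib.SequenceMatcher(a=prior, b=current, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            changed.extend(current[j1:j2])
    return changed
```

Diffing whole paragraphs rather than characters keeps unchanged boilerplate out of the result entirely, which is the point of the diff-against-prior design; a production differ would likely also need fuzzy matching for lightly reworded paragraphs.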
Five OpenAI-format function tools. The agents decide when to call them; the swarm runtime carries arguments and return values between stages.
Step 3: Define the Pipeline Agents
Five agents in a `SequentialWorkflow`. Each stage builds on the prior stage’s output. Models are deliberately diversified: a cheap classifier on the front, two Claude models doing the deep legal-language work in the middle, and Sonnet writing the final memo.
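The shape of such a five-stage definition can be sketched as a plain request payload. Everything specific here is an assumption: the field names (`agent_name`, `model_name`, `system_prompt`, `swarm_type`, and so on) reflect my understanding of the Swarms API swarm spec and should be checked against the current reference, the model identifiers are placeholders, the assignment of models to the middle stages is a guess, and only "Diff Engine" and "Memo Writer" are agent names taken from this page.

```python
# (agent_name, placeholder model id, one-line system prompt)
STAGES = [
    ("Filing Classifier", "cheap-model-placeholder",
     "Classify the filing's form type and identify the prior comparable filing."),
    ("Section Extractor", "claude-model-placeholder-1",
     "Extract the MD&A and Risk Factors sections from the filing."),
    ("Diff Engine", "claude-model-placeholder-2",
     "Report only language that changed versus the prior comparable filing."),
    ("Materiality Scorer", "claude-model-placeholder-2",
     "Score each delta from 0 to 100 for thesis relevance."),
    ("Memo Writer", "sonnet-model-placeholder",
     "Write a memo that lists the highest-scored deltas first."),
]


def build_swarm_spec(accession: str) -> dict:
    """Build one sequential-workflow payload for a single accession number."""
    agents = [
        {
            "agent_name": name,
            "model_name": model,
            "system_prompt": prompt,
            "max_loops": 1,
        }
        for name, model, prompt in STAGES
    ]
    return {
        "name": f"filing-triage-{accession}",
        "swarm_type": "SequentialWorkflow",
        "agents": agents,  # list order is the execution order
        "max_loops": 1,
        "task": f"Triage the EDGAR filing with accession number {accession}.",
    }
```

Keeping the stage list as data makes it easy to swap a model for one stage without touching the rest of the pipeline.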
Step 4: Process One Filing End-to-End
Single accession number in, materiality-scored memo out. This is the unit you batch in Step 5.
Persist only the Memo Writer’s output to the research DB. The four upstream stages are the audit trail: when the PM asks “why is this scored 78?”, you can walk them back through the Diff Engine output that produced the score.
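One way to keep that separation explicit is sketched below. It assumes the run result can be reduced to a list of `{"agent_name", "content"}` dictionaries, one per stage; that shape, the table name, and both function names are hypothetical, and SQLite stands in for whatever the research DB actually is.

```python
import json
import sqlite3

MEMO_AGENT = "Memo Writer"


def split_run(stage_outputs: list[dict]) -> tuple[str, str]:
    """Return (memo text, JSON-encoded audit trail of the upstream stages)."""
    memo = next(s["content"] for s in stage_outputs if s["agent_name"] == MEMO_AGENT)
    upstream = [s for s in stage_outputs if s["agent_name"] != MEMO_AGENT]
    return memo, json.dumps(upstream)


def save_memo(db: sqlite3.Connection, accession: str, memo: str) -> None:
    """Write only the memo to the research DB, keyed by accession number."""
    db.execute("CREATE TABLE IF NOT EXISTS memos (accession TEXT PRIMARY KEY, memo TEXT)")
    db.execute("INSERT OR REPLACE INTO memos VALUES (?, ?)", (accession, memo))
    db.commit()
```

The audit-trail string returned by `split_run` would go to log storage rather than the research DB, so it is still there when someone asks how a score was produced.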
Step 5: Wire Up the EDGAR Firehose with Batch
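As a rough sketch of what this step amounts to, the day's queue of accession numbers becomes a list of swarm specs that is posted once to `/v1/swarm/batch/completions`. The base URL, the `x-api-key` header, the payload field names, and the assumption that the endpoint accepts a JSON list of specs are all my assumptions rather than details confirmed on this page; `build_batch` and `submit_batch` are hypothetical names.

```python
import json
import urllib.request

BASE_URL = "https://api.swarms.world"  # assumed API host
BATCH_PATH = "/v1/swarm/batch/completions"


def build_batch(accessions: list[str], agents: list[dict]) -> list[dict]:
    """One swarm spec per filing; duplicates in the queue are dropped, order is kept."""
    return [
        {
            "name": f"filing-triage-{accession}",
            "swarm_type": "SequentialWorkflow",
            "agents": agents,
            "task": f"Triage the EDGAR filing with accession number {accession}.",
        }
        for accession in dict.fromkeys(accessions)
    ]


def submit_batch(batch: list[dict], api_key: str, base_url: str = BASE_URL):
    """POST the whole batch in a single request and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url + BATCH_PATH,
        data=json.dumps(batch).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.load(response)
```

De-duplicating before submission matters for an overnight run, because a poller that sees the same accession twice would otherwise pay for the same filing twice.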
A real portfolio sees 30-50 new filings per day across its names, and earnings season pushes that to 80+. Polling EDGAR every 15 minutes and triggering one swarm per filing would saturate Pro tier rate limits by mid-morning. Batch the day’s queue in a single call.
Real Cost vs. Analyst Reading Time
A buy-side analyst at a fully loaded $300K/year costs roughly $150/hour. A careful read of a 10-Q with a memo write-up is 15-20 minutes; call it $45 of analyst time per filing, and that is the analyst who already covers the name. For 8-Ks across the book that the dedicated analyst does not read, the alternative is “nothing gets read”, which is the actual failure mode this pipeline addresses.

| Scenario | Pipeline cost | Analyst-time cost |
|---|---|---|
| One filing (10-Q, ~5 sections diffed) | ~$0.30 | ~$45 |
| Daily run (40 filings across the book) | ~$12 | ~$1,800 |
| Annualized (250 trading days) | ~$3,000 | ~$450,000 |
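The table rows follow directly from the per-filing figures, as this short calculation shows. The per-filing costs, the 40 filings per day, and the 250 trading days are the numbers quoted above; the 2,000 working hours per year behind the $150/hour rate is my assumption.

```python
HOURLY_RATE = 300_000 / 2_000        # $150/hour, assuming ~2,000 working hours per year
PIPELINE_COST_PER_FILING = 0.30      # approximate, from the table
ANALYST_COST_PER_FILING = 45.0       # "call it $45": 18 minutes at $150/hour
FILINGS_PER_DAY = 40
TRADING_DAYS = 250


def cost_table() -> dict:
    """Return {scenario: (pipeline cost, analyst-time cost)} in dollars."""
    scenarios = (
        ("one filing", 1),
        ("daily run", FILINGS_PER_DAY),
        ("annualized", FILINGS_PER_DAY * TRADING_DAYS),
    )
    return {
        label: (
            round(PIPELINE_COST_PER_FILING * count, 2),
            round(ANALYST_COST_PER_FILING * count, 2),
        )
        for label, count in scenarios
    }
```

On these inputs the analyst-time column is 150 times the pipeline column in every row, so the conclusion is not sensitive to the exact filing count.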
Next Steps
- Pair this with the M&A Due Diligence Swarm when a triage memo flags a strategic-review or transaction disclosure
- Feed high-materiality memos into the Sell-Side Research Pipeline for full-length write-ups
- Read the Cost Optimization Playbook for tuning model selection per stage once the pipeline is in production