What This Example Shows
- A `GraphWorkflow` shaped as fan-out → fan-in: thousands of per-fund parser nodes run in parallel, then converge into a cross-fund aggregator and a Theme Synthesizer
- How to parse raw SEC EDGAR 13F-HR filings (CIK + accession number) into structured holdings, then diff against the prior quarter
- The batch volume math for the quarterly burst: ~5,000 filings dropping in 48 hours, dispatched via `/v1/swarm/batch/completions` and priced under night-mode
- Model diversification across the DAG: `gpt-4.1-mini` for mechanical parsing, `gpt-4.1` for aggregation, `claude-sonnet-4.5` for the synthesis layer
- Journalist-grade output: ranked sector adds/cuts, new positions crossing $100M of cumulative flow, eliminated positions, and named emerging themes with the funds driving them
- A reusable pattern for any high-volume regulatory-filing burst (Form 4, N-PORT, NPX)
This tutorial leans hard on two premium features: the batch swarm endpoint (`/v1/swarm/batch/completions`) to dispatch the burst, and night-mode pricing to discount it. 13Fs become public after-hours on the 45-day lag, so fire the batch between 11pm and 5am Pacific and you pay the night-mode rate. Read the Night-Mode Pricing Strategy guide for the schedule.
Why This Matters
Form 13F-HR filings publish on a 45-day lag: every institutional manager with over $100M AUM has to disclose its long US-equity positions within 45 days of quarter-end, and almost all of them file in the same 48-hour window right at the deadline. Roughly 5,000 filings hit EDGAR essentially simultaneously, and that window is when the entire allocator and newsroom ecosystem scrambles: Bloomberg, the FT, Institutional Investor, fund-of-funds managers, sell-side strategists, and signal traders are all racing to surface “what did the smart money buy” before the next morning’s open. The traditional shape of that work is a junior analyst manually pulling a handful of “name brand” 13Fs (Pershing Square, Tiger, Coatue, Berkshire) and writing one-off blurbs, which leaves 4,990 filings unread and the cross-fund themes invisible. A `GraphWorkflow` turns the firehose into a structured story automatically: every filing parsed, every position diffed, every theme ranked by cumulative dollar flow across all reporting managers. By 7am Pacific on day 2 of the window, you have a brief that took a newsroom team two weeks the old way.
The Architecture
Step 1: Setup
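The setup listing for this step is not present in this copy. As a stand-in, here is a minimal sketch of the configuration a client would need. The environment variable names, the placeholder base URL, and the `x-api-key` header are assumptions for illustration; only the endpoint path comes from this tutorial.

```python
import os

# Assumed values: replace the base URL and header name with the ones
# your provider documents. Only the endpoint path is from this tutorial.
BASE_URL = os.environ.get("SWARM_API_BASE_URL", "https://api.example.com")
BATCH_ENDPOINT = "/v1/swarm/batch/completions"


def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers sent with every request."""
    if not api_key:
        raise ValueError("API key is empty; set SWARM_API_KEY before running")
    return {"x-api-key": api_key, "Content-Type": "application/json"}


def batch_url(base_url: str = BASE_URL) -> str:
    """Join the base URL and the batch endpoint without doubling slashes."""
    return base_url.rstrip("/") + BATCH_ENDPOINT
```

Reading the key with `os.environ.get("SWARM_API_KEY", "")` and passing it to `build_headers` fails fast when the key is missing, which is cheaper than discovering it partway through a burst.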
Step 2: Define the Function Tools
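The tool listing for this step is not present in this copy. As a rough illustration, the sketch below declares one tool in the OpenAI function-tool shape (a `type` of `function` plus a JSON Schema for the parameters) and resolves a call locally. The tool name `fetch_13f_filing`, its parameters, and the stub body are hypothetical.

```python
import json

# Hypothetical tool: the name and parameters are illustrative only.
FETCH_13F_TOOL = {
    "type": "function",
    "function": {
        "name": "fetch_13f_filing",
        "description": "Fetch a 13F-HR filing from SEC EDGAR.",
        "parameters": {
            "type": "object",
            "properties": {
                "cik": {"type": "string", "description": "Filer CIK"},
                "accession_number": {
                    "type": "string",
                    "description": "EDGAR accession number",
                },
            },
            "required": ["cik", "accession_number"],
        },
    },
}


def fetch_13f_filing(cik: str, accession_number: str) -> dict:
    """Stub resolver; a real one would download the filing from EDGAR."""
    return {"cik": cik, "accession_number": accession_number, "status": "stub"}


# Maps tool names to the local callables that resolve them.
TOOL_REGISTRY = {"fetch_13f_filing": fetch_13f_filing}


def resolve_tool_call(name: str, arguments_json: str) -> str:
    """Run the local function for a model-issued call; return JSON text."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return json.dumps(TOOL_REGISTRY[name](**kwargs))
```

The model only ever sees the declaration; the registry keeps the mapping from tool name to local code in one place, so adding a tool means one declaration and one registry entry.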
Every tool the parser and aggregator nodes use is declared as an OpenAI-style function tool. The model calls them; the runtime resolves them on your side.
Step 3: Define the Graph Workflow Nodes
The shape is fan-out → fan-in. Each per-fund parser is a node that fetches the filing, parses the holdings table, looks up the fund profile, and diffs against the prior quarter. Multiple parsers run in parallel inside one `GraphWorkflow` when a filing covers multiple sub-funds under one umbrella manager. They converge into a Cross-Fund Aggregator, which feeds the Theme Synthesizer. Models are diversified across the DAG by the cognitive load of each node:
- Per-fund parsers → `gpt-4.1-mini`: cheap, fast, mechanical. Most of the work is tool calling against structured XML.
- Cross-fund aggregator → `gpt-4.1`: needs to reason over many per-fund summaries and group them coherently.
- Theme synthesizer → `claude-sonnet-4.5`: produces the journalist-grade narrative the brief is built around.
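The fan-out → fan-in shape and the per-node model choice can be written down as plain data before any workflow library is involved. The sketch below is not the `GraphWorkflow` API; `NodeSpec`, `build_graph`, and `execution_stages` are hypothetical names, and the standard library's `graphlib` stands in for the scheduler to show which nodes can run in parallel.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class NodeSpec:
    name: str
    model: str
    role: str


def build_graph(parser_names: list[str]):
    """Return node specs plus a map of node -> upstream dependencies."""
    nodes = {
        n: NodeSpec(n, "gpt-4.1-mini", "per-fund parser") for n in parser_names
    }
    nodes["CrossFundAggregator"] = NodeSpec(
        "CrossFundAggregator", "gpt-4.1", "aggregator"
    )
    nodes["ThemeSynthesizer"] = NodeSpec(
        "ThemeSynthesizer", "claude-sonnet-4.5", "synthesizer"
    )
    deps: dict[str, set[str]] = {n: set() for n in parser_names}
    deps["CrossFundAggregator"] = set(parser_names)  # fan-in from every parser
    deps["ThemeSynthesizer"] = {"CrossFundAggregator"}
    return nodes, deps


def execution_stages(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into stages; nodes within a stage have no mutual deps."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages
```

With two parsers, `execution_stages` yields three stages: both parsers together, then the aggregator, then the synthesizer. Adding parsers widens the first stage without changing the other two.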
Step 4: Run One Fund’s 13F
Smoke-test the shape against a single filing before you fire the burst. Pershing Square’s umbrella files include the main fund plus the SPARC vehicle, which makes for a clean multi-parser fan-in.
The ThemeSynthesizer output for a single filing is just that one manager’s contribution to the cross-fund picture. The real value emerges when you batch it across the full 5,000-fund universe in Step 5. At that point the aggregator sees the entire flow and the themes pop.
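Before involving any model, the parse-and-diff logic a per-fund parser relies on can be smoke-tested offline. The sketch below reads a small inline sample shaped like a 13F information table and diffs it against a prior quarter. The sample rows, CUSIPs, and function names are made up, and the element names follow the EDGAR information-table layout as I understand it, so check them against a real filing.

```python
import xml.etree.ElementTree as ET

# Fictitious sample; real filings carry more fields per row.
SAMPLE_INFOTABLE = """<informationTable xmlns="http://www.sec.gov/edgar/document/thirteenf/informationtable">
  <infoTable>
    <nameOfIssuer>EXAMPLE CORP</nameOfIssuer>
    <cusip>000000001</cusip>
    <value>150000000</value>
  </infoTable>
  <infoTable>
    <nameOfIssuer>SAMPLE INC</nameOfIssuer>
    <cusip>000000002</cusip>
    <value>40000000</value>
  </infoTable>
</informationTable>"""


def local(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def parse_holdings(xml_text: str) -> dict[str, int]:
    """Return {cusip: reported value}, summing rows that share a CUSIP."""
    holdings: dict[str, int] = {}
    for row in ET.fromstring(xml_text).iter():
        if local(row.tag) != "infoTable":
            continue
        fields = {local(c.tag): (c.text or "").strip() for c in row}
        cusip = fields["cusip"]
        holdings[cusip] = holdings.get(cusip, 0) + int(fields["value"])
    return holdings


def diff_holdings(prior: dict[str, int], current: dict[str, int]) -> dict:
    """Classify each CUSIP as new, eliminated, added, or cut."""
    result = {"new": {}, "eliminated": {}, "added": {}, "cut": {}}
    for cusip in prior.keys() | current.keys():
        before, after = prior.get(cusip, 0), current.get(cusip, 0)
        change = after - before
        if before == 0 and after > 0:
            result["new"][cusip] = change
        elif before > 0 and after == 0:
            result["eliminated"][cusip] = change
        elif change > 0:
            result["added"][cusip] = change
        elif change < 0:
            result["cut"][cusip] = change
    return result
```

Diffing on reported value mixes price moves with real buying and selling. Comparing share counts instead gives a cleaner read on what the manager actually did.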
Step 5: The Quarterly Burst — 5,000 Funds in One Night
Production shape: you have a feed (EDGAR’s RSS or a vendor like SEC-API.io) emitting accession numbers as they hit the system. You buffer them across the 48-hour deadline window, then fire one batch at 11pm Pacific on day 1 of the burst.
The per-filing workflow (parsers on `gpt-4.1-mini`, one aggregator on `gpt-4.1`, one synthesizer on `claude-sonnet-4.5`) comes out to roughly $0.04 per filing under night-mode pricing. Across the full quarterly burst:
| Volume | Per-filing cost (night-mode) | Total |
|---|---|---|
| 5,000 13F-HR filings | ~$0.04 | ~$200 |
| With the night-mode 50% discount window | ~$0.02 | ~$100 |
The burst must land inside the night-mode window (11pm–5am Pacific) to hit the headline pricing. Read the Night-Mode Pricing Strategy guide for the schedule mechanics and how to chunk a larger batch across multiple windows if you exceed the rate ceiling.
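The scheduling and cost arithmetic above can be checked with a few lines of code. This sketch tests whether a timestamp falls in the 11pm–5am Pacific window, reproduces the two rows of the cost table, and splits a buffered list of accession numbers into chunks. The function names and the chunk size are assumptions; the real rate ceiling is whatever the pricing guide specifies.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def in_night_window(moment: datetime) -> bool:
    """True when a timezone-aware moment is between 11pm and 5am Pacific."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    hour = moment.astimezone(PACIFIC).hour
    return hour >= 23 or hour < 5


def burst_cost(filings: int, per_filing_usd: float, discount: float = 0.0) -> float:
    """Total cost in USD for a burst, after an optional fractional discount."""
    return round(filings * per_filing_usd * (1.0 - discount), 2)


def chunk(items: list[str], size: int) -> list[list[str]]:
    """Split a buffered list into consecutive chunks of at most `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]
```

For example, `burst_cost(5000, 0.04)` gives the $200 row and `burst_cost(5000, 0.04, discount=0.5)` gives the $100 row. Four quarterly bursts at $100 each come to about $400 a year.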
Step 6: The Theme Synthesizer Output
A representative ThemeSynthesizer output for the cross-fund aggregate. This is the artifact your DB, Slack channel, and morning newsletter pull from.
The brief is posted to Slack through the `post_theme_brief_to_slack` tool. The structured per-manager output also writes to a Postgres table so the desk can query “every fund that added NVDA this quarter, ranked by dollar size.”
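The ranking behind the brief, cumulative dollar flow per name with the funds driving it, is simple enough to prototype in plain Python before wiring it to the aggregator or a database. The sketch below ranks newly opened positions whose combined flow reaches $100M. The function name, the input shape, and every fund and ticker in the usage note are made up for illustration.

```python
from collections import defaultdict

# The $100M cumulative-flow bar mentioned at the top of this tutorial.
NEW_POSITION_THRESHOLD_USD = 100_000_000


def rank_new_positions(
    fund_new_positions: dict[str, dict[str, int]],
    threshold: int = NEW_POSITION_THRESHOLD_USD,
) -> list[dict]:
    """Rank tickers by cumulative new-position dollars across funds.

    Input maps fund name -> {ticker: dollars in a newly opened position}.
    Tickers whose cumulative flow is below the threshold are dropped.
    """
    flow: dict[str, int] = defaultdict(int)
    drivers: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for fund, positions in fund_new_positions.items():
        for ticker, usd in positions.items():
            flow[ticker] += usd
            drivers[ticker].append((fund, usd))

    ranked = []
    for ticker, total in flow.items():
        if total < threshold:
            continue
        # Largest contributors first, so the brief can name the lead funds.
        by_size = sorted(drivers[ticker], key=lambda p: p[1], reverse=True)
        ranked.append(
            {
                "ticker": ticker,
                "cumulative_flow_usd": total,
                "funds": [fund for fund, _ in by_size],
            }
        )
    ranked.sort(key=lambda r: r["cumulative_flow_usd"], reverse=True)
    return ranked
```

Given three fictitious funds where "Fund A" opens $80M of AAA, "Fund B" opens $60M of AAA, and "Fund C" opens $120M of CCC, the result lists AAA first at $140M (driven by Fund A, then Fund B), followed by CCC at $120M. Neither AAA position clears the bar alone, which is the cross-fund signal a single-filing read misses.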
Real Cost vs. Newsroom 13F Team
| Approach | Wall time | Cost per quarter | Annualized |
|---|---|---|---|
| Junior research analyst manually pulling ~50 “name brand” 13Fs | ~2 weeks of nights | — | ~$120,000 fully loaded |
| Senior markets editor reviewing and writing the brief under deadline | ~3 days each quarter | — | ~$200,000 fully loaded |
| Combined 2-person desk (junior + senior) covering ~50 funds | ~2 weeks | ~$10,000–$20,000 in burdened labor | ~$320,000+ |
| GraphWorkflow batch burst, 5,000 funds, night-mode | ~1 hour server-side | ~$100 | ~$400 |
Next Steps
- SEC Filing Triage Pipeline for the same fan-out pattern applied to 8-K, S-1, and 10-K firehoses
- Insider Form 4 Monitor for the daily-burst variant: insider transactions instead of quarterly positions
- Graph Workflows for Production Pipelines for the DAG-shape reference and conditional gating patterns
- Night-Mode Pricing Strategy to schedule the burst against the discount window