What This Covers
- How the four observability signals fit together: per-request logs, daily usage rollup, credit balance, and live rate-limit headers
- A small Python script that pulls the last 24 hours of activity, aggregates by swarm_type, and prints a cost-and-error breakdown
- The audit-trail story for enterprise, healthcare, finance, and other regulated workloads
- Which signal answers which operational question — and which one you should be pulling first
Why This Matters
Most teams discover their observability gap the day a customer asks “what did the agent see on Tuesday at 2pm and how much did it cost us?” The Swarms API exposes everything you need to answer that, but the data is split across three endpoints and a set of response headers, each with a different shape, time grain, and refresh cadence. This guide is the operator narrative on top of the Swarm Logs, Usage Report, and Account Credits reference pages: which signal to use when, how to combine them, and the minimum production-grade script for a daily dashboard.
The Four Signals
| Signal | Endpoint / Source | Time grain | Answers |
|---|---|---|---|
| Per-request logs | GET /v1/swarm/logs | Per request | “What ran? Did it succeed? How much did it cost?” |
| Daily usage rollup | GET /v1/usage/report | Per day | “What’s our trend? Are we forecasting under budget?” |
| Credit balance | GET /v1/account/credits | Snapshot | “Do we have headroom for the next batch job?” |
| Rate-limit headers | X-RateLimit-* on every response | Per request | “Are we about to get 429’d? Do we need to back off?” |
Step 1: Configure the Client
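A minimal client sketch using only the Python standard library. The base URL, the `x-api-key` header name, and the `SWARMS_API_KEY` / `SWARMS_BASE_URL` environment variables are assumptions that this guide does not state; confirm them against the API reference before relying on them.

```python
import json
import os
import urllib.request

# Assumptions (not stated in this guide): base URL, auth header name, env var names.
BASE_URL = os.environ.get("SWARMS_BASE_URL", "https://api.swarms.world")
API_KEY = os.environ.get("SWARMS_API_KEY", "")


def build_request(path: str, method: str = "GET") -> urllib.request.Request:
    """Build an authenticated request for an API path such as /v1/swarm/logs."""
    url = BASE_URL.rstrip("/") + "/" + path.lstrip("/")
    headers = {"x-api-key": API_KEY, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method=method)


def get_json(path: str, timeout: float = 30.0):
    """Send a GET request and return (parsed JSON body, response headers)."""
    with urllib.request.urlopen(build_request(path), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
        return body, dict(resp.headers)
```

Returning the response headers alongside the body means the same call can feed the rate-limit checks later in this guide, with no extra request.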
Step 2: Pull Last-24h Logs, Aggregated by Swarm Type
The single most useful operator script: what ran in the last 24 hours, grouped by swarm_type, with cost and error counts per group.
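One way to sketch this aggregation in Python. It works on a list of already-fetched log entries so the logic can be exercised offline. The `created_at`, `cost`, and `status` field names, and the convention that any status other than `success` counts as an error, are assumptions; check them against the Swarm Logs schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def field(entry: dict, name: str, default=None):
    """Read a field from the top level, falling back to the nested `data` object."""
    if entry.get(name) is not None:
        return entry[name]
    data = entry.get("data")
    if isinstance(data, dict) and data.get(name) is not None:
        return data[name]
    return default


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; treat naive values as UTC."""
    ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def summarize_last_24h(logs: list, now: datetime) -> dict:
    """Group entries from the 24 hours before `now` by swarm_type."""
    cutoff = now - timedelta(hours=24)
    groups = defaultdict(lambda: {"requests": 0, "cost": 0.0, "errors": 0})
    for entry in logs:
        created = field(entry, "created_at")  # assumed field name
        if created is None or parse_ts(created) < cutoff:
            continue
        group = groups[field(entry, "swarm_type", "unknown")]
        group["requests"] += 1
        group["cost"] += float(field(entry, "cost", 0.0))  # assumed field name
        if field(entry, "status", "success") != "success":  # assumed convention
            group["errors"] += 1
    return dict(groups)


def print_breakdown(summary: dict) -> None:
    """Print one line per swarm_type, most expensive first."""
    ordered = sorted(summary.items(), key=lambda kv: kv[1]["cost"], reverse=True)
    for name, g in ordered:
        rate = g["errors"] / g["requests"]
        print(f"{name:<24} requests={g['requests']:>5} "
              f"cost=${g['cost']:.4f} error_rate={rate:.1%}")
```

Pass `datetime.now(timezone.utc)` as `now` in production; taking it as a parameter keeps the function deterministic in tests.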
The exact shape of each log entry can vary slightly: swarm_type, agent_name, and model_name may appear at the top level or nested under data. Code that reads these fields should handle both locations. See the Swarm Logs reference for the full schema.
Step 3: Reconcile Logs Against the Daily Rollup
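A small sketch of the reconciliation check. It assumes you have already summed per-request cost from the logs for one day and read that day's total from the usage report; the function name and the default tolerance are hypothetical, and the report's cost field name is not specified here.

```python
def reconcile(log_total: float, rollup_total: float, tolerance: float = 0.01) -> dict:
    """Compare summed per-request cost against the daily rollup total.

    A positive difference means the rollup recorded more cost than the logs,
    which is the pattern you would expect if log entries are missing.
    """
    diff = rollup_total - log_total
    return {
        "log_total": round(log_total, 6),
        "rollup_total": round(rollup_total, 6),
        "difference": round(diff, 6),
        "within_tolerance": abs(diff) <= tolerance,
    }
```

Choose the tolerance from your own rounding behaviour; a fixed currency amount works for small accounts, while a percentage of the rollup total may suit high-volume ones better.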
Logs are per-request; the usage report is the authoritative daily rollup. They should agree within rounding. Use the comparison to catch log entries that went missing during high-volume periods.
Step 4: Check Credits Before a Batch Job
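A sketch of a pre-flight credit check. It assumes the credits response is a JSON object with a `total_credits` field (the name used in the cron outline later in this guide); the exception class, the function name, and the 20% safety margin are illustrative choices.

```python
class InsufficientCredits(RuntimeError):
    """Raised when the balance does not cover the planned batch."""


def require_headroom(credits: dict, estimated_cost: float, margin: float = 1.2) -> float:
    """Return the spare balance, or raise before any job is submitted."""
    balance = float(credits["total_credits"])  # field name assumed from this guide
    required = estimated_cost * margin
    if balance < required:
        raise InsufficientCredits(
            f"balance {balance:.2f} is below required {required:.2f}"
        )
    return balance - required
```

Call it once with the parsed body of GET /v1/account/credits before entering the submit loop, and let the exception abort the run.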
The cheapest production incident to avoid is “the batch job stopped halfway because credits ran out.” One call before the submit loop is enough.
Step 5: Watch Rate-Limit Headers in Flight
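A sketch of reading the per-minute rate-limit headers from any response and turning them into a pause. Only `X-RateLimit-Remaining-Minute` and `X-RateLimit-Limit-Minute` are taken from this guide; the 20% floor, the linear pause, and the function names are assumptions for illustration.

```python
from typing import Mapping, Optional


def rate_limit_ratio(headers: Mapping[str, str]) -> Optional[float]:
    """Return remaining/limit for the per-minute window, or None if unavailable."""
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        remaining = int(lowered["x-ratelimit-remaining-minute"])
        limit = int(lowered["x-ratelimit-limit-minute"])
    except (KeyError, ValueError):
        return None
    return remaining / limit if limit > 0 else None


def suggested_pause(ratio: Optional[float], floor: float = 0.2,
                    max_pause: float = 10.0) -> float:
    """Scale the pause linearly from 0 s at the floor to max_pause at zero remaining."""
    if ratio is None or ratio >= floor:
        return 0.0
    return max_pause * (1.0 - ratio / floor)
```

Header names are matched case-insensitively because HTTP libraries differ in how they normalise them.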
The headers are on every authenticated response, including errors. You do not need a separate call. Log them after every request and feed the data into your throttling logic.
The Audit-Trail Value
For enterprise and regulated workloads (healthcare, financial services, legal, defense), the per-request log is not a nice-to-have. It’s the artifact your compliance team needs to answer post-hoc questions like:
- “Show every agent invocation that touched patient X’s data between March 1 and March 15.”
- “Reconstruct the chain of agent outputs that produced this trade recommendation.”
- “Produce the model name, system prompt, and output for the decision made at 14:32 UTC.”
The /v1/swarm/logs endpoint is filtered to your API key and excludes client IP addresses for privacy, but otherwise retains the request shape, the model invoked, the response time, and the cost. Combined with deterministic agent configs (low temperature, pinned model names, fixed max_loops), it gives you a reproducible record per agent call, which is what most regulators actually want.
The platform’s existing log retention is suitable for debugging and operational analytics. For workloads with formal retention requirements (GxP, HIPAA, SOX, SR 11-7), export logs to your own storage on a daily cadence — the Swarm Logs examples show CSV/JSON/compressed export patterns.
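As an illustration of a daily export, the sketch below writes one day's log entries to a gzip-compressed JSON Lines file and reads them back. The file naming scheme and function names are hypothetical, and this is not the export pattern from the Swarm Logs examples; uploading the file to S3 or another store is left to your own tooling.

```python
import gzip
import json
from pathlib import Path


def export_logs(entries: list, day: str, out_dir: str) -> Path:
    """Write one day's log entries to <out_dir>/swarm-logs-<day>.jsonl.gz."""
    target = Path(out_dir) / f"swarm-logs-{day}.jsonl.gz"
    target.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(target, "wt", encoding="utf-8") as fh:
        for entry in entries:
            # sort_keys keeps the serialized form stable across runs
            fh.write(json.dumps(entry, sort_keys=True) + "\n")
    return target


def read_export(path) -> list:
    """Load an exported file back into a list of dicts, e.g. for an audit query."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

One file per day keeps retention policies simple: lifecycle rules on the storage bucket can then expire or lock objects by date.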
Putting It Together: Daily Operator Cron
A pragmatic daily cron looks like this:
- 00:05 UTC: pull /v1/usage/report?period=day for yesterday; record total cost in your warehouse
- 00:10 UTC: pull /v1/swarm/logs; archive yesterday’s entries to S3 / your log lake; aggregate by swarm_type and model_name for finance
- 00:15 UTC: pull /v1/account/credits; alert if total_credits < daily_budget * 7
- Continuously: every production request logs its X-RateLimit-Remaining-Minute; alert if a rolling 5-minute average drops below 20% of X-RateLimit-Limit-Minute
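The two alert conditions in the list above can be sketched as follows. The thresholds (7 days of budget, 20% over 5 minutes) come from the list; the class and function names are hypothetical, and how you deliver the alert is up to your own stack.

```python
from collections import deque


def credits_alert(total_credits: float, daily_budget: float, runway_days: int = 7) -> bool:
    """True when the balance covers fewer than `runway_days` days of budget."""
    return total_credits < daily_budget * runway_days


class RollingRatio:
    """Rolling average of remaining/limit over a fixed time window."""

    def __init__(self, window_seconds: float = 300.0):
        self.window = window_seconds
        self.samples = deque()  # (timestamp in seconds, remaining/limit)

    def add(self, timestamp: float, remaining: int, limit: int) -> None:
        """Record one response's header values and drop samples outside the window."""
        self.samples.append((timestamp, remaining / limit))
        while self.samples and self.samples[0][0] < timestamp - self.window:
            self.samples.popleft()

    def average(self) -> float:
        """Mean ratio over the window; 1.0 (no pressure) when there are no samples."""
        if not self.samples:
            return 1.0
        return sum(ratio for _, ratio in self.samples) / len(self.samples)

    def should_alert(self, threshold: float = 0.2) -> bool:
        """True when the windowed average falls below the threshold."""
        return bool(self.samples) and self.average() < threshold
```

Feed `add` from the same place that logs the headers, using `time.time()` as the timestamp, so the alert needs no extra API calls.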
Next Steps
- Read Swarm Logs & API History for filtering, export, and the full log schema
- Read Usage Report for daily-rollup query parameters and the response schema
- Read Rate Limit Headers for tier thresholds and the full header set
- Read the Production Readiness Checklist to wire these signals into a complete production wrapper