What This Example Shows
- A `HierarchicalSwarm` with a Triage Manager routing tickets to one of five specialists: Billing, Technical, Account, Product Liaison, Retention
- A Quality Reviewer agent that scores every draft for tone and confidence, then sets an escalation flag
- A complete `/v1/swarm/completions` call for a realistic ticket, with subject, body, and suggested next action returned as a structured reply
- A Flask webhook that catches new Intercom conversations, runs the swarm, and either auto-replies via the Intercom REST API or drops the ticket into a human queue
- A `/v1/swarm/batch/completions` pattern for clearing a 500-ticket overnight backlog for under $10
- The cost math: ~$0.02 per ticket vs. ~$1.50 for a fully loaded tier-1 agent
This tutorial uses `HierarchicalSwarm` and `/v1/swarm/batch/completions`, both included on every paid Swarms tier. For production-volume webhook traffic and overnight batches above a few thousand tickets, upgrade at https://swarms.world/platform/account for the parallel execution and rate-limit headroom you need.
Why This Matters
A fully loaded tier-1 support agent runs ~$50K/yr and clears about 30 tickets a day. At ~$0.02 per ticket, a swarm running on this stack handles 30,000 tickets a day for about $600. The job is not to fire the support team; it is to put a customer-ready, on-brand drafted reply in front of your humans so they approve-and-send in five seconds instead of writing from scratch in five minutes. That single shift takes your team from drowning in queue to ahead of SLA, and it handles the boring 80% of tickets autonomously so the humans get their afternoon back for the cases that actually need them. Every SaaS company on Earth has this problem and almost none of them have shipped a real fix.
The Architecture
Step 1: Setup
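A minimal configuration sketch for the steps that follow. It assumes the key is exported as an environment variable named `SWARMS_API_KEY`, that the API is served from `https://api.swarms.world`, and that it authenticates with an `x-api-key` header. None of those three details are stated in this tutorial, so confirm them against the Swarms API reference. The Step 4 webhook additionally needs `flask` installed.

```python
import os
from typing import Dict, Optional

# Assumed base URL; check the Swarms API reference before relying on it.
BASE_URL = "https://api.swarms.world"


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers, reading the key from the environment if omitted."""
    key = api_key or os.environ.get("SWARMS_API_KEY")  # assumed variable name
    if not key:
        raise RuntimeError("Set SWARMS_API_KEY before calling the Swarms API")
    # Header name is an assumption about how the API authenticates.
    return {"x-api-key": key, "Content-Type": "application/json"}
```

Failing fast on a missing key keeps a misconfigured deploy from silently dropping tickets later in the pipeline.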
Install the dependencies and grab your API key from https://swarms.world/platform/api-keys.
Step 2: Define the Triage + Specialist + Quality Team
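One way to express the seven-agent team for this step is as plain dictionaries ready to drop into a request body. This is a sketch under assumptions: the key names (`agent_name`, `description`, `system_prompt`, `model_name`, `max_loops`), the specialist names not mentioned elsewhere in this tutorial, and every prompt string are illustrative placeholders. Only the model assignments and the `escalate_to_human` and `suggested_next_action` field names come from the tutorial itself.

```python
# Hypothetical roster: names map to the topics each specialist covers.
SPECIALISTS = {
    "Billing Specialist": "charges, invoices, and refunds",
    "Technical Engineer": "bugs, errors, and integration problems",
    "Account Specialist": "logins, permissions, and profile changes",
    "Product Liaison": "feature requests and roadmap questions",
    "Retention Specialist": "cancellations and downgrade requests",
}

# Strict reply shape so the Quality Reviewer can grade every draft the same way.
REPLY_FORMAT = (
    "Reply with exactly three labelled fields: subject, body, "
    "suggested_next_action. Quote the ticket ID in the body."
)


def make_agent(name, description, system_prompt, model_name):
    """Build one agent spec; the key names are assumed, not confirmed."""
    return {
        "agent_name": name,
        "description": description,
        "system_prompt": system_prompt,
        "model_name": model_name,
        "max_loops": 1,
    }


def build_agents():
    """Return the triage manager, five specialists, and the quality reviewer."""
    triage = make_agent(
        "Triage Manager",
        "Routes each ticket to exactly one specialist",
        "Read the ticket and delegate it to the single best-fit specialist: "
        + ", ".join(SPECIALISTS) + ".",
        "gpt-4.1",
    )
    specialists = [
        make_agent(
            name,
            f"Drafts replies about {scope}",
            f"You handle {scope}. {REPLY_FORMAT}",
            "gpt-4.1-mini",
        )
        for name, scope in SPECIALISTS.items()
    ]
    reviewer = make_agent(
        "Quality Reviewer",
        "Scores the draft and decides on escalation",
        "Score the draft for tone and confidence, then return JSON with the "
        "keys tone_score, confidence, and escalate_to_human.",
        "gpt-4.1",
    )
    return [triage, *specialists, reviewer]
```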
Seven agents. The Triage Manager owns routing. Each specialist drafts a customer-ready reply in a strict format the Quality Reviewer can grade. The Quality Reviewer decides whether the draft auto-sends or goes to a human.
The Triage Manager runs on `gpt-4.1` because misrouting cascades into the wrong specialist and burns the whole pipeline. Specialists run on `gpt-4.1-mini`: they are doing template-shaped writing and the cost difference is what makes $0.02/ticket possible. The Quality Reviewer goes back to `gpt-4.1` because the escalation decision is load-bearing.
Step 3: Process a Single Ticket
Build a realistic ticket, post it to `/v1/swarm/completions`, and pull the Quality Reviewer’s JSON out of the response.
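The three moves in that sentence can be sketched as follows. The request body keys (`name`, `swarm_type`, `agents`, `max_loops`, `task`), the base URL, and the `x-api-key` header are assumptions about the API, and the response shape is not documented here, so the sketch walks the whole response for strings instead of relying on a specific field. `agents` stands for whatever agent list you defined in Step 2.

```python
import json
import urllib.request

BASE_URL = "https://api.swarms.world"  # assumed host


def build_swarm_request(ticket, agents):
    """Wrap one ticket in a swarm request body (key names are assumptions)."""
    task = (
        f"Ticket ID: {ticket['id']}\n"
        f"Subject: {ticket['subject']}\n\n{ticket['body']}"
    )
    return {
        "name": "customer-support-swarm",
        "swarm_type": "HierarchicalSwarm",
        "agents": agents,
        "max_loops": 1,
        "task": task,
    }


def run_swarm(payload, api_key):
    """POST the payload to /v1/swarm/completions and return the parsed JSON."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/swarm/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)


def collect_text(node):
    """Join every string found anywhere in a nested response."""
    if isinstance(node, str):
        return node
    if isinstance(node, dict):
        return "\n".join(collect_text(v) for v in node.values())
    if isinstance(node, list):
        return "\n".join(collect_text(v) for v in node)
    return ""


def extract_review(text):
    """Return the last JSON object in `text` that has an escalate_to_human key."""
    decoder = json.JSONDecoder()
    found = None
    idx = text.find("{")
    while idx != -1:
        try:
            obj, end = decoder.raw_decode(text, idx)
        except json.JSONDecodeError:
            idx = text.find("{", idx + 1)
            continue
        if isinstance(obj, dict) and "escalate_to_human" in obj:
            found = obj
            idx = text.find("{", end)
        else:
            # Step inside objects that lack the key in case the review is nested.
            idx = text.find("{", idx + 1)
    if found is None:
        raise ValueError("no Quality Reviewer JSON found in the response")
    return found
```

A typical call chain would be `extract_review(collect_text(run_swarm(payload, api_key)))`. Treating a `ValueError` here as an automatic escalation keeps unparseable output away from customers.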
`escalate_to_human` is the only field that decides what happens next; every downstream branch reads from there.
Step 4: Wire It to Intercom (or Zendesk) via Webhook
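A hedged sketch of the webhook for this step. It assumes Intercom delivers the conversation under `data.item` with `id`, `source.subject`, and `source.body`, that replies go to `POST /conversations/{id}/reply` and assignments to `POST /conversations/{id}/parts`, and that the token is in an `INTERCOM_TOKEN` environment variable. Verify all of these against the Intercom API reference. `run_swarm_for_ticket` is a hypothetical callable that returns a dict with `escalate_to_human` and the drafted `body`.

```python
import json
import os
import urllib.request

INTERCOM_API = "https://api.intercom.io"


def plan_intercom_call(conversation_id, result, bot_admin_id, human_admin_id):
    """Choose the Intercom request from the escalation flag.

    A missing flag or an empty draft is treated as an escalation.
    """
    if result.get("escalate_to_human", True) or not result.get("body"):
        return f"/conversations/{conversation_id}/parts", {
            "message_type": "assignment",
            "type": "admin",
            "admin_id": bot_admin_id,
            "assignee_id": human_admin_id,
        }
    return f"/conversations/{conversation_id}/reply", {
        "message_type": "comment",
        "type": "admin",
        "admin_id": bot_admin_id,
        "body": result["body"],
    }


def intercom_post(path, body):
    """Send one authenticated POST to the Intercom REST API."""
    req = urllib.request.Request(
        INTERCOM_API + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['INTERCOM_TOKEN']}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def create_app(run_swarm_for_ticket, bot_admin_id, human_admin_id):
    # Imported here so the helpers above stay importable without Flask.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.post("/webhooks/intercom")
    def intercom_webhook():
        event = request.get_json(silent=True) or {}
        item = event.get("data", {}).get("item", {})
        conversation_id = item.get("id")
        if not conversation_id:
            return jsonify({"status": "ignored"}), 200
        source = item.get("source") or {}
        ticket = {
            "id": conversation_id,
            "subject": source.get("subject") or "",
            "body": source.get("body") or "",
        }
        result = run_swarm_for_ticket(ticket)
        path, body = plan_intercom_call(
            conversation_id, result, bot_admin_id, human_admin_id
        )
        intercom_post(path, body)
        return jsonify({"status": "ok", "escalated": path.endswith("/parts")}), 200

    return app
```

Keeping the routing decision in `plan_intercom_call` makes it testable without a running server. This sketch runs the swarm inline; at real volume you would likely acknowledge the webhook immediately and process the ticket from a queue.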
Drop the swarm behind a Flask webhook. Intercom POSTs every new conversation to your endpoint, you run the swarm, and you either reply via the Intercom REST API or assign the conversation to a human teammate based on the escalation flag.
For Zendesk, swap the Intercom calls for the equivalent ticket update: `PUT /api/v2/tickets/{id}.json` with `{"ticket": {"comment": {"body": "<reply>", "public": true}}}` for the reply, and the same endpoint with `{"ticket": {"assignee_id": <human_id>}}` for the escalation. The swarm layer does not change.
Step 5: Batch Mode for Existing Backlog
When the team comes back from a long weekend with 500 tickets stacked up, you do not run the webhook 500 times. You fan them out through `/v1/swarm/batch/completions` and clear the queue overnight.
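A rough sketch of that fan-out, assuming the batch endpoint accepts a JSON array of swarm specs and returns a JSON array of results in the same order. The host, the header name, the spec keys, and the chunk size of 50 are all assumptions to check against the batch endpoint documentation.

```python
import json
import urllib.request

BATCH_URL = "https://api.swarms.world/v1/swarm/batch/completions"  # assumed host


def build_batch(tickets, agents):
    """Build one swarm spec per ticket (key names are assumptions)."""
    return [
        {
            "name": f"support-ticket-{t['id']}",
            "swarm_type": "HierarchicalSwarm",
            "agents": agents,
            "max_loops": 1,
            "task": f"Ticket ID: {t['id']}\nSubject: {t['subject']}\n\n{t['body']}",
        }
        for t in tickets
    ]


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_backlog(tickets, agents, api_key, chunk_size=50):
    """Send the backlog in chunks and collect the results in order."""
    results = []
    for chunk in chunked(tickets, chunk_size):
        req = urllib.request.Request(
            BATCH_URL,
            data=json.dumps(build_batch(chunk, agents)).encode("utf-8"),
            headers={"x-api-key": api_key, "Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=600) as resp:
            # Assumes the response body is a list with one entry per spec.
            results.extend(json.load(resp))
    return results
```

Chunking keeps each request body small and lets a failed chunk be retried without resubmitting the whole backlog.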
Real Cost vs. Tier-1 Support
| Scenario | Cost per ticket | Cost per month (10K tickets) | Throughput |
|---|---|---|---|
| Customer support swarm (gpt-4.1 + 4.1-mini) | ~$0.02 | ~$200 | minutes per batch |
| Tier-1 agent (fully loaded ~$50k, ~30/day) | ~$1.50 | ~$15,000 | bounded by headcount |
| BPO outsourced floor | ~$0.80–$2.50 | ~$8,000–$25,000 | hours to days |
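The monthly figures in the first two rows follow directly from the per-ticket costs. A short worked check, using only the numbers stated in this tutorial and working in cents to avoid floating-point drift:

```python
SWARM_CENTS_PER_TICKET = 2      # ~$0.02 per ticket
TIER1_CENTS_PER_TICKET = 150    # ~$1.50 per ticket


def cost_usd(tickets, cents_per_ticket):
    """Total cost in dollars for a given ticket volume."""
    return tickets * cents_per_ticket / 100


monthly_swarm = cost_usd(10_000, SWARM_CENTS_PER_TICKET)
monthly_tier1 = cost_usd(10_000, TIER1_CENTS_PER_TICKET)
backlog = cost_usd(500, SWARM_CENTS_PER_TICKET)
ratio = TIER1_CENTS_PER_TICKET / SWARM_CENTS_PER_TICKET

print(f"Swarm, 10K tickets/month:  ${monthly_swarm:,.0f}")   # $200
print(f"Tier-1, 10K tickets/month: ${monthly_tier1:,.0f}")   # $15,000
print(f"500-ticket backlog:        ${backlog:,.0f}")         # $10
print(f"Per-ticket cost ratio:     {ratio:.0f}x")            # 75x
```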
Guardrails
These rules belong in code, not in the prompt: the prompt is a soft constraint, the code is hard.
- Never auto-send for refunds above $X. Hard-cap it. Read the dollar amount out of `suggested_next_action` with a regex and force escalation if it exceeds your finance team’s pre-approved threshold.
- Never auto-send for legal, compliance, or media topics. Match on keywords (`lawsuit`, `attorney`, `GDPR`, `HIPAA`, `breach`, `press`, `journalist`) and force escalation regardless of confidence.
- Always include the original ticket ID in the reply. Customers reference it, your support team searches by it, your audit log requires it. The specialist prompts include it, but verify in code before sending.
- Rate-limit auto-replies per customer. If you have already auto-sent two replies on a thread, the third one is escalated by default — a customer who is still responding usually needs a human.
- Log every swarm decision. Persist the Quality Reviewer JSON, the specialist who drafted, and the model usage to your warehouse. The first time you debate “is the swarm getting better or worse,” that table is the only thing that matters.
- Run a shadow week before going live. Send every ticket through the swarm, drop every result into an internal note, never auto-send. Your support leads grade a sample of 200 drafts. Ship when the approval rate clears the bar you set.
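The first four rules above can be enforced with one pure function that runs after the Quality Reviewer and before anything is sent. This is a sketch: the `REFUND_CAP_USD` value stands in for the unspecified $X and is purely hypothetical, and the function and argument names are illustrative.

```python
import re

REFUND_CAP_USD = 100.0   # hypothetical stand-in for your finance team's $X
MAX_AUTO_REPLIES = 2     # a third reply on the same thread goes to a human
SENSITIVE_KEYWORDS = (
    "lawsuit", "attorney", "GDPR", "HIPAA", "breach", "press", "journalist",
)

# Matches amounts such as $25, $1,250, or $1,250.00.
_AMOUNT = re.compile(r"\$\s*([0-9][0-9,]*(?:\.[0-9]{1,2})?)")
# Whole-word, case-insensitive match so "express" does not trigger "press".
_SENSITIVE = re.compile(
    r"\b(" + "|".join(map(re.escape, SENSITIVE_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def escalation_reasons(ticket_id, ticket_text, reply_body,
                       suggested_next_action, prior_auto_replies):
    """Return every hard rule the draft violates; an empty list means it may send."""
    reasons = []
    amounts = [
        float(raw.replace(",", ""))
        for raw in _AMOUNT.findall(suggested_next_action)
    ]
    if any(amount > REFUND_CAP_USD for amount in amounts):
        reasons.append("refund_over_cap")
    if _SENSITIVE.search(ticket_text) or _SENSITIVE.search(reply_body):
        reasons.append("sensitive_topic")
    if str(ticket_id) not in reply_body:
        reasons.append("missing_ticket_id")
    if prior_auto_replies >= MAX_AUTO_REPLIES:
        reasons.append("auto_reply_limit")
    return reasons
```

A caller would auto-send only when the reviewer's `escalate_to_human` is false and this list is empty. Returning the reasons rather than a bare boolean gives the decision log something concrete to record.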
Next Steps
- See the Hierarchical Workflow Example for the director-and-workers pattern in more depth
- Read Tools in Swarms to give the Billing Specialist a real Stripe refund tool and the Technical Engineer a real log-search tool
- Browse Batch Swarm Completions for the underlying batch endpoint mechanics and rate-limit guidance