
What This Example Shows

  • Using the Swarms API as a drop-in OpenAI replacement with zero code changes beyond base_url and api_key
  • Non-streaming and streaming completions
  • Multi-turn conversations with conversation history
  • Sending images (multimodal / vision)
  • Multi-loop agent reasoning via max_loops
  • Error handling with SDK-native exception classes
  • Working examples in Python, TypeScript, Go, and Rust
This endpoint uses the standard OpenAI request/response schema. Any existing OpenAI SDK code works by changing two config values. See the API Reference for the full schema.

Prerequisites

Install the OpenAI Python SDK and python-dotenv:
pip install openai python-dotenv
Set your API key as an environment variable:
export SWARMS_API_KEY="your-api-key-here"

1. Basic Chat Completion

The simplest usage — send a message, get a response.
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.environ["SWARMS_API_KEY"],
    base_url="https://api.swarms.world/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a senior financial analyst."},
        {
            "role": "user",
            "content": "Summarize the key risks of investing in emerging market bonds.",
        },
    ],
    max_tokens=512,
    temperature=0.3,
)

print(response.choices[0].message.content)
print(f"\nUsage: {response.usage.prompt_tokens} in / {response.usage.completion_tokens} out")

Expected Output

Emerging market bonds carry several key risks:

1. **Currency Risk** — Local currency bonds can lose value when the
   issuing country's currency depreciates against the investor's home currency...
2. **Political and Sovereign Risk** — ...
3. **Liquidity Risk** — ...

Usage: 38 in / 247 out
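The intro bullets mention error handling; one common pattern is retrying transient failures with exponential backoff. A minimal sketch, where `with_retries` is an illustrative helper (not part of either SDK). In real code you would pass the OpenAI SDK's `RateLimitError` and `APIConnectionError` as `retryable`; the broad default here is only to keep the sketch self-contained:

```python
import time


def with_retries(call, max_attempts=3, base_delay=1.0, retryable=(Exception,)):
    """Run call() and retry on retryable exceptions with exponential backoff.

    In practice, set retryable to transient SDK errors, e.g.
    (openai.RateLimitError, openai.APIConnectionError).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Wrap the completion call like so: `with_retries(lambda: client.chat.completions.create(model="gpt-4o", messages=messages))`.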

2. Streaming Responses

Stream the response as it’s generated for a better user experience on longer outputs.
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.environ["SWARMS_API_KEY"],
    base_url="https://api.swarms.world/v1",
)

print("Agent: ", end="", flush=True)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a creative fiction writer."},
        {"role": "user", "content": "Write a 3-paragraph short story about a robot discovering music for the first time."},
    ],
    max_tokens=1024,
    temperature=0.8,
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

print("\n\n--- Stream complete ---")
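The chunk-handling loop above can be factored into a small generator so that printing and accumulating the full text share one code path. A sketch; `iter_text` is a hypothetical helper, not part of the SDK:

```python
def iter_text(stream):
    """Yield only the non-empty text deltas from a chat-completions stream."""
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:  # skip None/empty deltas (e.g. role-only first chunk)
            yield content


# Usage: print while also keeping the full text for conversation history.
# pieces = []
# for text in iter_text(stream):
#     print(text, end="", flush=True)
#     pieces.append(text)
# full_reply = "".join(pieces)
```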

3. Multi-Turn Conversation (Chatbot)

Build a conversational chatbot by accumulating messages across turns.
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.environ["SWARMS_API_KEY"],
    base_url="https://api.swarms.world/v1",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful coding assistant. "
            "When the user asks a question, provide clear, concise answers with code examples."
        ),
    },
]

# Turn 1
messages.append({"role": "user", "content": "How do I read a JSON file in Python?"})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})
print(f"Assistant: {assistant_reply}\n")

# Turn 2 — follows up on the first answer
messages.append({"role": "user", "content": "How do I handle the case where the file doesn't exist?"})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})
print(f"Assistant: {assistant_reply}\n")

# Turn 3 — references context from both prior turns
messages.append({"role": "user", "content": "Now combine both into a single reusable function."})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = response.choices[0].message.content
print(f"Assistant: {assistant_reply}")
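Appending every turn forever will eventually exceed the model's context window. One common mitigation is to keep the system prompt plus only the most recent messages. A hedged sketch; `trim_history` and the cutoff of six messages are illustrative choices, not an SDK feature:

```python
def trim_history(messages, max_messages=6):
    """Keep all system messages plus the most recent max_messages others."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]
```

Call it just before each request: `client.chat.completions.create(model="gpt-4o", messages=trim_history(messages))`. Note that truncation drops older context, so very long sessions may need summarization instead.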

4. Vision (Image Input)

Send an image alongside your prompt using the multimodal content format.
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.environ["SWARMS_API_KEY"],
    base_url="https://api.swarms.world/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image? Describe it in detail."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"
                    },
                },
            ],
        }
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
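The example above references a public URL. For local files, the same `image_url` field also accepts a base64 data URL, mirroring the standard OpenAI multimodal format. A small stdlib-only sketch (the helper name is illustrative; size limits for inline images are not documented here, so be cautious with large files):

```python
import base64


def image_to_data_url(path, mime="image/jpeg"):
    """Encode a local image file as a data URL for the image_url field."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Usage, in place of the public URL above:
# {"type": "image_url", "image_url": {"url": image_to_data_url("chart.png", "image/png")}}
```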

5. Multi-Loop Reasoning

Use max_loops to let the agent iterate on its own output — useful for complex analysis, self-correction, or multi-step reasoning. Pass it via extra_body in the OpenAI SDK.
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.environ["SWARMS_API_KEY"],
    base_url="https://api.swarms.world/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a rigorous code reviewer. "
                "First write a solution, then review it for bugs, "
                "then provide the final corrected version."
            ),
        },
        {
            "role": "user",
            "content": "Write a Python function that finds the longest palindromic substring in a string.",
        },
    ],
    max_tokens=2048,
    extra_body={"max_loops": 3},
)

print(response.choices[0].message.content)
print(f"\nTokens: {response.usage.total_tokens}")
max_loops is a Swarms extension — not part of the OpenAI spec. Default is 1 (single pass). The agent runs the specified number of reasoning loops, refining its output each iteration.

6. Putting It All Together — Research Assistant

A complete example that combines system prompts, multi-turn conversation, and streaming to build a simple research assistant.
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.environ["SWARMS_API_KEY"],
    base_url="https://api.swarms.world/v1",
)

SYSTEM_PROMPT = """You are a research assistant specializing in technology trends.
When asked about a topic:
1. Provide a concise overview
2. List 3-5 key developments
3. Identify potential implications
4. Cite timeframes where relevant
Be specific and data-oriented. Avoid vague generalities."""

messages = [{"role": "system", "content": SYSTEM_PROMPT}]


def ask(question: str) -> str:
    """Send a question and stream the response."""
    messages.append({"role": "user", "content": question})

    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        max_tokens=1024,
        temperature=0.3,
        stream=True,
    )

    full_response = []
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="", flush=True)
            full_response.append(content)

    reply = "".join(full_response)
    messages.append({"role": "assistant", "content": reply})
    print("\n")
    return reply


# Research session
print("=== Research Session: AI Chip Industry ===\n")

ask("What is the current state of the AI chip market in 2025?")
ask("Which startups are challenging NVIDIA's dominance, and what approaches are they taking?")
ask("Based on what you've told me, which of these challengers has the strongest technical moat?")

Environment Setup

Create a .env file in your project directory:
SWARMS_API_KEY=your_api_key_here
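If the key is missing, `os.environ["SWARMS_API_KEY"]` raises a bare KeyError; failing fast with a clearer message is friendlier. A small stdlib-only sketch (`require_env` is an illustrative helper, not part of any SDK):

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return value


# Usage:
# client = OpenAI(api_key=require_env("SWARMS_API_KEY"), base_url="https://api.swarms.world/v1")
```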

Next Steps

  • API Reference — full request/response schema and field documentation
  • Agent Completions — the native Swarms endpoint with tools, MCP, and multi-loop support
  • Streaming — streaming with the native Swarms agent endpoint
  • Vision — more image/multimodal examples