The Agent Completions endpoint (`/v1/agent/completions`) enables you to execute individual AI agents with specific tasks, configurations, and capabilities. It provides a flexible way to run single agents with various models, tools, and settings.
Endpoint Information
- URL: `/v1/agent/completions`
- Method: `POST`
- Authentication: Required (`x-api-key` header)
- Rate Limiting: Subject to tier-based rate limits
Request Schema
AgentCompletion Object
Field | Type | Required | Description |
---|---|---|---|
agent_config | AgentSpec | Yes | Configuration object for the agent |
task | string | Yes | The task or instruction for the agent to execute |
history | Union[Dict, List[Dict]] | No | Conversation history or context for the agent |
img | string | No | Single image URL for vision-enabled models |
imgs | List[string] | No | Multiple image URLs for vision-enabled models |
stream | boolean | No | Enable streaming output (default: false) |
search_enabled | boolean | No | Enable search capabilities (default: false) |
AgentSpec Object
Field | Type | Required | Default | Description |
---|---|---|---|---|
agent_name | string | Yes | - | Unique identifier for the agent |
description | string | No | - | Detailed explanation of agent’s purpose |
system_prompt | string | No | - | Initial instructions guiding agent behavior |
model_name | string | No | "gpt-4.1" | AI model to use (e.g., gpt-4o, gpt-4o-mini, claude-sonnet-4-20250514) |
auto_generate_prompt | boolean | No | false | Auto-generate prompts based on task requirements |
max_tokens | integer | No | 8192 | Maximum tokens for agent responses |
temperature | float | No | 0.5 | Controls response randomness (0.0-2.0) |
role | string | No | "worker" | Agent’s role within a system |
max_loops | integer | No | 1 | Maximum execution iterations |
tools_list_dictionary | List[Dict] | No | - | Custom tools for the agent |
mcp_url | string | No | - | MCP server URL for additional capabilities |
streaming_on | boolean | No | false | Enable streaming output |
llm_args | Dict | No | - | Additional LLM parameters (top_p, frequency_penalty, etc.) |
dynamic_temperature_enabled | boolean | No | true | Dynamic temperature adjustment |
mcp_config | MCPConnection | No | - | Single MCP connection configuration |
mcp_configs | MultipleMCPConnections | No | - | Multiple MCP connections |
tool_call_summary | boolean | No | true | Enable tool call summarization |
Response Schema
AgentCompletionOutput Object
Field | Type | Description |
---|---|---|
job_id | string | Unique identifier for the completion job |
success | boolean | Indicates successful execution |
name | string | Name of the executed agent |
description | string | Agent description |
temperature | float | Temperature setting used |
outputs | any | Generated output from the agent |
usage | Dict | Token usage and cost information |
timestamp | string | ISO timestamp of completion |
Usage Information
The response includes detailed usage metrics in the `usage` field, covering token counts and cost information.
Features and Capabilities
1. Multi-Model Support
- OpenAI Models: gpt-4o, gpt-4o-mini, gpt-4.1
- Anthropic Models: claude-sonnet-4-20250514
- Custom Models: Any model supported by LiteLLM
- Vision Models: Support for image analysis with gpt-4o and compatible models
2. Vision Capabilities
- Single image analysis via the `img` parameter
- Multiple image analysis via the `imgs` parameter
- Automatic image token counting and cost calculation
3. Conversation History
- Maintain context across multiple interactions
- Support for both dictionary and list-based history formats
- Automatic history formatting and token counting
4. Tool Integration
- Built-in search capabilities via `search_enabled`
- MCP (Model Context Protocol) server integration
- Custom tool dictionaries
- Tool call summarization
5. Advanced Configuration
- Dynamic temperature adjustment
- Custom LLM arguments (top_p, frequency_penalty, presence_penalty)
- Streaming output support
- Auto-prompt generation
Examples
Basic Agent Execution
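A minimal request can be sent with just the Python standard library. This sketch assumes the base URL `https://api.swarms.world` and uses an illustrative agent name and task; only `agent_config.agent_name` and `task` are required, with other fields falling back to the defaults in the AgentSpec table above:

```python
import json
import urllib.request

# Assumed base URL -- substitute your deployment's host if it differs.
API_URL = "https://api.swarms.world/v1/agent/completions"

# Minimal AgentCompletion body: agent_config plus a task.
payload = {
    "agent_config": {
        "agent_name": "research-agent",  # illustrative name
        "system_prompt": "You are a concise research assistant.",
        "model_name": "gpt-4o-mini",
        "max_tokens": 1024,
        "temperature": 0.3,
    },
    "task": "Summarize the trade-offs between SQL and NoSQL databases.",
}

def run_agent(api_key: str) -> dict:
    """POST the request and return the AgentCompletionOutput JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

On success, the returned JSON carries `job_id`, `outputs`, and a `usage` block as described in the response schema above.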
Agent with Conversation History
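The `history` field carries prior turns so the agent can answer in context. A sketch of the request body is below; the role/content message shape is an assumption based on common chat formats, and the agent name is illustrative:

```python
# AgentCompletion body with list-of-dicts conversation history;
# a single dict is also accepted per the request schema.
payload = {
    "agent_config": {
        "agent_name": "support-agent",  # illustrative name
        "model_name": "gpt-4o-mini",
    },
    "task": "And how do I rotate that key?",
    "history": [
        {"role": "user", "content": "How do I create an API key?"},
        {"role": "assistant", "content": "Open the platform's API-keys page and click Create."},
    ],
}
# POST this to /v1/agent/completions with the x-api-key header.
```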
Agent with Search Capabilities
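Search is switched on with the top-level `search_enabled` flag from the AgentCompletion schema. A sketch (agent name and task are illustrative):

```python
# search_enabled sits on the AgentCompletion object, not on agent_config.
payload = {
    "agent_config": {
        "agent_name": "news-agent",  # illustrative name
        "model_name": "gpt-4o",
    },
    "task": "What were the top AI research announcements this week?",
    "search_enabled": True,
}
# POST this to /v1/agent/completions with the x-api-key header.
```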
Agent with MCP Integration
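An MCP server is attached through the `mcp_url` field of `agent_config`. The server URL below is hypothetical, and `tool_call_summary` is shown at its documented default:

```python
# agent_config pointing at an MCP server for additional tool capabilities.
payload = {
    "agent_config": {
        "agent_name": "tools-agent",  # illustrative name
        "model_name": "gpt-4o-mini",
        "mcp_url": "https://mcp.example.com/sse",  # hypothetical MCP server URL
        "tool_call_summary": True,  # summarize tool calls in the output
    },
    "task": "Use the connected tools to look up the current weather in Tokyo.",
}
# POST this to /v1/agent/completions with the x-api-key header.
```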
Agent with Custom LLM Arguments
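Extra sampling parameters go into `llm_args` alongside the top-level `temperature`. A sketch with illustrative values:

```python
# llm_args forwards additional parameters (top_p, frequency_penalty,
# presence_penalty) to the underlying model.
payload = {
    "agent_config": {
        "agent_name": "writer-agent",  # illustrative name
        "model_name": "gpt-4o",
        "temperature": 0.9,
        "llm_args": {
            "top_p": 0.9,
            "frequency_penalty": 0.3,
            "presence_penalty": 0.2,
        },
    },
    "task": "Write a short product description for a mechanical keyboard.",
}
# POST this to /v1/agent/completions with the x-api-key header.
```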
Batch Processing
For processing multiple agents simultaneously, use the batch endpoint:
- Endpoint: `/v1/agent/batch/completions`
- Request: Array of `AgentCompletion` objects (max 10 per batch)
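A batch request body is simply a JSON array of AgentCompletion objects. A sketch with illustrative agent names:

```python
# Array of AgentCompletion objects for POST /v1/agent/batch/completions.
batch = [
    {
        "agent_config": {"agent_name": f"analyst-{i}"},  # illustrative names
        "task": f"Analyze dataset shard {i}.",
    }
    for i in range(3)
]

# The endpoint accepts at most 10 completions per batch.
assert len(batch) <= 10
```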
Error Handling
The API returns appropriate HTTP status codes and error messages:
- 400 Bad Request: Invalid input parameters or validation failures
- 401 Unauthorized: Missing or invalid API key
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Server-side processing errors
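Client code can branch on these codes; a minimal sketch separating retryable statuses from those that require changing the request:

```python
# 429 and 500 may succeed if re-sent later (ideally with backoff);
# 400 and 401 indicate the request or credentials must change first.
RETRYABLE = {429, 500}
FATAL = {400, 401}

def should_retry(status: int) -> bool:
    """Return True when the request may succeed if simply re-sent later."""
    return status in RETRYABLE
```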
Rate Limits
Rate limits are tier-based:
- Free Tier: 100 requests/minute, 50 requests/hour, 1,200 requests/day
- Premium Tier: 2000 requests/minute, 10000 requests/hour, 100000 requests/day
Cost Calculation
Costs are calculated based on:
- Input tokens: $4.00 per million tokens
- Output tokens: $12.50 per million tokens
- Image processing: $0.25 per image
- MCP calls: $0.10 per call
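These rates compose linearly, so the cost of a request can be estimated up front:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  images: int = 0, mcp_calls: int = 0) -> float:
    """Estimated USD cost from the per-unit rates listed above."""
    return (
        input_tokens / 1_000_000 * 4.00      # $4.00 per million input tokens
        + output_tokens / 1_000_000 * 12.50  # $12.50 per million output tokens
        + images * 0.25                      # $0.25 per image
        + mcp_calls * 0.10                   # $0.10 per MCP call
    )
```

For example, a request using 10,000 input tokens and 2,000 output tokens costs about $0.065.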
Best Practices
- Agent Naming: Use descriptive, unique names for agents
- System Prompts: Provide clear, specific instructions for consistent behavior
- Temperature Settings: Use lower values (0.1-0.3) for analytical tasks, higher values (0.7-0.9) for creative tasks
- Token Limits: Set appropriate max_tokens based on expected response length
- History Management: Keep conversation history concise to manage token costs
- Error Handling: Implement proper error handling for production applications
- Rate Limiting: Monitor usage and implement backoff strategies for rate limit handling
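The backoff strategy mentioned above can be as simple as capped exponential delays; the base and cap values here are illustrative:

```python
def backoff_delays(retries: int = 5, base: float = 1.0, cap: float = 60.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(retries):
        yield min(cap, base * (2 ** attempt))

# In production, add random jitter to each delay so many clients hitting
# the rate limit do not all retry at the same moment.
```

For example, `list(backoff_delays(4))` yields delays of 1, 2, 4, and 8 seconds.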
Integration Examples
Python SDK Usage
- Install the client: `pip3 install -U swarms-client`
- Set your `SWARMS_API_KEY` environment variable
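The SDK reads the same key from the environment. As a version-independent alternative that avoids guessing the SDK's client surface, the same call can be made with the standard library (the base URL and agent name are assumptions):

```python
import json
import os
import urllib.request

# Read the key from the environment, as recommended above.
API_KEY = os.environ.get("SWARMS_API_KEY", "")
API_URL = "https://api.swarms.world/v1/agent/completions"  # assumed base URL

def complete(task: str, agent_name: str = "my-agent") -> dict:
    """POST a minimal AgentCompletion and return the parsed response."""
    body = {"agent_config": {"agent_name": agent_name}, "task": task}
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```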
JavaScript/Node.js Integration
Support and Resources
- API Keys: https://swarms.world/platform/api-keys
- Technical Support: https://cal.com/swarms/swarms-technical-support
- Community: Discord