Provides access to 100+ LLMs via LiteLLM’s unified API for AI agents, enabling model calls, comparisons, and recommendations.
Configuration
```json
{
  "mcpServers": {
    "berriai-litellm-agent-mcp": {
      "command": "python",
      "args": ["-m", "litellm_agent_mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-..."
      }
    }
  }
}
```
This MCP server lets your AI agents access 100+ language models through LiteLLM’s unified API. It enables you to route tasks to the most suitable model, compare outputs, and optimize costs by selecting the best model for each job.
You will run the MCP server locally and connect your agent to it. Your agent can call any supported model, compare results across models, and request recommendations tailored to tasks like coding, writing, or long-form content. Use the available tools to perform specific actions such as calling a model, comparing outputs, or querying model capabilities.
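Under the hood, MCP clients talk to a locally running server over JSON-RPC 2.0; listing the server's tools is done with the standard `tools/list` method. As a rough sketch of that wire format (the `jsonrpc_request` helper is illustrative, not part of this server):

```python
import json
from typing import Optional


def jsonrpc_request(method: str, params: Optional[dict] = None, req_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request string, the wire format MCP clients use."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


# Ask the server which tools it exposes (model calls, comparisons, etc.).
print(jsonrpc_request("tools/list"))
```

In practice your agent framework's MCP client sends these messages for you; the sketch only shows what crosses the pipe.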
Prerequisites: ensure you have Python installed on your system.
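A quick way to confirm the prerequisite is met (interpreter names vary by system, so both `python` and `python3` are tried):

```shell
# Confirm a Python interpreter is on PATH before installing the server.
python --version 2>/dev/null || python3 --version
```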
Option 1: Install via PyPI and run the MCP server as a Python module.
```bash
pip install litellm-agent-mcp
```
```bash
python -m litellm_agent_mcp
```
To start the MCP server with your preferred API keys, configure it as shown in the example below. This configuration uses the Python runtime to launch the LiteLLM MCP server and passes your API keys as environment variables.
```json
{
  "mcpServers": {
    "litellm": {
      "command": "python",
      "args": ["-m", "litellm_agent_mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-..."
      }
    }
  }
}
```
Keep your API keys secure. Do not share keys in public configurations. Use environment variables or a secret manager to protect the keys used by the MCP server.
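One way to keep keys out of checked-in files is to generate the `mcpServers` entry at runtime, forwarding only the provider keys that are actually set in your environment. A minimal sketch (the `litellm_server_config` helper is hypothetical):

```python
import json
import os


def litellm_server_config() -> dict:
    """Build the mcpServers entry, pulling provider keys from the environment."""
    env = {}
    for var in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
        value = os.environ.get(var)
        if value:  # only forward keys that are actually set
            env[var] = value
    return {
        "mcpServers": {
            "litellm": {
                "command": "python",
                "args": ["-m", "litellm_agent_mcp"],
                "env": env,
            }
        }
    }


# Emit the config for pasting into your agent's MCP settings.
print(json.dumps(litellm_server_config(), indent=2))
```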
Call a specific model for code explanations, compare multiple models for a writing task, or obtain a model recommendation for a given task type. The server exposes tools to perform these actions and return structured outputs.
If the server fails to start, verify you have Python installed, the required dependencies, and that the environment variables are correctly set. Check for port conflicts or missing API keys, and consult the runtime logs for any error messages.
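The checks above can be scripted. A sketch of a small diagnostic that reports a missing module or unset environment variables (the `diagnose` helper is illustrative, not part of the server):

```python
import importlib.util
import os


def diagnose(module, required_env):
    """Return a list of problems that would prevent the MCP server from starting."""
    problems = []
    if importlib.util.find_spec(module) is None:
        problems.append(
            f"module '{module}' is not installed (try: pip install {module.replace('_', '-')})"
        )
    for var in required_env:
        if not os.environ.get(var):
            problems.append(f"environment variable {var} is not set")
    return problems


# Check the LiteLLM MCP module and the provider keys from the config above.
for problem in diagnose("litellm_agent_mcp", ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]):
    print("problem:", problem)
```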
Call any LLM model using the OpenAI chat completions format.
Use OpenAI Responses API format for stateful, structured outputs.
Use Anthropic Messages API format for Claude-style interactions.
Use Google generateContent format for Gemini-style models.
Compare responses across multiple models and identify the best result.
List available models and their strengths.
Get a model recommendation based on a task type.
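The unified call format mentioned above is the OpenAI chat-completions shape: a model name plus a list of role/content messages, which LiteLLM can route to any provider. A minimal sketch of building such a request body (the helper and model name are illustrative):

```python
from typing import Optional


def chat_completion_request(model: str, prompt: str, system: Optional[str] = None) -> dict:
    """Build a request body in the OpenAI chat-completions format."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {"model": model, "messages": messages}


# The same shape works for any model LiteLLM routes to.
body = chat_completion_request("gpt-4o", "Explain what this regex matches: ^\\d+$")
```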