
FastMCP - Model Context Protocol Server


Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "dev00355-custom-mcp": {
      "command": "python",
      "args": [
        "mcp_server.py"
      ],
      "env": {
        "LLM_SERVICE_API_KEY": "YOUR_API_KEY",
        "LOCAL_LLM_SERVICE_URL": "http://localhost:5001"
      }
    }
  }
}

FastMCP is a Model Context Protocol server that connects MCP clients to your local LLM service. It enables seamless, MCP-compliant interactions with options for streaming responses, ready-made prompts, and health checks, making it easy to integrate an OpenAI-compatible LLM into MCP-enabled applications.

How to use

You run the MCP server locally and connect MCP-enabled clients to it. The server exposes a stdio interface and reads its configuration from environment variables, forwarding requests to the LLM service running on your machine. Clients can perform chat completions, list available models, and run health checks, and can use pre-built prompts for common tasks such as assistant workflows, code review, and text summarization.
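Under the hood, MCP clients speak JSON-RPC 2.0 with the server over stdio. As an illustrative sketch of the message shapes involved (framing is simplified here, and exact result formats depend on the MCP SDK your client uses):

```python
import json

# MCP clients exchange JSON-RPC 2.0 messages with the server over stdio.
# This sketch only builds the request payloads; transport framing and
# response handling are left to the client SDK.
def make_request(request_id, method, params):
    """Build a JSON-RPC 2.0 request as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params,
    })

# Ask the server which tools it exposes.
list_tools = make_request(1, "tools/list", {})

# Invoke the chat_completion tool with a minimal conversation.
call_chat = make_request(2, "tools/call", {
    "name": "chat_completion",
    "arguments": {
        "messages": [{"role": "user", "content": "Hello!"}],
    },
})
```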

How to install

Prerequisites

  • Python 3.9+
  • pip
  • Local LLM service running on port 5001 (OpenAI-compatible API)
  • MCP client (for example Claude Desktop or MCP Inspector)

1. Set up a Python virtual environment and activate it

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

2. Install dependencies

pip install -r requirements.txt
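The actual requirements.txt ships with the project; as a rough sketch of what a server like this typically depends on (package names here are assumptions, not the project's real dependency list):

```text
mcp            # MCP Python SDK (server and stdio transport)
httpx          # HTTP client for forwarding requests to the LLM service
python-dotenv  # loads configuration from the .env file
```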

3. Create a .env file with the following environment variables

# Server Settings
MCP_SERVER_NAME=fastmcp-llm-router
MCP_SERVER_VERSION=0.1.0

# LLM Service Configuration
LOCAL_LLM_SERVICE_URL=http://localhost:5001

# Optional: API Key for LLM service
# LLM_SERVICE_API_KEY=your_api_key_here

# Timeouts (in seconds)
LLM_REQUEST_TIMEOUT=60
HEALTH_CHECK_TIMEOUT=10

# Logging
LOG_LEVEL=INFO
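Inside the server, these variables would typically be read with fallbacks matching the defaults above. A minimal sketch (variable names are taken from the file above; the project's actual loading code may differ):

```python
import os

# Read the server's configuration from the environment, falling back to
# the defaults shown in the .env example above.
LLM_URL = os.getenv("LOCAL_LLM_SERVICE_URL", "http://localhost:5001")
API_KEY = os.getenv("LLM_SERVICE_API_KEY")  # optional; None if unset
REQUEST_TIMEOUT = float(os.getenv("LLM_REQUEST_TIMEOUT", "60"))
HEALTH_TIMEOUT = float(os.getenv("HEALTH_CHECK_TIMEOUT", "10"))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
```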

4. Run the MCP server

# Option 1: Using the CLI script
python run_server.py

# Option 2: Direct execution
python mcp_server.py

# Option 3: With custom configuration
python run_server.py --llm-url http://localhost:5001 --log-level DEBUG
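The --llm-url and --log-level flags above suggest a small argparse front-end in run_server.py. A sketch of how such parsing might look (the real script may accept additional options):

```python
import argparse

# Parse the CLI flags shown above; environment variables remain the
# fallback when a flag is not given on the command line.
def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run the FastMCP server")
    parser.add_argument("--llm-url", default=None,
                        help="Override LOCAL_LLM_SERVICE_URL")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="Logging verbosity")
    return parser.parse_args(argv)

args = parse_args(["--llm-url", "http://localhost:5001", "--log-level", "DEBUG"])
```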

Additional configuration and notes

MCP clients connect to the server over stdio. You can set the LLM service URL through environment variables and start the server with a direct Python command. Ensure your LLM service is reachable at the configured URL and that the timeouts suit your environment.

Security and troubleshooting

- Keep your LLM service URL and any API keys protected. Use environment variables to avoid hard-coding sensitive values.

- If requests fail, check that the LLM service is running and accessible at the URL you configured. Verify the network path and port.
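A quick way to rule out connectivity problems is to probe the service URL directly. A stdlib-only sketch (the /v1/models path assumes the OpenAI-compatible API named in the prerequisites):

```python
import urllib.request
import urllib.error

def service_reachable(base_url, timeout=5.0):
    """Return True if the LLM service answers at its models endpoint."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/v1/models",
                                    timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Calling service_reachable("http://localhost:5001") before starting the MCP server confirms the backend is up without involving the MCP layer at all.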

Available tools

chat_completion

Send messages to your LLM service to generate a reply based on a conversation history, with options for model, temperature, and streaming.
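For an OpenAI-compatible backend, the tool's arguments map naturally onto a chat-completions request body. A sketch of that shape (the model name here is a placeholder; use list_models to discover real ones, and treat field names beyond messages, model, temperature, and stream as assumptions):

```python
# Example arguments for the chat_completion tool, mirroring the
# OpenAI-compatible chat-completions request body the server forwards.
arguments = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize MCP in one sentence."},
    ],
    "model": "local-model",  # placeholder; see list_models
    "temperature": 0.7,
    "stream": False,
}
```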

list_models

Query your LLM service to retrieve the list of available models.

health_check

Verify that the LLM service is reachable and responsive.