MCP Gateway Server

Aggregates MCP servers and provides token-optimized access, filtering, and batching across tools.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "abdullah1854-mcpgateway": {
      "url": "https://remote-mcp-server.com/mcp",
      "headers": {
        "HOST": "0.0.0.0",
        "PORT": "3010",
        "API_KEYS": "key1,key2",
        "AUTH_MODE": "none",
        "LOG_LEVEL": "info",
        "QDRANT_URL": "https://your-qdrant-instance.cloud",
        "CORS_ORIGINS": "http://localhost:3010",
        "GATEWAY_NAME": "mcp-gateway",
        "OAUTH_ISSUER": "https://your-oauth-provider.com",
        "ENABLE_CIPHER": "0",
        "ENABLE_SKILLS": "0",
        "ALLOW_INSECURE": "0",
        "CIPHER_API_URL": "http://localhost:8082",
        "OAUTH_AUDIENCE": "mcp-gateway",
        "OAUTH_JWKS_URI": "https://your-oauth-provider.com/.well-known/jwks.json",
        "REMOTE_API_KEY": "<YOUR_API_KEY>",
        "GATEWAY_LITE_MODE": "1",
        "ENABLE_ANTIGRAVITY": "0",
        "ENABLE_CLAUDE_USAGE": "0",
        "RATE_LIMIT_WINDOW_MS": "60000",
        "HEALTH_REQUIRE_BACKENDS": "0",
        "RATE_LIMIT_MAX_REQUESTS": "100"
      }
    }
  }
}

You can run MCP Gateway to unify and optimize access to multiple MCP backends. It aggregates tools from several MCP servers, reduces token usage, and speeds up repeated tasks by handling result filtering, batching, and on-server analytics. This lets you build AI-assisted workflows that scale across many backends while keeping token budgets in check.

How to use

Connect your client (Claude Desktop, Claude Code, Cursor, OpenAI Codex, or VS Code Copilot) to the gateway’s MCP endpoint and route tool calls through it. This gives you a single, token-efficient channel to many MCP servers. Use the gateway to discover tools when your client lacks native discovery, then execute them through the gateway to get on-demand loading, result filtering, and server-side aggregation. For complex workflows, prefer code execution and skills, which minimize prompt and data tokens and let you reuse proven patterns.
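As a rough illustration of the discover-then-execute flow, the sketch below builds the JSON-RPC 2.0 `tools/call` messages that an MCP client sends. The tool names come from the list at the end of this page; the argument names (`query`, `name`) and the `fs_read_file` tool are assumptions made for illustration, so check each tool's schema for the real parameters.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a tools/call request for the given tool and arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Step 1: search for candidate tools. "query" is a hypothetical argument name.
const search = buildToolCall(1, "gateway_search_tools", { query: "filesystem" });

// Step 2: load a single schema just before calling the tool.
// "name" and "fs_read_file" are hypothetical placeholders.
const schema = buildToolCall(2, "gateway_get_tool_schema", { name: "fs_read_file" });

console.log(JSON.stringify(search));
console.log(JSON.stringify(schema));
```

Most MCP clients build these messages for you; the sketch only shows what travels over the wire.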

How to install

Prerequisites: you need Node.js and npm installed on your machine or server.

1) Install dependencies for the gateway.

```bash
npm install
```

2) Configure backend MCP servers by copying the example configuration and editing it. The gateway supports multiple transports (STDIO for local servers and HTTP for remote servers). Modify the servers configuration to include your servers.

```bash
cp config/servers.example.json config/servers.json
```
Edit config/servers.json to add backends. Example with a local filesystem server via STDIO:
```json
{
  "servers": [
    {
      "id": "filesystem",
      "name": "Filesystem",
      "enabled": true,
      "transport": {
        "type": "stdio",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
      },
      "toolPrefix": "fs"
    }
  ]
}
```
To add a remote MCP server, use the HTTP transport. Example entry for the servers array:
```json
{
  "id": "remote-server",
  "name": "Remote MCP Server",
  "enabled": true,
  "transport": {
    "type": "http",
    "url": "https://remote-mcp-server.com/mcp",
    "headers": {
      "Authorization": "Bearer ${REMOTE_API_KEY}"
    }
  },
  "toolPrefix": "remote"
}
```
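The `${REMOTE_API_KEY}` placeholder suggests that header values are filled in from environment variables. How the gateway actually performs this substitution is not documented here, so the following is only a minimal sketch of the general technique; the `expandEnv` helper is hypothetical and not part of the gateway's API.

```typescript
// Hypothetical helper: replace ${NAME} placeholders with values from an
// environment map. Unknown names are left untouched so that a missing
// variable is easy to spot in the resulting string.
function expandEnv(template: string, env: Record<string, string | undefined>): string {
  return template.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match: string, name: string) => {
    const value = env[name];
    return value === undefined ? match : value;
  });
}

// Header value taken from the HTTP transport example.
const headerTemplate = "Bearer ${REMOTE_API_KEY}";

// "example-key" is a placeholder value, not a real credential.
const resolved = expandEnv(headerTemplate, { REMOTE_API_KEY: "example-key" });
console.log(resolved);
```

Keeping the key in the environment rather than in config/servers.json means the file can be committed without exposing credentials.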

3) Start the gateway in development or production mode.

```bash
# Development
npm run dev

# Production
npm run build
npm start
```

Configuration and security notes

The gateway exposes a REST/streaming API at /mcp and a dashboard at /dashboard. Use API keys or OAuth to protect endpoints in production. You can run in insecure mode for isolated local testing, but sensitive endpoints are blocked by default unless you opt in with explicit settings.

Core environment variables influence global gateway behavior, including port, host, authentication, rate limiting, and optional features. Common values are shown in the examples, but replace them with values suitable for your environment.
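To make the shape of these settings concrete, here is a minimal sketch of reading a few of them into a typed object. The variable names are taken from the example configuration above; the fallback values simply mirror that example and are not confirmed gateway defaults, and `loadConfig` is a hypothetical helper rather than the gateway's own loader.

```typescript
interface GatewayConfig {
  host: string;
  port: number;
  authMode: string;
  apiKeys: string[];
  rateLimitWindowMs: number;
  rateLimitMaxRequests: number;
}

type Env = Record<string, string | undefined>;

// Parse an integer setting, falling back when the variable is unset or empty.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) throw new Error(`${key} must be an integer, got "${raw}"`);
  return parsed;
}

// Fallbacks mirror the example configuration (assumed, not confirmed defaults).
function loadConfig(env: Env): GatewayConfig {
  return {
    host: env["HOST"] ?? "0.0.0.0",
    port: intFromEnv(env, "PORT", 3010),
    authMode: env["AUTH_MODE"] ?? "none",
    // API_KEYS is a comma-separated list; drop empty entries and whitespace.
    apiKeys: (env["API_KEYS"] ?? "")
      .split(",")
      .map((key) => key.trim())
      .filter((key) => key.length > 0),
    rateLimitWindowMs: intFromEnv(env, "RATE_LIMIT_WINDOW_MS", 60000),
    rateLimitMaxRequests: intFromEnv(env, "RATE_LIMIT_MAX_REQUESTS", 100),
  };
}

const config = loadConfig({ PORT: "3010", API_KEYS: "key1, key2" });
console.log(JSON.stringify(config));
```

With the example values, the rate limit works out to 100 requests per 60000 ms window.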

Code execution and token efficiency features

Code execution mode lets you write and run code across multiple MCP tools inside a secure sandbox. This reduces token usage by enabling on-server data filtering, aggregation, and result summarization before returning results to the agent.
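The kind of code that benefits from running server-side is usually a small filter-and-aggregate step. The sketch below is illustrative only: the sandbox's API for calling tools is not described here, so an in-memory array stands in for a large tool result, and the `LogEntry` shape is an assumption.

```typescript
// Hypothetical shape of one row in a large tool result.
interface LogEntry {
  level: "info" | "warn" | "error";
  service: string;
  message: string;
}

// Stand-in data; inside the sandbox this would come from a backend tool call.
const rawResult: LogEntry[] = [
  { level: "info", service: "api", message: "started" },
  { level: "error", service: "api", message: "timeout" },
  { level: "error", service: "db", message: "connection refused" },
  { level: "warn", service: "db", message: "slow query" },
  { level: "error", service: "api", message: "bad gateway" },
];

// Keep only errors and reduce them to a per-service count, so the model
// receives a few numbers instead of every log line.
function countErrorsByService(entries: LogEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of entries) {
    if (entry.level !== "error") continue;
    counts[entry.service] = (counts[entry.service] ?? 0) + 1;
  }
  return counts;
}

const summary = countErrorsByService(rawResult);
console.log(JSON.stringify(summary));
```

The token saving comes from returning `summary` rather than `rawResult` to the agent.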

Skills let you save reusable code patterns for zero-shot execution. Create a skill once and execute it many times with different inputs, cutting both token usage and latency for recurring tasks.

Progressive tool discovery loads tool schemas on demand to minimize token usage. Use the provided discovery tools to load only what you need when you need it.
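On the client side, on-demand loading amounts to a cache that fetches a schema the first time a tool is needed. The following is a minimal sketch of that pattern, not the gateway's implementation: `LazySchemaCache` and the `ToolSchema` shape are hypothetical, and the loader is synchronous here for brevity, whereas a real one would call `gateway_get_tool_schema` asynchronously.

```typescript
// Hypothetical, simplified schema shape.
type ToolSchema = { name: string; inputSchema: Record<string, unknown> };
type SchemaLoader = (toolName: string) => ToolSchema;

class LazySchemaCache {
  private readonly cache = new Map<string, ToolSchema>();
  private readonly loader: SchemaLoader;
  // Number of times the loader was actually invoked.
  loads = 0;

  constructor(loader: SchemaLoader) {
    this.loader = loader;
  }

  // Return the cached schema, loading it only on first use.
  get(toolName: string): ToolSchema {
    const cached = this.cache.get(toolName);
    if (cached !== undefined) return cached;
    const schema = this.loader(toolName);
    this.loads += 1;
    this.cache.set(toolName, schema);
    return schema;
  }
}

// Fake loader standing in for a gateway_get_tool_schema call.
const schemaCache = new LazySchemaCache((toolName) => ({
  name: toolName,
  inputSchema: { type: "object" },
}));

schemaCache.get("fs_read_file"); // loads the schema
schemaCache.get("fs_read_file"); // served from the cache
console.log(schemaCache.loads);
```

Tools that are never called never have their schemas loaded, which is where the token saving comes from.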

Endpoints and dashboards overview

Core endpoint for clients: /mcp (Streamable HTTP). Backward compatibility: /sse. Health checks: /health. Administrative dashboard: /dashboard. Metrics: /metrics and /metrics/json.

Code execution and management endpoints are available under /api/code and /dashboard/api for tools, backends, and skills. Use these to search, load schemas, execute tools, and manage configurations.

Tips for AI agents

Start with tool discovery to minimize tokens, then load schemas only when you intend to call a tool. Use code execution and skills to batch complex workflows, and apply result filtering to trim large payloads before sending data back to the model.

Troubleshooting and quick checks

If a backend goes down, use the dashboard to reconnect or restart the backend. Verify that the gateway has access to backends and that authentication tokens are valid. Check the /health endpoint for a quick status overview.
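A quick status check can also be scripted. The sketch below assumes only that /health answers with a 2xx status when the gateway is up; the response body format is not documented here, so it is ignored. The helper names are hypothetical, the base URL uses port 3010 from the example configuration, and the code relies on the global `fetch` and `AbortSignal.timeout` available in recent Node.js releases.

```typescript
// Build the health check URL from the gateway's base URL.
function healthUrl(baseUrl: string): string {
  return new URL("/health", baseUrl).toString();
}

// Resolve to true when /health answers with a 2xx status within the timeout.
// Network errors and timeouts are reported as unhealthy rather than thrown.
async function isHealthy(baseUrl: string, timeoutMs = 3000): Promise<boolean> {
  try {
    const response = await fetch(healthUrl(baseUrl), {
      signal: AbortSignal.timeout(timeoutMs),
    });
    return response.ok;
  } catch {
    return false;
  }
}

console.log(healthUrl("http://localhost:3010"));
```

Call it as `isHealthy("http://localhost:3010").then(console.log)` from a script or a monitoring job.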

Next steps

Enable optional features as needed, such as Skills, Cipher Memory, Antigravity usage tracking, and Claude usage tracking. Configure your environment, start the gateway, and begin routing tool calls through MCP Gateway for optimized, token-efficient tool usage.

Available tools

gateway_list_tool_names

Get all tool names with pagination

gateway_search_tools

Search by name, description, or backend with filtering

gateway_get_tool_schema

Lazy-load a specific tool schema

gateway_get_tool_schemas

Batch load multiple tool schemas

gateway_get_tool_categories

Get semantic tool categories

gateway_get_tool_tree

Get tools organized by backend

gateway_get_tool_stats

Get statistics about tools by backend

gateway_execute_code

Execute TypeScript/JavaScript in sandbox (code execution)

gateway_call_tool_filtered

Call a tool with result filtering (smart filtering)

gateway_call_tool_aggregate

Call a tool with aggregation for analytics

gateway_call_tools_parallel

Execute multiple tools in parallel

gateway_list_skills

List saved skills

gateway_search_skills

Search skills by name or tags

gateway_get_skill

Get skill details and code

gateway_execute_skill

Execute a saved skill with inputs

gateway_create_skill

Create a new reusable skill

gateway_get_optimization_stats

View token savings statistics

gateway_call_tool_delta

Call tool with delta responses for repeated queries

gateway_get_context_status

Monitor context window usage and warnings

gateway_call_tool_summarized

Call tool with auto-summarization of large results

gateway_analyze_code

Analyze code for optimization opportunities