"primitive" RAG-like web search model context protocol (MCP) server that runs locally. ✨ no APIs ✨
Configuration
{
"mcpServers": {
"nkapila6-mcp-local-rag": {
"command": "uvx",
"args": [
"--python=3.10",
"--from",
"git+https://github.com/nkapila6/mcp-local-rag",
"mcp-local-rag"
],
"env": {
"DOCKER_CONTAINER": "true"
}
}
}
}

The mcp-local-rag server runs entirely locally, providing multi-engine, zero-API-key web search and context extraction for large language models. It queries several search backends, ranks results by semantic relevance, and returns curated context you can use to augment model outputs without external dependencies.
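The configuration above follows the standard `mcpServers` JSON layout. As a minimal illustration, a client-side sanity check on such an entry might look like the sketch below (the `validate_mcp_config` helper is hypothetical, not part of mcp-local-rag or any MCP client):

```python
import json

# Hypothetical validator for an "mcpServers" config; not part of mcp-local-rag.
def validate_mcp_config(raw: str) -> list[str]:
    """Return the names of server entries that define a launch command."""
    config = json.loads(raw)
    servers = config.get("mcpServers", {})
    valid = []
    for name, entry in servers.items():
        # Every launchable entry needs a command string and (optionally) an args list.
        if isinstance(entry.get("command"), str) and isinstance(entry.get("args", []), list):
            valid.append(name)
    return valid

example = """
{
  "mcpServers": {
    "mcp-local-rag": {
      "command": "uvx",
      "args": ["--python=3.10", "--from",
               "git+https://github.com/nkapila6/mcp-local-rag", "mcp-local-rag"]
    }
  }
}
"""
print(validate_mcp_config(example))  # → ['mcp-local-rag']
```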
To use this MCP server with an MCP client, ensure your client supports tool calling and is configured to communicate with MCP servers. Start the local server through your preferred method, then initiate a query from your language model. The server searches multiple backends, computes embeddings, ranks results by relevance, extracts context from the top results, and returns Markdown-formatted content that your model can incorporate into its final answer.
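The ranking step can be illustrated with a toy example. The sketch below uses plain bag-of-words vectors and cosine similarity as a stand-in for whatever embedding model the server actually uses; the function names are illustrative, not the server's API:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_snippets(query: str, snippets: list[str]) -> list[str]:
    """Order fetched snippets by similarity to the query, best first."""
    q = Counter(query.lower().split())
    return sorted(snippets,
                  key=lambda s: cosine(q, Counter(s.lower().split())),
                  reverse=True)

results = rank_snippets(
    "local rag server",
    ["weather today", "running a local RAG server with MCP", "cooking recipes"],
)
print(results[0])  # the MCP/RAG snippet ranks first
```

A real embedding model captures semantic similarity rather than exact word overlap, but the rank-by-similarity structure is the same.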
Prerequisites: you need either Docker, or uvx with Python 3.10 available, depending on which launch method you choose.

To run directly with uvx:
{
"mcpServers": {
"mcp-local-rag":{
"command": "uvx",
"args": [
"--python=3.10",
"--from",
"git+https://github.com/nkapila6/mcp-local-rag",
"mcp-local-rag"
]
}
}
}

To run in Docker instead:

{
"mcpServers": {
"mcp-local-rag": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"--init",
"-e",
"DOCKER_CONTAINER=true",
"ghcr.io/nkapila6/mcp-local-rag:v1.0.2"
]
}
}
}

Security audits are performed on MCP servers to help you verify safety and compliance. You can review the audit results for this server to understand any potential vulnerabilities or exposure points.
MCP clients that support tool calling, such as Claude Desktop, Cursor, Goose, and others, should work with this server. Ensure your client is configured to send tool calls to the MCP endpoint you start in your environment.
Examples on Claude Desktop demonstrate how a model can request real-time web information, trigger mcp-local-rag, and receive a sourced, context-rich response. The server fetches live results, extracts context, and returns it to the model for final composition.
Contributing and experimentation are welcome. This project uses the MIT license and encourages improvements, issue reporting, and pull requests to enhance functionality.
Tools and skills:
- Comprehensive multi-engine research across multiple backends to gather diverse perspectives on a topic.
- Google-focused deep dive that leverages Google's index for technical or scientific queries.
- Privacy-focused deep research using DuckDuckGo for broad, private results.
- Quick, single searches using DuckDuckGo for fast answers.
- Fast, single searches using Google for rapid information retrieval.
- An Agent Skill that guides Claude to apply best practices for multi-engine research and privacy-aware querying.
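These modes trade off breadth, speed, and privacy. A hypothetical dispatcher sketching that routing is shown below; the mode names and engine lists are assumptions for illustration, not the server's real tool interface:

```python
# Illustrative routing table; mode names and engine lists are assumptions,
# not the actual mcp-local-rag tool interface.
MODES = {
    "deep_research": ["duckduckgo", "google"],  # broad multi-engine sweep
    "google_deep":   ["google"],                # Google-focused deep dive
    "private_deep":  ["duckduckgo"],            # privacy-focused research
    "quick":         ["duckduckgo"],            # fast single search
    "quick_google":  ["google"],                # fast Google lookup
}

def engines_for(mode: str) -> list[str]:
    """Resolve a research mode to its search backends, defaulting to quick search."""
    return MODES.get(mode, MODES["quick"])

print(engines_for("deep_research"))  # → ['duckduckgo', 'google']
print(engines_for("unknown"))        # falls back to the quick DuckDuckGo search
```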