"primitive" RAG-like web search model context protocol (MCP) server that runs locally. ✨ no APIs ✨
Configuration
{
"mcpServers": {
"nkapila6-mcp-local-rag": {
"command": "uvx",
"args": [
"--python=3.10",
"--from",
"git+https://github.com/nkapila6/mcp-local-rag",
"mcp-local-rag"
],
"env": {
"DOCKER_CONTAINER": "true"
}
}
}
}

The mcp-local-rag server runs entirely locally, providing multi-engine, zero-API-key web search and context extraction for large language models. It queries several search backends, ranks results by semantic relevance, and returns curated context you can use to augment model outputs without external dependencies.
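The configuration above follows the standard `mcpServers` JSON layout. As a minimal illustration, a client-side sanity check on such an entry might look like the sketch below (the `validate_mcp_config` helper is hypothetical, not part of mcp-local-rag or any MCP client):

```python
import json

# Hypothetical validator for an "mcpServers" config; not part of mcp-local-rag.
def validate_mcp_config(raw: str) -> list[str]:
    """Return the names of server entries that define a launch command."""
    config = json.loads(raw)
    servers = config.get("mcpServers", {})
    valid = []
    for name, entry in servers.items():
        # Every launchable entry needs a command string and (optionally) an args list.
        if isinstance(entry.get("command"), str) and isinstance(entry.get("args", []), list):
            valid.append(name)
    return valid

example = """
{
  "mcpServers": {
    "mcp-local-rag": {
      "command": "uvx",
      "args": ["--python=3.10", "--from",
               "git+https://github.com/nkapila6/mcp-local-rag", "mcp-local-rag"]
    }
  }
}
"""
print(validate_mcp_config(example))  # → ['mcp-local-rag']
```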
To use this MCP server with an MCP client, ensure your client supports tool calling and is configured to communicate with MCP servers. Start the local server through your preferred method, then initiate a query from your language model. The server searches multiple backends, computes embeddings, ranks results by relevance, extracts context from the top results, and returns Markdown-formatted content that your model can incorporate into its final answer.
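The ranking step can be illustrated with a toy example. The sketch below uses plain bag-of-words vectors and cosine similarity as a stand-in for whatever embedding model the server actually uses; the function names are illustrative, not the server's API:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_snippets(query: str, snippets: list[str]) -> list[str]:
    """Order fetched snippets by similarity to the query, best first."""
    q = Counter(query.lower().split())
    return sorted(snippets,
                  key=lambda s: cosine(q, Counter(s.lower().split())),
                  reverse=True)

results = rank_snippets(
    "local rag server",
    ["weather today", "running a local RAG server with MCP", "cooking recipes"],
)
print(results[0])  # the MCP/RAG snippet ranks first
```

A real embedding model captures semantic similarity rather than exact word overlap, but the rank-by-similarity structure is the same.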
Prerequisites: you need either Docker, or uvx with Python 3.10 available, depending on which launch method you choose.

To run directly with uvx:
{
"mcpServers": {
"mcp-local-rag":{
"command": "uvx",
"args": [
"--python=3.10",
"--from",
"git+https://github.com/nkapila6/mcp-local-rag",
"mcp-local-rag"
]
}
}
}

To run in Docker instead:

{
"mcpServers": {
"mcp-local-rag": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"--init",
"-e",
"DOCKER_CONTAINER=true",
"ghcr.io/nkapila6/mcp-local-rag:v1.0.2"
]
}
}
}

Security audits are performed on MCP servers to help you verify safety and compliance. You can review the audit results for this server to understand any potential vulnerabilities or exposure points.
MCP clients that support tool calling, such as Claude Desktop, Cursor, Goose, and others, should work with this server. Ensure your client is configured to send tool calls to the MCP endpoint you start in your environment.
Examples on Claude Desktop demonstrate how a model can request real-time web information, trigger mcp-local-rag, and receive a sourced, context-rich response. The server fetches live results, extracts context, and returns it to the model for final composition.
Contributing and experimentation are welcome. This project uses the MIT license and encourages improvements, issue reporting, and pull requests to enhance functionality.
Tools and skills:
- Comprehensive multi-engine research across multiple backends to gather diverse perspectives on a topic.
- Google-focused deep dive that leverages Google's index for technical or scientific queries.
- Privacy-focused deep research using DuckDuckGo for broad, private results.
- Quick, single searches using DuckDuckGo for fast answers.
- Fast, single searches using Google for rapid information retrieval.
- An Agent Skill that guides Claude to apply best practices for multi-engine research and privacy-aware querying.
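These modes trade off breadth, speed, and privacy. A hypothetical dispatcher sketching that routing is shown below; the mode names and engine lists are assumptions for illustration, not the server's real tool interface:

```python
# Illustrative routing table; mode names and engine lists are assumptions,
# not the actual mcp-local-rag tool interface.
MODES = {
    "deep_research": ["duckduckgo", "google"],  # broad multi-engine sweep
    "google_deep":   ["google"],                # Google-focused deep dive
    "private_deep":  ["duckduckgo"],            # privacy-focused research
    "quick":         ["duckduckgo"],            # fast single search
    "quick_google":  ["google"],                # fast Google lookup
}

def engines_for(mode: str) -> list[str]:
    """Resolve a research mode to its search backends, defaulting to quick search."""
    return MODES.get(mode, MODES["quick"])

print(engines_for("deep_research"))  # → ['duckduckgo', 'google']
print(engines_for("unknown"))        # falls back to the quick DuckDuckGo search
```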