Attach an external knowledge base via MCP
Configuration
{
  "mcpServers": {
    "kalicyh-mcp-rag": {
      "url": "http://127.0.0.1:8060/mcp"
    }
  }
}
You have a low-latency Retrieval-Augmented Generation (RAG) service built on the MCP protocol. It provides fast local knowledge retrieval, supports raw or summarized retrieval modes, and integrates with local or remote LLM providers for smart summarization and context handling. This server is designed for modular expansion, asynchronous optimization, and easy management of data sources and queries through a unified MCP interface.
You work with an MCP client to fetch knowledge, perform queries, and receive results from the MCP-RAG service. Start the service, connect via the MCP endpoint, and use the web-based configuration and document management pages to upload data, customize settings, and run queries.
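To make "connect via the MCP endpoint" more concrete, here is a minimal Python sketch of the first message a client might send. It only builds the HTTP request and does not send it. Several things are assumptions rather than facts from this page: that the server speaks MCP's Streamable HTTP transport at this URL, the protocol version string, and the client name `example-client`. In practice an MCP client application or SDK handles this handshake for you.

```python
import json
import urllib.request

# Endpoint taken from the client configuration on this page.
MCP_URL = "http://127.0.0.1:8060/mcp"


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC `initialize` request for an MCP server."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: the server accepts this MCP protocol revision.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_initialize_request()
print(request.get_method(), request.full_url)
```

Once the service is running, passing `request` to `urllib.request.urlopen` would send it; the shape of the response depends on the server and is not shown here.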
Make sure the following prerequisites are in place before you install the service.
# Prerequisites
- Python >= 3.13
- uv package manager
# Basic installation (cloud API only)
uv sync
# Optional: enable local embeddings (e.g., m3e-small, e5-small)
uv sync --extra local-embeddings
MCP-RAG uses a JSON file for persistent configuration. The file stores host, port, vector database settings, embedding/provider configurations, LLM provider details, and feature toggles. You can modify this file through the web configuration page after the server starts.
Key defaults include a local Chroma vector store, Doubao as the embedding provider (note that the sample below selects zhipu in `embedding_provider`, with both providers defined under `provider_configs`), and toggles to enable or disable LLM summaries and caching. The server exposes a management interface for uploading documents and performing queries.
{
  "host": "0.0.0.0",
  "port": 8060,
  "http_port": 8060,
  "debug": false,
  "vector_db_type": "chroma",
  "chroma_persist_directory": "./data/chroma",
  "qdrant_url": "http://localhost:6333",
  "embedding_provider": "zhipu",
  "embedding_device": "cpu",
  "embedding_cache_dir": null,
  "provider_configs": {
    "doubao": {
      "base_url": "https://ark.cn-beijing.volces.com/api/v3",
      "model": "doubao-embedding-text-240715",
      "api_key": null
    },
    "zhipu": {
      "base_url": "https://open.bigmodel.cn/api/paas/v4",
      "model": "embedding-3",
      "api_key": null
    }
  },
  "llm_provider": "doubao",
  "llm_model": "doubao-seed-1.6-250615",
  "llm_base_url": "https://ark.cn-beijing.volces.com/api/v3",
  "llm_api_key": null,
  "enable_llm_summary": false,
  "enable_thinking": true,
  "max_retrieval_results": 5,
  "similarity_threshold": 0.7,
  "enable_reranker": false,
  "enable_cache": false
}
Configure the MCP client to connect to the RAG server using the MCP endpoint. The example shows how to reference the RAG server from an MCP client.
{
  "mcpServers": {
    "rag": {
      "url": "http://127.0.0.1:8060/mcp"
    }
  }
}
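The server configuration and the client entry have to agree on the port, so it can help to derive one from the other. The following Python sketch checks a few fields of a server configuration and builds the matching `mcpServers` entry. The helper names `validate_server_config` and `client_entry` are hypothetical and not part of MCP-RAG. The checks also rest on assumptions: that `port` is the port serving `/mcp`, and that `similarity_threshold` is meant to lie between 0 and 1.

```python
import json


def validate_server_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    port = config.get("port")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append(f"port must be an integer in 1..65535, got {port!r}")
    limit = config.get("max_retrieval_results")
    if not isinstance(limit, int) or limit < 1:
        problems.append(f"max_retrieval_results must be a positive integer, got {limit!r}")
    # Assumption: the threshold is a similarity score in the range 0..1.
    threshold = config.get("similarity_threshold")
    if not isinstance(threshold, (int, float)) or not 0.0 <= threshold <= 1.0:
        problems.append(f"similarity_threshold must be within 0..1, got {threshold!r}")
    return problems


def client_entry(config: dict, name: str = "rag") -> dict:
    """Build an mcpServers entry pointing at the server described by `config`."""
    host = config.get("host", "127.0.0.1")
    # 0.0.0.0 means "listen on all interfaces"; a local client dials loopback instead.
    if host == "0.0.0.0":
        host = "127.0.0.1"
    # Assumption: the /mcp endpoint is served on `port`.
    return {"mcpServers": {name: {"url": f"http://{host}:{config['port']}/mcp"}}}


# A subset of the sample server configuration shown on this page.
sample = json.loads(
    '{"host": "0.0.0.0", "port": 8060, '
    '"max_retrieval_results": 5, "similarity_threshold": 0.7}'
)
print(validate_server_config(sample))
print(json.dumps(client_entry(sample), indent=2))
```

For the sample values this yields the same URL as the client configuration above, `http://127.0.0.1:8060/mcp`. If you change the port on the web configuration page, the client entry needs the same change.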