py-mcp-qdrant-rag MCP server
Provides an end-to-end RAG system with Qdrant-backed semantic search and support for Ollama or OpenAI embeddings.
Configuration
{
  "mcpServers": {
    "amornpan-py-mcp-qdrant-rag": {
      "command": "/path/to/conda/envs/mcp-rag-qdrant-1.0/bin/python",
      "args": [
        "/path/to/py-mcp-qdrant-rag/run.py",
        "--mode",
        "mcp"
      ],
      "env": {
        "OLLAMA_URL": "http://localhost:11434",
        "QDRANT_URL": "http://localhost:6333",
        "OPENAI_API_KEY": "sk-your-openai-api-key",
        "EMBEDDING_PROVIDER": "ollama"
      }
    }
  }
}

This MCP server enables Retrieval-Augmented Generation (RAG) over a Qdrant vector store with flexible embeddings. You index documents, search them semantically, and query your knowledge base in natural language through an MCP client such as Claude Desktop. It supports local Ollama embeddings or OpenAI embeddings, and it can ingest a variety of document formats and web content for fast, contextual answers.
You run the MCP server locally and connect to it from your MCP client. Start by configuring the MCP server with the embedding provider you prefer, then index your sources (web pages or local documents). Use natural language queries to retrieve relevant document chunks and get concise, contextual answers from your vector store.
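Under the hood, the retrieval step is nearest-neighbor ranking over embedding vectors, which Qdrant scores with metrics such as cosine similarity. A toy sketch of that scoring, purely for illustration (the server delegates this work to Qdrant; the function names here are not its API):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_chunks(query_vec, chunk_vecs):
    """Return (index, score) pairs sorted by similarity to the query, best first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy 3-dimensional "embeddings" -- real embedding models emit hundreds of dimensions.
chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
best_index = rank_chunks([1.0, 0.0, 0.0], chunks)[0][0]
print(best_index)  # 0 -- the chunk pointing in the same direction as the query
```

The chunks whose vectors score highest are returned as context for the answer.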
# Prerequisites
# - Python 3.11+
# - Conda (Miniconda or Anaconda)
# - Qdrant vector database (docker or cloud)
# - Ollama for local embeddings OR OpenAI API key
# - Claude Desktop (client) installed
# 1. Clone the MCP server repo
git clone https://github.com/amornpan/py-mcp-qdrant-rag.git
cd py-mcp-qdrant-rag
# 2. Create and activate a Conda environment
conda create -n mcp-rag-qdrant-1.0 python=3.11
conda activate mcp-rag-qdrant-1.0
# 3. Install Python client for Ollama (if using Ollama)
pip install ollama
# 4. Pull Ollama embedding model (local)
ollama pull nomic-embed-text
# 5. Start Qdrant (examples)
# Using Docker (in one terminal):
# docker run -p 6333:6333 -v $(pwd)/qdrant_storage:/qdrant/storage qdrant/qdrant
# 6. Run the MCP server in MCP mode
python run.py --mode mcp

Configuration and runtime details are provided below. You can run the MCP server in two common ways: by pointing your MCP client at the Conda environment's Python and using Ollama for embeddings, or by switching to OpenAI embeddings with your API key. The following sections include concrete examples you can copy and adapt to your environment.
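The EMBEDDING_PROVIDER environment variable is what switches the server between the two modes. A simplified, hypothetical sketch of that kind of switch (the function name and return shape are illustrative, not the server's actual internals):

```python
import os

def resolve_provider() -> dict:
    """Pick embedding settings from the environment, mirroring the config below."""
    provider = os.environ.get("EMBEDDING_PROVIDER", "ollama").lower()
    if provider == "ollama":
        return {"provider": "ollama",
                "url": os.environ.get("OLLAMA_URL", "http://localhost:11434")}
    if provider == "openai":
        key = os.environ.get("OPENAI_API_KEY")
        if not key:
            raise RuntimeError("EMBEDDING_PROVIDER=openai requires OPENAI_API_KEY")
        return {"provider": "openai", "api_key": key}
    raise ValueError(f"unknown EMBEDDING_PROVIDER: {provider}")

os.environ["EMBEDDING_PROVIDER"] = "ollama"
print(resolve_provider()["provider"])  # ollama
```

Whichever provider you choose, only the `env` block of the client configuration changes; the command and arguments stay the same.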
{
  "mcpServers": {
    "mcp-rag-qdrant-1.0-ollama": {
      "type": "stdio",
      "command": "/path/to/conda/envs/mcp-rag-qdrant-1.0/bin/python",
      "args": [
        "/path/to/py-mcp-qdrant-rag/run.py",
        "--mode",
        "mcp"
      ],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "EMBEDDING_PROVIDER": "ollama",
        "OLLAMA_URL": "http://localhost:11434"
      }
    },
    "mcp-rag-qdrant-1.0-openai": {
      "type": "stdio",
      "command": "/path/to/conda/envs/mcp-rag-qdrant-1.0/bin/python",
      "args": [
        "/path/to/py-mcp-qdrant-rag/run.py",
        "--mode",
        "mcp"
      ],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "sk-your-openai-api-key-here"
      }
    }
  }
}

Keep API keys and private URLs out of public repositories and shared configs. Ensure Qdrant and the embedding services are reachable only from trusted networks. Store API keys securely and rotate them as needed.
If you encounter issues, verify that the Python runtime in your Conda environment is correctly referenced, and confirm that the Qdrant service is running at the URL specified in QDRANT_URL. Check Ollama or OpenAI connectivity depending on the chosen embedding provider. Restart Claude Desktop after any configuration changes.
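A quick way to rule out connectivity problems is to probe each dependency before digging into the server itself. A small sketch using only the standard library (the URLs match the defaults used in the configurations above):

```python
import urllib.error
import urllib.request

def service_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP service answers at all, regardless of status code."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # server responded, just with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, DNS failure, timeout, ...

# Probe the services the MCP server depends on:
for name, url in [("Qdrant", "http://localhost:6333"),
                  ("Ollama", "http://localhost:11434")]:
    print(name, "up" if service_up(url) else "DOWN")
```

If Qdrant reports DOWN, check your Docker container or cloud instance; if Ollama reports DOWN and you selected the ollama provider, start the Ollama daemon before retrying.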
The MCP server processes documents from various formats, supports web scraping, and stores embeddings in Qdrant for fast semantic searching. You can index local directories or individual URLs, and then query the knowledge base using natural language prompts.
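Document processing of this kind typically means splitting each source into overlapping chunks before embedding and storing them. A simplified sketch of that step (the chunk size and overlap here are illustrative, not the server's actual defaults):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap so context spans boundaries."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

doc = "x" * 500
pieces = chunk_text(doc, size=200, overlap=50)
print(len(pieces))  # 4
```

Each chunk is then embedded (via Ollama or OpenAI) and upserted into Qdrant, so a query can match a relevant passage even when the full document is long.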
The MCP server exposes tools to add documentation from URLs, add local directories, search the documentation, and list sources. Together they enable end-to-end indexing and retrieval workflows:

- Add documentation from a web URL to the vector database.
- Recursively add all supported files from a directory.
- Search through stored documentation using semantic similarity.
- List all documentation sources in the database.