Provides web search, crawling, and Retrieval Augmented Generation for AI agents and coding assistants.
Configuration
{
  "mcpServers": {
    "ai-enthusiasts-crawl4ai-rag-mcp": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "crawl4aimcp-mcp-1",
        "uv",
        "run",
        "python",
        "src/main.py"
      ],
      "env": {
        "USE_KNOWLEDGE_GRAPH": "true"
      }
    }
  }
}
You deploy a self-contained MCP server that combines Crawl4AI, SearXNG, and Supabase to enable AI agents and coding assistants to search the web, crawl content, store embeddings, and run retrieval-augmented generation workflows. It's designed to be Docker-based with zero Python environment setup and an integrated, production-ready stack for fast, private web intelligence.
You interact with the MCP server through an MCP client to perform search, crawl, and RAG tasks. Start the stack, connect your client, and run workflows that query the built-in SearXNG search, scrape and crawl content, create vector embeddings, and execute RAG queries. You can also leverage advanced RAG strategies such as contextual embeddings, hybrid search, and agentic RAG for code examples.
Typical usage patterns include: initiating a search to discover relevant pages, triggering automated scraping of found URLs, storing content as embeddings, and then requesting RAG-processed results that focus on the most relevant chunks. You can also run a full workflow that starts from a query, gathers URLs, scrapes content, stores it, and returns either semantically organized results or raw Markdown content depending on your needs.
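Under the hood, an MCP client drives these workflows by sending JSON-RPC 2.0 messages to the server over stdio. The sketch below shows what such a tool-call request looks like; the tool name "search" and its arguments are illustrative assumptions, not names confirmed by this page, and in practice your MCP client constructs and sends these messages for you.

```shell
# Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
# Tool name and arguments here are hypothetical, for illustration only.
REQUEST='{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"query":"crawl4ai chunking"}}}'
echo "$REQUEST"

# With the stack running, a request like this could be piped straight into
# the containerized server for a manual smoke test:
#   echo "$REQUEST" | docker exec -i crawl4aimcp-mcp-1 uv run python src/main.py
```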
Prerequisites to run the MCP server locally or in a development environment are Docker and Docker Compose, with Make available for convenience commands. Ensure you have at least 8 GB of RAM for production-style workloads.
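Before starting, a quick preflight check along these lines confirms the prerequisites are in place (a sketch; all commands are standard Docker, Make, and Linux utilities):

```shell
# Verify Docker, the Compose v2 plugin, and Make are installed
docker --version
docker compose version
make --version | head -n1

# Check total RAM against the 8 GB recommendation (Linux)
free -g | awk '/^Mem:/ {print $2 " GB total RAM"}'
```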
1) Clone the project repository and navigate into it.
2) Start the stack in production mode to run all services together.
make prod    # Starts all services in production mode
3) If you are developing, you can start services with hot reloading and debug logging.
make dev     # Starts services with hot reloading and debug logging
This MCP server includes a minimal, explicit example for connecting your MCP client (such as Claude Desktop) to the MCP server running inside a container. The example shows how to invoke the MCP runtime from a Docker container and set an environment variable to enable knowledge graph features.
{
  "mcpServers": {
    "crawl4ai-mcp": {
      "command": "docker",
      "args": [
        "exec", "-i", "crawl4aimcp-mcp-1",
        "uv", "run", "python", "src/main.py"
      ],
      "env": {
        "USE_KNOWLEDGE_GRAPH": "true"
      }
    }
  }
}
The server ships with a rich set of tools for web search, crawling, and RAG processing. Core tools include URL scraping, smart crawling, listing available sources, and performing RAG queries. A comprehensive, integrated search tool performs end-to-end workflows from search to RAG results.
If services fail to start, verify the container state and view logs for each service. You can inspect running containers and examine their logs to diagnose startup issues.
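A diagnosis pass might look like the following (standard Docker Compose commands, run from the project directory; the "mcp" service name is an assumption inferred from the container name used in the config above):

```shell
# Show the state of every service: up, restarting, or exited
docker compose ps

# Recent logs from all services
docker compose logs --tail=100

# Follow logs for a single service (service name assumed to be "mcp")
docker compose logs -f mcp
```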
If the MCP connection does not respond, test the MCP server directly inside the container by invoking the runtime command and checking the container logs for runtime errors.
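Concretely, that direct test could use the same command and container name as the client configuration above:

```shell
# Invoke the MCP runtime directly inside the running container;
# it should start and wait for JSON-RPC input on stdin
docker exec -i crawl4aimcp-mcp-1 uv run python src/main.py

# Check the container logs for runtime errors
docker logs crawl4aimcp-mcp-1 --tail=50
```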
Scrape one or more URLs and store their content in the vector database; supports single URLs and batch processing.
Intelligently crawl a full website based on the type of URL (sitemap, llms-full.txt, or a regular webpage) with recursive traversal.
Retrieve a list of all available content sources (domains) in the database.
Run a semantic search over crawled content with optional source filtering to retrieve relevant results.
Comprehensive web search that connects SearXNG results with automated scraping and RAG processing; returns either RAG-processed results or raw Markdown content.
Search specifically for code examples and summaries from crawled documentation (requires USE_AGENTIC_RAG=true).
Parse a GitHub repository into a Neo4j knowledge graph across multiple languages.
Parse local Git repositories directly without cloning, supporting multi-language codebases.
Parse specific branches of repositories for version-specific analysis.
Perform semantic search across multiple languages to identify similar patterns.
Analyze Python scripts for AI hallucinations by validating imports and usage against the knowledge graph.
Explore and query the Neo4j knowledge graph with commands for repos, classes, methods, and more.
Get information about script analysis setup, available paths, and usage instructions for hallucination detection tools.
Intelligent code search combining Qdrant semantic search with Neo4j structural validation and confidence scoring.
Bridge Neo4j data into Qdrant for searchable code examples with rich metadata.
Dual-validation hallucination detection using Neo4j and Qdrant with merged confidence scores.