The LlamaIndex MCP Server allows you to create local Model Context Protocol (MCP) servers that connect AI models like Claude to your private data sources through retrieval-augmented generation (RAG). This enables AI assistants to access and reference your specific information when answering your questions.
Create a .env file in the root directory with the following variables:
LLAMA_CLOUD_API_KEY=your_llama_cloud_api_key
OPENAI_API_KEY=your_openai_api_key
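The server reads these values with os.getenv, so they need to be loaded into the environment before it starts. A minimal sketch, assuming you use the python-dotenv package to load the .env file:

# Sketch: load the .env file into the process environment (assumes python-dotenv is installed)
from dotenv import load_dotenv
import os

load_dotenv()  # reads .env from the current working directory

assert os.getenv("LLAMA_CLOUD_API_KEY"), "LLAMA_CLOUD_API_KEY is not set"
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"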
The server uses a simple Python script with the following structure:
from mcp.server.fastmcp import FastMCP
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex
import os

mcp = FastMCP('llama-index-server')

@mcp.tool()
def llama_index_documentation(query: str) -> str:
    """Search the llama-index documentation for the given query."""
    # Initialize your LlamaCloud index
    index = LlamaCloudIndex(
        name="your-index-name",
        project_name="Your Project",
        organization_id="your-org-id",
        api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
    )
    # Execute the query
    response = index.as_query_engine().query(query + " Be verbose and include code examples.")
    return str(response)

if __name__ == "__main__":
    mcp.run(transport="stdio")
To connect the server to Claude Desktop, go to Claude → Settings → Developer → Edit Config and add an entry for your server:
{
    "mcpServers": {
        "llama_index_docs_server": {
            "command": "poetry",
            "args": [
                "--directory",
                "/path/to/your/llamacloud-mcp",
                "run",
                "python",
                "/path/to/your/llamacloud-mcp/mcp-server.py"
            ]
        }
    }
}
For HTTP-based MCP access, you'll need to use a slightly modified server:
from mcp.server.fastmcp import FastMCP
import asyncio

mcp = FastMCP('llama-index-server', port=8000)

# Add your tools with @mcp.tool() decorator here

if __name__ == "__main__":
    asyncio.run(mcp.run_sse_async())
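Putting this together with the documentation tool from the stdio example above, a complete HTTP version of the server could look like the following sketch (the tool body is unchanged; only the transport differs):

from mcp.server.fastmcp import FastMCP
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex
import asyncio
import os

mcp = FastMCP('llama-index-server', port=8000)

@mcp.tool()
def llama_index_documentation(query: str) -> str:
    """Search the llama-index documentation for the given query."""
    index = LlamaCloudIndex(
        name="your-index-name",
        project_name="Your Project",
        organization_id="your-org-id",
        api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
    )
    response = index.as_query_engine().query(query + " Be verbose and include code examples.")
    return str(response)

if __name__ == "__main__":
    # Serve over SSE so HTTP clients can connect on port 8000
    asyncio.run(mcp.run_sse_async())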
Once the HTTP server is running, you can connect to it from a LlamaIndex agent using the MCP tool spec:

from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
import asyncio

# Connect to your MCP server
mcp_client = BasicMCPClient("http://localhost:8000/sse")
mcp_tool_spec = McpToolSpec(
    client=mcp_client,
    # Optional: Filter tools by name
    # allowed_tools=["tool1", "tool2"],
)

# Get available tools
tools = mcp_tool_spec.to_tool_list()

# Create an agent with these tools
llm = OpenAI(model="gpt-4o-mini")
agent = FunctionAgent(
    tools=tools,
    llm=llm,
    system_prompt="You are an agent that knows how to build agents in LlamaIndex.",
)

# Run a query through the agent
async def run_agent():
    response = await agent.run("How do I instantiate an agent in LlamaIndex?")
    print(response)

if __name__ == "__main__":
    asyncio.run(run_agent())
You can add multiple tools to your MCP server by creating additional decorated functions:
@mcp.tool()
def search_documentation(query: str) -> str:
    """Search the documentation for information."""
    # Implementation here

@mcp.tool()
def generate_code_example(description: str) -> str:
    """Generate a code example based on the description."""
    # Implementation here
Each tool will be available to the client (Claude Desktop or your custom application) as a separate function that can be invoked.
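On the client side, each tool surfaces as its own LlamaIndex tool with a name and description. A small sketch, assuming the HTTP server above is running on port 8000, that lists the exposed tools:

from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

# Connect to the running MCP server and list the tools it exposes
mcp_client = BasicMCPClient("http://localhost:8000/sse")
tools = McpToolSpec(client=mcp_client).to_tool_list()

for tool in tools:
    # Each MCP tool is wrapped as a LlamaIndex tool with its own name and description
    print(tool.metadata.name, "-", tool.metadata.description)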
There are two ways to add an MCP server to Cursor. The most common way is to add the server globally in the ~/.cursor/mcp.json file so that it is available in all of your projects. If you only need the server in a single project, you can instead add it to a .cursor/mcp.json file inside that project.

To add a global MCP server, go to Cursor Settings > MCP and click "Add new global MCP server". This opens the ~/.cursor/mcp.json file, where you can add your server like this:
{
    "mcpServers": {
        "cursor-rules-mcp": {
            "command": "npx",
            "args": [
                "-y",
                "cursor-rules-mcp"
            ]
        }
    }
}
To add an MCP server to a single project, create a new .cursor/mcp.json file in that project or add the entry to an existing one. The format is exactly the same as the global MCP server example above.
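For example, a project-level .cursor/mcp.json that registers the LlamaCloud documentation server from this guide could look like the following sketch (the server name and paths are placeholders to adjust):

{
    "mcpServers": {
        "llama_index_docs_server": {
            "command": "poetry",
            "args": [
                "--directory",
                "/path/to/your/llamacloud-mcp",
                "run",
                "python",
                "/path/to/your/llamacloud-mcp/mcp-server.py"
            ]
        }
    }
}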
Once the server is installed, you might need to head back to Settings > MCP and click the refresh button.
The Cursor agent will then see the tools the added MCP server provides and call them when it needs to.
You can also explicitly ask the agent to use a tool by mentioning its name and describing what it does.