
BerryRAG MCP Server

Local vector DB with Playwright MCP integration to scrape, embed, and query Claude-facing content.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "berrydev-ai-berry-rag": {
      "url": "https://mcp.berrydev.ai/berry-rag"
    }
  }
}

BerryRAG is an MCP-enabled local knowledge base that combines a vector database with Playwright-based web scraping. Claude gains access to up-to-date scraped content processed into embeddings, so you can search, retrieve context, and get precise answers from your own data.
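The core retrieval idea can be illustrated with a minimal sketch: embed documents and a query, then rank by cosine similarity. This uses toy bag-of-words vectors standing in for real embeddings, and the function names are illustrative, not BerryRAG's actual API:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline uses a learned model
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "hooks": "React hooks let function components use state and effects",
    "css": "CSS grid lays out items in rows and columns",
}

def search(query: str, k: int = 1) -> list[str]:
    # Return the top-k document ids ranked by similarity to the query
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(docs[d])), reverse=True)
    return ranked[:k]

print(search("React hooks state"))  # → ['hooks']
```

A production vector DB replaces the bag-of-words step with dense model embeddings and an approximate-nearest-neighbor index, but the query flow is the same.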

How to use

Configure Claude Desktop to connect to the two MCP servers used by BerryRAG. This enables you to drive Playwright scraping through Claude and to operate the local vector database directly from Claude’s workflow.

"mcpServers": {
  "playwright": {
    "command": "npx",
    "args": ["@playwright/mcp@latest"]
  },
  "berry-rag": {
    "command": "node",
    "args": ["mcp_servers/vector_db_server.js"],
    "cwd": "/Users/eberry/BerryDev/berry-rag"
  }
}

Typical workflow with Claude and BerryRAG

1) Scrape target content via Playwright MCP through Claude using your configured servers.
2) Process scraped content into the local vector database.
3) Query and search your knowledge base to retrieve context and answers.

npm run process-scraped
npm run search "React hooks"

Available tools

add_document

Adds content directly to the local vector DB so it can be searched and contextualized.

search_documents

Find similar content in the vector DB to support answering questions.

get_context

Return Claude-formatted context for a given query, suitable for feeding into a chat or assistant.
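Context assembly like this usually concatenates the top-ranked chunks with source attribution under a character budget. A minimal sketch of the idea (the format and budget are assumptions, not BerryRAG's exact output):

```python
def build_context(query: str, chunks: list[tuple[str, str]], max_chars: int = 1000) -> str:
    # Format retrieved (source, text) chunks as an attributed context
    # block suitable for pasting into a chat prompt.
    parts = [f"Context for: {query}"]
    used = 0
    for source, text in chunks:
        entry = f"[{source}]\n{text}"
        if used + len(entry) > max_chars:
            break  # stop once the character budget is spent
        parts.append(entry)
        used += len(entry)
    return "\n\n".join(parts)
```

Keeping the source label next to each chunk lets the assistant cite where an answer came from.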

list_documents

List all stored documents in the vector DB to inspect your knowledge base.

get_stats

Provide vector database statistics for monitoring health and size.

process_scraped_files

Process files scraped by Playwright MCP and embed them into the vector DB.

save_scraped_content

Save scraped material for later processing or auditing.

crawl_content

(BerryExa Server) Extract and crawl web content with subpage support.

extract_links

(BerryExa Server) Retrieve internal links to enable subpage discovery.

get_content_preview

(BerryExa Server) Preview content without full processing to speed up workflows.