ScrapeGraph MCP Server
Provides enterprise-grade MCP web scraping with markdown outputs, AI-driven extraction, multi-page crawling, and agentic workflows.
Configuration
{
"mcpServers": {
"scrapegraph_mcp": {
"command": "npx",
"args": [
"-y",
"@smithery/cli@latest",
"run",
"@ScrapeGraphAI/scrapegraph-mcp",
"--config",
"\"{\\\"scrapegraphApiKey\\\":\\\"YOUR-SGAI-API-KEY\\\"}\""
],
"env": {
"SGAI_API_KEY": "your-api-key"
}
}
}
}
You can run the ScrapeGraph MCP Server to let language models perform AI-powered web scraping reliably. The server exposes dedicated tools for converting pages to markdown, extracting structured data, crawling multi-page sites, and coordinating agentic scraping workflows, all through a standard MCP client.
Connect to the ScrapeGraph MCP Server using an MCP client such as Claude Desktop or Cursor. Once connected, you can request actions like converting a page to markdown, extracting product data from a page, performing multi-page crawls, or running complex agentic scrapes. You interact with the server through natural language prompts, and the MCP client handles sending your requests to the server and receiving structured results.
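For programmatic access outside a desktop client, the official MCP Python SDK can drive the server over stdio. The following is a minimal sketch rather than documented usage of this server: it assumes the mcp package is installed, that SGAI_API_KEY is exported, and that markdownify accepts a website_url argument (check the schemas returned by list_tools for the actual parameter names).
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the ScrapeGraph MCP server as a local stdio subprocess.
    server = StdioServerParameters(
        command="python",
        args=["-m", "scrapegraph_mcp.server"],
        env={"SGAI_API_KEY": os.environ["SGAI_API_KEY"]},
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Call the markdownify tool; "website_url" is an assumed
            # parameter name -- verify it against the tool schema.
            result = await session.call_tool(
                "markdownify", arguments={"website_url": "https://example.com"}
            )
            print(result.content)

if __name__ == "__main__":
    asyncio.run(main())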
Prerequisites: Python 3.13 or higher and a ScrapeGraph API key.
Option A — Install via Smithery (recommended)
npx -y @smithery/cli install @ScrapeGraphAI/scrapegraph-mcp --client claude
Option B — Run locally from source (development and testing)
# Clone the project
git clone https://github.com/ScrapeGraphAI/scrapegraph-mcp
cd scrapegraph-mcp
# Install dependencies
pip install -e .
# Set your API key
export SGAI_API_KEY=your-api-key
# Run the server
scrapegraph-mcp
# or
python -m scrapegraph_mcp.server
Configure how you connect from your MCP client by providing the API key in your environment or via client configuration. See the configuration examples below for Claude Desktop and local development.
An API key is required to enable all features. Set it in your environment or pass it through your MCP client configuration when starting the server.
Claude Desktop configuration for connecting to a remote Smithery-hosted MCP server: you expose the server under an MCP name and provide the start command for the stdio transport. The example below uses npx to run the MCP server with your API key embedded in the config payload.
{
  "mcpServers": {
    "scrapegraph_mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@smithery/cli@latest",
        "run",
        "@ScrapeGraphAI/scrapegraph-mcp",
        "--config",
        "\"{\\\"scrapegraphApiKey\\\":\\\"YOUR-SGAI-API-KEY\\\"}\""
      ]
    }
  }
}
Local development server configuration for Claude Desktop (example): the server runs as a local stdio service using Python with the module path. This requires Python and the API key in the environment.
{
  "mcpServers": {
    "scrapegraph_mcp_local": {
      "command": "python",
      "args": [
        "-m",
        "scrapegraph_mcp.server"
      ],
      "env": {
        "SGAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Always keep your API key confidential. Use environment variables where possible and avoid embedding keys directly in code or command lines. If your MCP client supports secure storage for credentials, prefer that approach.
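As an illustration of that advice, a small launcher script can read the key from the environment and hand it to the server process instead of embedding it in command-line arguments or a committed config file. This is a hypothetical sketch around the local module entry point shown above.
import os
import subprocess
import sys

# Fail fast if the key is missing rather than falling back to a hard-coded value.
api_key = os.environ.get("SGAI_API_KEY")
if not api_key:
    sys.exit("SGAI_API_KEY is not set; export it before starting the server.")

# Pass the key through the child's environment, not argv, so it does not
# appear in shell history or process listings.
subprocess.run(
    [sys.executable, "-m", "scrapegraph_mcp.server"],
    env={**os.environ, "SGAI_API_KEY": api_key},
    check=True,
)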
If the server does not start, verify that Python and the scrapegraph-mcp package are installed, and check that the API key is set in your environment or in the client configuration.
If tools do not appear in your MCP client, confirm that the server is running without errors and that the client is configured to access the correct MCP server name.
To test locally, run the server and use the MCP Inspector tool to exercise all available tools. Ensure you have a valid API key set in the environment before issuing test prompts.
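If you prefer a scripted check instead of the Inspector UI, the same stdio connection can serve as a quick smoke test: list the advertised tools and confirm the ones named below are present. A minimal sketch, assuming the mcp Python SDK is installed and SGAI_API_KEY is exported.
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Tool names referenced on this page; the server may advertise more.
EXPECTED = {"markdownify", "smartcrawler_initiate", "smartcrawler_fetch_results", "agentic_scrapper"}

async def smoke_test() -> None:
    server = StdioServerParameters(
        command="python",
        args=["-m", "scrapegraph_mcp.server"],
        env={"SGAI_API_KEY": os.environ["SGAI_API_KEY"]},
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            names = {tool.name for tool in (await session.list_tools()).tools}
            print("advertised tools:", sorted(names))
            missing = EXPECTED - names
            if missing:
                raise SystemExit(f"missing expected tools: {sorted(missing)}")

asyncio.run(smoke_test())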
The available tools cover single-page and multi-page work: converting a page to markdown with markdownify, AI-powered extraction of structured data from a single page, agentic scraping workflows with agentic_scrapper, and basic or advanced multi-page crawling with smartcrawler_initiate and smartcrawler_fetch_results.
Transform any webpage into clean, structured markdown format.
AI-powered extraction of structured data from webpages with support for infinite scrolling.
AI-powered web searches with structured results.
Fetch page content with optional heavy JavaScript rendering.
Extract sitemap URLs and structure for a website.
Initiate asynchronous multi-page crawling with optional AI extraction or markdown mode; returns a request_id.
Poll for crawling results using the request_id (the sketch after this list shows the initiate-then-poll pattern).
Run advanced agentic scraping workflows with customizable steps and output schemas.
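The two crawling tools are meant to be used together: smartcrawler_initiate starts the job and returns a request_id, and smartcrawler_fetch_results is polled with that id until the crawl completes. The sketch below illustrates that pattern; the argument names (url, prompt, request_id), the JSON response shape, and the completion check are assumptions, so inspect the tool schemas and the first real response before relying on them.
import asyncio
import json
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def crawl(url: str, prompt: str) -> None:
    server = StdioServerParameters(
        command="python",
        args=["-m", "scrapegraph_mcp.server"],
        env={"SGAI_API_KEY": os.environ["SGAI_API_KEY"]},
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Start the asynchronous crawl. The argument names here are
            # assumptions -- check the schema from list_tools for the real ones.
            started = await session.call_tool(
                "smartcrawler_initiate", arguments={"url": url, "prompt": prompt}
            )
            # Assumption: the response text is JSON carrying the request_id.
            payload = json.loads(started.content[0].text)
            request_id = payload["request_id"]
            # Poll until results are ready; the completion signal depends on
            # the server's response format, so inspect the first response.
            while True:
                result = await session.call_tool(
                    "smartcrawler_fetch_results", arguments={"request_id": request_id}
                )
                text = result.content[0].text
                print(text)
                if "pending" not in text.lower():  # illustrative completion check
                    break
                await asyncio.sleep(5)

asyncio.run(crawl("https://example.com", "Collect the title of each page"))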