home / mcp / webscraping.ai mcp server

WebScraping.AI MCP Server

A Model Context Protocol (MCP) server implementation that integrates with WebScraping.AI for web data extraction capabilities.

Installation
Add the following to your MCP client configuration file.

Configuration

View docs
{
  "mcpServers": {
    "webscraping-ai-webscraping-ai-mcp-server": {
      "command": "npx",
      "args": [
        "-y",
        "webscraping-ai-mcp"
      ],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "YOUR_API_KEY",
        "WEBSCRAPING_AI_CONCURRENCY_LIMIT": "5",
        "WEBSCRAPING_AI_ENABLE_CONTENT_SANDBOXING": "true"
      }
    }
  }
}

You can run and use the WebScraping.AI MCP server to perform web data extraction and related tasks through an MCP client. It lets you ask questions about page content, extract structured data, retrieve rendered HTML, and more, all with configurable proxies, JavaScript rendering, device emulation, and safety boundaries for scraped content.

How to use

You connect your MCP-enabled client to the WebScraping.AI MCP server to perform web scraping tasks. Use the available tools to extract information from web pages, including questions about content, structured data fields, full HTML with JavaScript rendering, plain text, and content from specific page selectors. You can tailor the behavior with options like proxy type, country, device emulation, and whether to run custom on-page JavaScript.

For common workflows, you will specify the target URL and optional parameters such as rendering of JavaScript, wait-for selectors, proxy settings, and timeouts. The tools will return structured results or text content that you can feed into your downstream processes. If you enable content sandboxing, scraped content is wrapped to prevent it from being interpreted as executable instructions by language models.

How to install

Prerequisites: you need Node.js and npm installed on your system. Ensure you have a valid WebScraping.AI API key.

Option A — Run with npx (quick start):

env WEBSCRAPING_AI_API_KEY=your_api_key npx -y webscraping-ai-mcp

Option B — Manual installation (clone and run locally):

# Clone the repository
git clone https://github.com/webscraping-ai/webscraping-ai-mcp-server.git
cd webscraping-ai-mcp-server

# Install dependencies
npm install

# Start the server
npm start

Additional configuration and usage notes

Cursor configuration lets you reuse the MCP server in your projects. You can add a project-specific configuration or a global one so your AI agents can automatically access the WebScraping.AI tools when web scraping tasks appear.

Example project-specific configuration for Cursor includes a dedicated entry named webscraping-ai that runs the MCP server and passes environment variables for the API key, concurrency limit, and content sandboxing. You can adapt this to your team setup.

If you are using Claude Desktop, you can specify the MCP server in your claude_desktop_config.json with the command, arguments, and environment variables shown in the example block.

Security and benchmarking options are available through environment variables to enable content sandboxing, set concurrency, and adjust timeouts. Enable content sandboxing to wrap scraped content and protect against prompt injection when presenting data to language models.

Available tools

webscraping_ai_question

Ask questions about web page content, returning text answers based on the page you provide and optional rendering and wait-for conditions.

webscraping_ai_fields

Extract structured data from a page by providing a fields map that describes how to obtain each data item.

webscraping_ai_html

Retrieve the full HTML of a page after JavaScript rendering, suitable for analysis of the complete DOM.

webscraping_ai_text

Extract the visible text content from a rendered page, including optional JavaScript execution.

webscraping_ai_selected

Extract content from a specific element identified by a CSS selector.

webscraping_ai_selected_multiple

Extract content from multiple elements using an array of CSS selectors.

webscraping_ai_account

Retrieve information about your WebScraping.AI account usage and limits.