
ShadowCrawl MCP Server

Self-hosted MCP server that enables human-in-the-loop (HITL) rendering and advanced anti-bot data extraction from protected sites.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "devshero-shadowcrawl": {
      "command": "docker",
      "args": [
        "compose",
        "-f",
        "/YOUR_PATH/shadowcrawl/docker-compose-local.yml",
        "exec",
        "-i",
        "-T",
        "shadowcrawl",
        "shadowcrawl-mcp"
      ],
      "env": {
        "RUST_LOG": "info",
        "MAX_LINKS": "100",
        "QDRANT_URL": "http://localhost:6344",
        "SEARXNG_URL": "http://localhost:8890",
        "IP_LIST_PATH": "/YOUR_PATH/shadowcrawl/ip.txt",
        "OUTBOUND_LIMIT": "32",
        "BROWSERLESS_URL": "http://localhost:3010",
        "BROWSERLESS_TOKEN": "mcp_stealth_session",
        "HTTP_TIMEOUT_SECS": "30",
        "MAX_CONTENT_CHARS": "10000",
        "PROXY_SOURCE_PATH": "/YOUR_PATH/shadowcrawl/proxy_source.json",
        "HTTP_CONNECT_TIMEOUT_SECS": "10"
      }
    }
  }
}
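
The `/YOUR_PATH` placeholder appears in three values and must point at the directory that contains your `shadowcrawl` checkout. As a minimal sketch, the substitution could be scripted as below. The helper name `fill_placeholder` and the example base directory `/home/alice/src` are assumptions made for illustration, and the template is abbreviated to the path-bearing entries.

```python
import json

PLACEHOLDER = "/YOUR_PATH"


def fill_placeholder(node, base_dir):
    """Recursively replace the /YOUR_PATH placeholder in every string value."""
    if isinstance(node, dict):
        return {key: fill_placeholder(value, base_dir) for key, value in node.items()}
    if isinstance(node, list):
        return [fill_placeholder(value, base_dir) for value in node]
    if isinstance(node, str):
        return node.replace(PLACEHOLDER, base_dir)
    return node


# Abbreviated copy of the configuration: only entries containing a path are kept.
TEMPLATE = {
    "mcpServers": {
        "devshero-shadowcrawl": {
            "command": "docker",
            "args": [
                "compose", "-f",
                "/YOUR_PATH/shadowcrawl/docker-compose-local.yml",
                "exec", "-i", "-T", "shadowcrawl", "shadowcrawl-mcp",
            ],
            "env": {
                "IP_LIST_PATH": "/YOUR_PATH/shadowcrawl/ip.txt",
                "PROXY_SOURCE_PATH": "/YOUR_PATH/shadowcrawl/proxy_source.json",
            },
        }
    }
}

# Hypothetical location of the checkout; substitute your own directory.
config = fill_placeholder(TEMPLATE, "/home/alice/src")
print(json.dumps(config, indent=2))
```

The printed JSON can then be merged into your MCP client configuration file alongside the remaining `env` entries.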

ShadowCrawl MCP is a self-hosted orchestration layer for bypassing modern anti-bot protections and extracting data from guarded sites. It combines local browser-based rendering with human-in-the-loop (HITL) capabilities, so data collection stays high-fidelity while remaining private and under your control. This guide covers how to use the server from MCP clients, how to install it, and configuration notes for reliable operation.

How to use

Connect to ShadowCrawl MCP from your MCP client to run web scraping tasks, with or without human-in-the-loop rendering. Choose the Docker-based MCP server for standard scraping, or a local MCP server to enable HITL workflows such as manual CAPTCHA solving and login flows.

HITL workflows require a local MCP server, because it launches a visible Brave/Chrome browser window and lets the agent operate with your real browser profile. For routine scrapes without GUI rendering, the Docker-based server provides a headless path that still routes through ShadowCrawl’s logic.

How to install

Prerequisites: a system that can run Docker, plus a Rust toolchain for compiling the native server. If you plan to use HITL, install a modern browser (Brave is recommended) and make sure you can grant the accessibility permissions that automation control requires.

1) Set up the Docker route to run ShadowCrawl as a full stack with SearXNG and a proxy manager. This is the quickest way to get started for standard scraping without HITL.

2) Build the native MCP server to enable non_robot_search HITL rendering. This requires compiling the MCP binary with the HITL feature enabled.
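
Once the Docker stack from step 1 is running, it can help to confirm that the services named in the configuration are accepting connections before you point a client at the server. The following is a rough sketch, not part of ShadowCrawl itself: the function names are hypothetical, and it assumes the ports from the `env` block (6344, 8890, 3010) are published on the host.

```python
import socket
from urllib.parse import urlparse

# Service URLs copied from the "env" block of the configuration.
SERVICE_URLS = {
    "QDRANT_URL": "http://localhost:6344",
    "SEARXNG_URL": "http://localhost:8890",
    "BROWSERLESS_URL": "http://localhost:3010",
}


def endpoint(url):
    """Return the (host, port) a URL points at, defaulting the port by scheme."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(endpoint(url), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(urls=SERVICE_URLS):
    """Map each variable name to whether its service accepted a connection."""
    return {name: is_reachable(url) for name, url in urls.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the service behind it is healthy or correctly configured.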

Available tools

non_robot_search

Flagship tool that enables HITL rendering: it launches a visible Brave/Chrome browser so that CAPTCHAs and login walls can be solved manually while the agent continues scraping.

fetch_web_high_fidelity

High-fidelity rendering workflow for the most heavily protected targets, with HITL support.

ShadowCrawl MCP

The self-hosted MCP server that coordinates rendering, data extraction, and protection bypass.

SearXNG

Federated search engine used to supplement discovery during crawling.

Qdrant

Semantic memory store used for long-term recall of gathered data.
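
MCP clients invoke tools such as non_robot_search by sending a JSON-RPC 2.0 `tools/call` request, which over the stdio transport is written as a single line of JSON. The sketch below builds such a message. The `url` argument is a hypothetical placeholder, since this page does not document the tool's input schema; a client would normally discover the real schema with `tools/list` after completing the `initialize` handshake.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Build one newline-terminated JSON-RPC 2.0 message for MCP's tools/call."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, so the message stays on one line.
    return json.dumps(message) + "\n"


# Hypothetical argument name: the tool's actual input schema is not given here.
line = tools_call_request(1, "non_robot_search", {"url": "https://example.com"})
print(line, end="")
```

In practice an MCP client library handles this framing for you; the sketch only illustrates what travels over the connection.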