Browser MCP Server
Provides a browser automation MCP server that lets AI assistants control a real browser for navigation, form interaction, data extraction, and more.
Configuration
{
  "mcpServers": {
    "saik0s-mcp-browser-use": {
      "url": "http://localhost:8383/mcp",
      "headers": {
        "GEMINI_API_KEY": "YOUR_API_KEY",
        "OPENAI_API_KEY": "openai-key",
        "ANTHROPIC_API_KEY": "anthropic-key"
      }
    }
  }
}
You run a programmable browser automation server that lets AI agents control a real browser via the Model Context Protocol. It supports multiple LLM providers, a dedicated deep-research tool, and a CLI for testing. You configure it with environment variables, connect to a live browser if you want, and run tasks that automate navigation, form filling, and data extraction in a repeatable, scriptable way.
To use this MCP server, you start the server with a command that launches the browser automation backend and then connect your MCP client (such as an AI agent or a testing script) to it. You can run two core capabilities: (1) automate a browser task end-to-end using natural language instructions, and (2) perform deep web research and generate a structured report. Your environment variables define which LLMs you use, how the browser runs, and where artifacts are stored. When you issue tasks through your MCP client, the server handles navigation, form interactions, and data extraction, returning results or detailed errors.
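As a rough illustration of what "issuing a task through your MCP client" looks like on the wire, the sketch below builds a JSON-RPC 2.0 request for the MCP tools/call method. The envelope follows the MCP specification; the tool name run_browser_agent and the task argument key are assumptions made for illustration and are not confirmed by this page, so check the server's actual tool list before relying on them.

```python
import json
from itertools import count

# JSON-RPC ids only need to be unique within one client session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Return a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument key, used only for illustration.
request = build_tool_call(
    "run_browser_agent",
    {"task": "Open example.com and return the page title"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this envelope for you; the sketch only shows the shape of the request.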
Examples of practical usage patterns include: composing a sequence of browser actions to gather information from multiple pages, running a research task that compiles a multi-page report, and optionally keeping a browser session alive across multiple MCP calls for faster interactions. If you choose to connect to your own browser via CDP, you can reuse an already-running Chrome/Chromium instance that you launch yourself.
Prerequisites: a Python runtime, a modern Node-compatible toolchain, and a supported browser automation runtime (the Playwright browsers installed in Step 2). You will also need network access to reach your LLM provider and an API key for each provider you plan to use.
Step 1: Install uv, the fast Python package and project manager from Astral. It also provides the uvx command used in the steps below. Run:
curl -LsSf https://astral.sh/uv/install.sh | sh
Step 2: Install Playwright browsers, which are required for browser automation. Run:
uvx --from mcp-server-browser-use@latest python -m playwright install
Configure how the MCP server runs using environment variables. The key sections define the LLM provider, browser settings, agent tooling behavior, and research tooling. You will often place these settings in a .env file and load them when starting the server.
{
  "mcpServers": {
    "browser_use": {
      "command": "uvx",
      "args": ["mcp-server-browser-use@latest"],
      "env": {
        "MCP_LLM_OPENROUTER_API_KEY": "YOUR_KEY_HERE_IF_USING_OPENROUTER",
        "MCP_LLM_PROVIDER": "openrouter",
        "MCP_LLM_MODEL_NAME": "anthropic/claude-3.5-haiku",
        "MCP_BROWSER_HEADLESS": "false",
        "MCP_BROWSER_USE_OWN_BROWSER": "false",
        "MCP_BROWSER_WINDOW_WIDTH": "1440",
        "MCP_BROWSER_WINDOW_HEIGHT": "1080",
        "MCP_AGENT_TOOL_HISTORY_PATH": "/path/to/your/history",
        "MCP_RESEARCH_TOOL_SAVE_DIR": "/path/to/your/research",
        "MCP_RESEARCH_TOOL_MAX_PARALLEL_BROWSERS": "5",
        "MCP_PATHS_DOWNLOADS": "/path/to/your/downloads",
        "MCP_SERVER_LOGGING_LEVEL": "DEBUG",
        "MCP_SERVER_LOG_FILE": "/path/to/your/log/mcp_server_browser_use.log",
        "MCP_LLM_GOOGLE_API_KEY": "YOUR_GOOGLE_API_KEY_IF_USING_GOOGLE",
        "MCP_LLM_OPENAI_API_KEY": "YOUR_OPENAI_API_KEY_IF_USING_OPENAI",
        "MCP_LLM_AZURE_OPENAI_API_KEY": "YOUR_AZURE_OPENAI_API_KEY",
        "MCP_LLM_AZURE_OPENAI_ENDPOINT": "https://your-azure-endpoint/"
      }
    }
  }
}
You can attach the MCP server to a Chrome/Chromium browser you launch yourself with remote debugging enabled. Start Chrome with a remote debugging port, then point the server to that port via its CDP URL.
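Steps to connect your own browser:
Step 1: Launch Chrome or Chromium with a remote debugging port enabled.
Step 2: Set MCP_BROWSER_USE_OWN_BROWSER to "true" in the server environment.
Step 3: Point the server at that port through its CDP URL setting, then start the server.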
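The connection flow can be sketched in Python as follows. This is a minimal sketch under stated assumptions: --remote-debugging-port and --user-data-dir are standard Chrome flags and /json/version is Chrome's standard DevTools HTTP endpoint, but the variable name MCP_BROWSER_CDP_URL is an assumption (only MCP_BROWSER_USE_OWN_BROWSER appears in the configuration on this page), and the binary name, port, and profile path are placeholders.

```python
import http.client
import json
import urllib.request


def chrome_launch_args(binary: str, port: int, profile_dir: str) -> list[str]:
    """Command line that starts Chrome with remote debugging enabled."""
    return [
        binary,
        f"--remote-debugging-port={port}",
        # A dedicated profile avoids clashing with an already-running browser.
        f"--user-data-dir={profile_dir}",
    ]


def server_env_for_cdp(port: int, host: str = "localhost") -> dict[str, str]:
    """Environment overrides that tell the server to attach over CDP."""
    return {
        "MCP_BROWSER_USE_OWN_BROWSER": "true",
        # Assumed variable name; confirm it against the server documentation.
        "MCP_BROWSER_CDP_URL": f"http://{host}:{port}",
    }


def cdp_is_reachable(cdp_url: str, timeout: float = 2.0) -> bool:
    """Return True if a DevTools endpoint answers at cdp_url."""
    try:
        with urllib.request.urlopen(f"{cdp_url}/json/version", timeout=timeout) as resp:
            return "webSocketDebuggerUrl" in json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        return False


# Placeholder binary name, port, and profile path.
print(" ".join(chrome_launch_args("google-chrome", 9222, "/tmp/mcp-chrome-profile")))
print(server_env_for_cdp(9222))
```

Calling cdp_is_reachable("http://localhost:9222") before starting the server is a quick way to confirm that the port matches and the browser is accessible.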
A command-line interface lets you test core capabilities directly from your terminal. You can run a browser task or a deep research task without building a full client. Use the CLI to iterate quickly during development and scripting.
If the server fails to start due to a missing setting, ensure all mandatory environment variables are defined in your environment or loaded from your .env file. If you are not using CDP, verify there are no conflicting browser instances using the same data directory. For connectivity issues with CDP, confirm the port matches and that the browser is accessible. Validate API keys and endpoints for all LLM providers. If vision features are enabled, ensure your LLM supports vision.
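A small preflight check can catch missing settings before the server starts. The sketch below parses simple KEY=VALUE lines from a .env file and reports which provider credentials are absent. The variable names come from the configuration shown on this page; the provider identifiers other than openrouter, and the mapping of which variables each provider needs, are assumptions for illustration.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Illustrative mapping only: which variables each provider needs is assumed.
REQUIRED_BY_PROVIDER = {
    "openrouter": ["MCP_LLM_OPENROUTER_API_KEY"],
    "google": ["MCP_LLM_GOOGLE_API_KEY"],
    "openai": ["MCP_LLM_OPENAI_API_KEY"],
    "azure_openai": [
        "MCP_LLM_AZURE_OPENAI_API_KEY",
        "MCP_LLM_AZURE_OPENAI_ENDPOINT",
    ],
}


def missing_settings(settings: dict[str, str]) -> list[str]:
    """List the variables that are unset or empty for the chosen provider."""
    provider = settings.get("MCP_LLM_PROVIDER", "")
    if not provider:
        return ["MCP_LLM_PROVIDER"]
    required = REQUIRED_BY_PROVIDER.get(provider, [])
    return [name for name in required if not settings.get(name)]


example_env = """
# sample .env contents
MCP_LLM_PROVIDER=openrouter
MCP_LLM_MODEL_NAME=anthropic/claude-3.5-haiku
"""
print(missing_settings(parse_dotenv(example_env)))
```

The parser is deliberately minimal (no multi-line values or variable expansion); a dedicated dotenv library is a better fit for anything beyond a quick check.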
Development and testing guidance covers syncing dependencies, installing browser automation runtimes, and using the inspector to connect to a running browser. You can run a local development flow that starts the MCP server with a specific dev path and then run browser tasks or deep research from the CLI.
Execute browser automation tasks by describing the goal in plain language and letting the agent perform navigation, interaction, and data extraction.
Perform multi-source web searches and synthesize findings into a markdown report.
List learned skills stored in the skills directory.
Get the full definition of a specific skill by name.
Delete a learned skill by name.
Check server health and list running tasks.
Query recent tasks with optional filters.
Retrieve full details for a specific task.
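The capabilities above are described without their tool identifiers, so a client would typically discover the real names at runtime through the MCP tools/list method. The sketch below builds that request and summarizes a response. The request and response shapes follow the MCP specification; the sample response is hand-written and its tool names are placeholders, not the server's actual identifiers.

```python
def build_tools_list_request(request_id: int = 1) -> dict:
    """Return a JSON-RPC 2.0 request for the MCP `tools/list` method."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def summarize_tools(response: dict) -> list[tuple[str, str]]:
    """Extract (name, description) pairs from a `tools/list` response."""
    tools = response.get("result", {}).get("tools", [])
    return [(tool["name"], tool.get("description", "")) for tool in tools]


# Hand-written sample; the tool names are placeholders for illustration.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {"name": "placeholder_tool_a", "description": "Check server health."},
            {"name": "placeholder_tool_b", "description": "Query recent tasks."},
        ]
    },
}

for name, description in summarize_tools(sample_response):
    print(f"{name}: {description}")
```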