
MCP Selenium Server

Provides an MCP server to automate browser actions via Selenium WebDriver for MCP clients.

Installation
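Add the following to your MCP client configuration file. This form requires the mcp-selenium command to be on your PATH, for example after a global npm install.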

{
    "mcpServers": {
        "selenium": {
            "command": "mcp-selenium",
            "args": []
        }
    }
}

You can automate browser interactions through a dedicated MCP server that exposes Selenium WebDriver capabilities to MCP clients. This server lets you start browser sessions, navigate pages, locate elements, interact with the page, take screenshots, and handle file uploads in a structured, MCP-friendly way.

How to use

With an MCP client, you connect to the Selenium MCP server to drive browser automation. You can start a browser session, load web pages, find and interact with elements, perform advanced actions like drag-and-drop or keyboard input, and capture screenshots as part of your automated tests or workflows. This server supports popular browsers and can run in headless mode for CI and automation pipelines.
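The flow described above can be sketched as a function that takes a callTool(name, args) callback, standing in for whatever tool-invocation method your MCP client exposes. The tool names come from this page; the argument names (browser, options.headless, url, by, value, text), the example URL, and the locators are assumptions for illustration, so check the server's tool schemas before relying on them.

```javascript
// Sketch of a browser workflow driven through MCP tool calls.
// `callTool` is a hypothetical callback: (toolName, args) => Promise<result>.
// Argument names below are assumptions, not confirmed by this page.
async function runSearch(callTool, query) {
  await callTool("start_browser", { browser: "chrome", options: { headless: true } });
  try {
    await callTool("navigate", { url: "https://example.com/search" });
    await callTool("send_keys", { by: "name", value: "q", text: query });
    await callTool("click_element", { by: "css", value: "button[type=submit]" });
    return await callTool("get_element_text", { by: "css", value: "#results" });
  } finally {
    // Always release the browser, even if a step above fails.
    await callTool("close_session", {});
  }
}

// Usage with a stub that only records calls; swap in a real client call.
const log = [];
const stubCallTool = async (name, args) => {
  log.push(name);
  return { tool: name, args };
};
runSearch(stubCallTool, "selenium").then(() => console.log(log.join(" -> ")));
```

Wrapping the steps in try/finally means close_session runs even when a locator is not found, which avoids leaving stray browser processes behind in CI.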

How to install

You need a working Node.js and npm environment before installing.

Install MCP Selenium globally, or run it on demand with npx, so that your MCP clients can start the server.

# Install globally
npm install -g @angiejones/mcp-selenium

# Or run with npx, without a global install
npx -y @angiejones/mcp-selenium

Configuration for MCP clients

To connect an MCP client to the Selenium MCP server, configure the client to launch the server with npx. The following example defines the server entry in your client configuration.

{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": ["-y", "@angiejones/mcp-selenium"]
    }
  }
}
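If your client already has a configuration file listing other servers, the entry above has to be merged in rather than pasted over the file. A minimal sketch, assuming the config is a JSON object with an mcpServers key as shown; the helper name addSeleniumServer and the "other" server entry are hypothetical.

```javascript
// Returns a copy of an MCP client config with the Selenium server entry added.
// Existing servers are preserved; an existing "selenium" entry is replaced.
function addSeleniumServer(config) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers || {}),
      selenium: { command: "npx", args: ["-y", "@angiejones/mcp-selenium"] },
    },
  };
}

// Example: a config that already defines another (hypothetical) server.
const existing = { mcpServers: { other: { command: "other-server", args: [] } } };
console.log(JSON.stringify(addSeleniumServer(existing), null, 2));
```

The function returns a new object instead of mutating its input, so the original config can still be written back unchanged if validation fails.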

Available tools

start_browser

Launches a browser session with configurable options, including browser type and headless mode.
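On the wire, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for start_browser. The envelope follows the MCP specification; the argument names browser and options.headless are assumptions about this server's schema, and toolCallRequest is a hypothetical helper.

```javascript
// Builds a JSON-RPC 2.0 request for the MCP "tools/call" method.
function toolCallRequest(id, name, args) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumed argument names: `browser` and `options.headless`.
const request = toolCallRequest(1, "start_browser", {
  browser: "firefox",
  options: { headless: true },
});
console.log(JSON.stringify(request));
```

Most MCP client libraries build this envelope for you; seeing it spelled out mainly helps when reading server logs or debugging a transport.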

navigate

Navigates the active browser to a specified URL.

find_element

Finds an element using locator strategies such as id, css, xpath, name, tag, or class, with an optional timeout.
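Because the same locator fields recur across the element tools, it can help to build and validate them in one place. A small sketch, assuming the arguments are named by, value, and timeout and that the timeout is in milliseconds; neither the names nor the unit is confirmed by this page, and the locator helper is hypothetical.

```javascript
// Locator strategies listed in the find_element description.
const STRATEGIES = ["id", "css", "xpath", "name", "tag", "class"];

// Builds locator arguments for an element tool.
// `by`, `value`, and `timeout` are assumed names; the timeout is assumed to be in ms.
function locator(by, value, timeoutMs) {
  if (!STRATEGIES.includes(by)) {
    throw new Error(`Unknown locator strategy: ${by}`);
  }
  const args = { by, value };
  if (timeoutMs !== undefined) args.timeout = timeoutMs;
  return args;
}

console.log(locator("css", "#login-form input[type=email]", 5000));
```

Rejecting an unknown strategy before the call gives a clearer error than waiting for the server to time out on a locator it cannot interpret.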

click_element

Clicks an element identified by a locator strategy, with an optional timeout.

send_keys

Sends keystrokes to an element located by a strategy, enabling typing into inputs.

get_element_text

Retrieves the text content of a located element.

hover

Moves the mouse to hover over a located element.

drag_and_drop

Drags a source element and drops it onto a target element.

double_click

Performs a double-click action on a located element.

right_click

Performs a right-click on a located element.

press_key

Simulates pressing a specific keyboard key.

upload_file

Uploads a file by setting a file input element's value to a file path.
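Since the path is handed to the file input as given, and the server process may not share your working directory, passing an absolute path is the safer choice. A minimal sketch; the argument names by, value, and filePath are assumptions, and uploadArgs is a hypothetical helper.

```javascript
const path = require("node:path");

// Builds upload_file arguments with the file path made absolute.
// `by`, `value`, and `filePath` are assumed argument names.
function uploadArgs(by, value, file) {
  return { by, value, filePath: path.resolve(file) };
}

console.log(uploadArgs("css", "input[type=file]", "fixtures/avatar.png"));
```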

take_screenshot

Captures a screenshot of the current page and either saves it to a file or returns the image data.

close_session

Closes the active browser session and cleans up resources.