MCP Selenium Server
An MCP implementation for Selenium WebDriver
Configuration
{
  "mcpServers": {
    "angiejones-mcp-selenium": {
      "command": "npx",
      "args": [
        "-y",
        "@angiejones/mcp-selenium@latest"
      ]
    }
  }
}
You can control Selenium WebDriver through an MCP server, enabling AI agents to automate browser actions by sending high-level commands like starting a browser, navigating, interacting with elements, and taking screenshots. This server acts as a bridge between AI agents and browser automation, making it easy to perform browser tasks without writing explicit scripts.
To use this MCP server, add the command configuration to your MCP client and start issuing browser automation actions. You can launch a browser session, navigate to pages, interact with elements, enter text, capture screenshots, manage cookies, and execute custom scripts. The server exposes tools that map directly to common browser automation tasks, so you can describe your intent and let the agent orchestrate the steps.
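As a rough illustration of what "issuing an action" means on the wire, the sketch below builds the JSON-RPC 2.0 `tools/call` requests that an MCP client sends to a server. The request envelope follows the MCP specification; the tool names (`start_browser`, `navigate`) and their argument names are assumptions for illustration and should be checked against the tool list the server actually advertises.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session needs its own id


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumed tool and argument names, for illustration only.
requests = [
    tool_call("start_browser", {"browser": "chrome"}),
    tool_call("navigate", {"url": "https://example.com"}),
]

for request in requests:
    print(json.dumps(request))
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows the shape of the payload.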
Prerequisites: Node.js and npm must be installed. To run the tests or validate locally, you also need a browser and its matching driver, for example Chrome with chromedriver on PATH.
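One way to check these prerequisites before installing is a small script like the following. It is a minimal sketch that only looks for executables on PATH; the split into required and optional tools is an assumption based on the text (Node tooling to run the server, chromedriver only for local tests).

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


# Assumed grouping: Node tooling runs the server, chromedriver is for local tests.
REQUIRED = ["node", "npm", "npx"]
OPTIONAL = ["chromedriver"]

if __name__ == "__main__":
    for name in missing_tools(REQUIRED):
        print(f"missing required tool: {name}")
    for name in missing_tools(OPTIONAL):
        print(f"missing optional tool: {name}")
```

The `which` parameter exists only so the lookup can be replaced in tests; normal use relies on `shutil.which`.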
git clone https://github.com/angiejones/mcp-selenium.git
cd mcp-selenium
npm install
You can connect from an MCP client with the following explicit configuration. This is the recommended way to launch the MCP Selenium server.
{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": ["-y", "@angiejones/mcp-selenium@latest"]
    }
  }
}
Supported browsers are Chrome, Firefox, Edge, and Safari. For Safari on macOS, run the Safari driver setup (typically safaridriver --enable) and turn on remote automation in Safari's settings. Safari has no headless mode in this configuration.
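The browser constraints above can be encoded in a small helper that prepares launch arguments before they are handed to the server. This is a sketch under assumptions: the argument layout (`browser`, `options.headless`, `options.arguments`) is hypothetical and may not match the server's real schema; only the browser list and the Safari headless restriction come from the text.

```python
SUPPORTED_BROWSERS = {"chrome", "firefox", "edge", "safari"}


def launch_arguments(browser, headless=False, extra_args=()):
    """Validate launch settings and return a (hypothetical) arguments dict."""
    browser = browser.lower()
    if browser not in SUPPORTED_BROWSERS:
        raise ValueError(f"unsupported browser: {browser}")
    if browser == "safari" and headless:
        # Safari has no headless mode in this configuration.
        raise ValueError("headless mode is not available for Safari")
    return {
        "browser": browser,
        "options": {"headless": headless, "arguments": list(extra_args)},
    }


print(launch_arguments("Chrome", headless=True))
```

Rejecting an invalid combination on the client side gives a clearer error than waiting for the driver to fail at startup.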
For development, clone the project, install dependencies, and run the tests, which require Chrome with chromedriver on PATH. You can also install the server through the Smithery CLI or install the package globally for easy access to the MCP server tooling.
git clone https://github.com/angiejones/mcp-selenium.git
cd mcp-selenium
npm install
npm test
# Install via the Smithery CLI for Claude integration
npx -y @smithery/cli install @angiejones/mcp-selenium --client claude
# Install globally for easy use
npm install -g @angiejones/mcp-selenium
mcp-selenium
The server provides a comprehensive set of browser automation actions you can invoke through your MCP client, including launching browsers, navigating pages, interacting with elements, sending keys, retrieving text and attributes, handling windows and frames, managing alerts and cookies, taking screenshots, and running custom scripts.
Launches a browser session with a chosen browser type and optional settings such as headless mode or startup arguments.
Directs the active browser to load a specified URL.
Performs a mouse action on an element using a locator strategy like id, css, xpath, name, tag, or class.
Types text into an element, clearing the field first, using a locator to identify the target.
Reads and returns the visible text content of a selected element.
Retrieves a specified attribute value from a page element.
Simulates pressing a keyboard key, such as Enter or Tab.
Uploads a file by interacting with a file input element using a given path.
Captures a screenshot of the current page, returning data or saving to a file.
Ends the current browser session and releases resources.
Executes custom JavaScript in the browser for advanced interactions.
Manages browser windows and tabs, including switching and listing handles.
Switches focus to a frame or returns to the main document.
Handles browser alerts, confirms, or prompts with actions like accept or dismiss.
Adds a browser cookie for the current page domain.
Retrieves cookies for the current page or a specific cookie by name.
Deletes cookies either by name or all cookies.
Fetches browser diagnostics such as console output, errors, or network data.
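To show how these actions combine, the sketch below lays out a short session as an ordered list of tool invocations, validating each locator against the strategies named above (id, css, xpath, name, tag, class). The tool names and argument keys (`by`, `value`, `text`, `key`) are assumptions for illustration, since this page does not list them; confirm them against the tool list your MCP client reports.

```python
LOCATOR_STRATEGIES = {"id", "css", "xpath", "name", "tag", "class"}


def locator(by, value):
    """Return locator arguments, rejecting strategies the text does not name."""
    if by not in LOCATOR_STRATEGIES:
        raise ValueError(f"unknown locator strategy: {by}")
    return {"by": by, "value": value}


def step(tool, **arguments):
    """Describe one tool invocation as plain data."""
    return {"tool": tool, "arguments": arguments}


# Assumed tool names; the order mirrors a typical login check.
session = [
    step("start_browser", browser="firefox"),
    step("navigate", url="https://example.com/login"),
    step("send_keys", **locator("id", "username"), text="demo"),
    step("press_key", key="Enter"),
    step("get_element_text", **locator("css", "h1")),
    step("take_screenshot"),
    step("close_session"),
]

for number, item in enumerate(session, start=1):
    print(number, item["tool"], item["arguments"])
```

Keeping the plan as plain data makes it easy to log, replay, or hand to whichever MCP client actually sends the calls.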