Selenium MCP Server
Python-based MCP server for automating Selenium WebDriver actions via Claude with browser control, navigation, interactions, and screenshots.
Configuration
{
  "mcpServers": {
    "jyothishkumarav-selenium-mcp-server-python": {
      "command": "python",
      "args": [
        "server.py"
      ]
    }
  }
}
You can automate web browsers programmatically using a Python MCP server that exposes Selenium WebDriver operations through an MCP client. This server lets you start and manage browser sessions, perform common interactions, take screenshots, and inspect page content, all from a client such as Claude’s desktop application.
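If you maintain the client configuration by hand, the entry shown above can also be merged into an existing configuration with a few lines of Python. This is a minimal sketch and not part of the project: `add_server` is a hypothetical helper, and where your MCP client stores its configuration file is not stated here, so reading and writing that file is left out.

```python
import json

# Entry copied from the configuration shown above.
SELENIUM_ENTRY = {
    "jyothishkumarav-selenium-mcp-server-python": {
        "command": "python",
        "args": ["server.py"],
    }
}


def add_server(config: dict, entry: dict) -> dict:
    """Return a copy of config with entry merged into its "mcpServers" map.

    Hypothetical helper: servers already present are kept, and an entry
    with the same name is overwritten.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(entry)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # "other-server" is a made-up existing entry used only for the demo.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["index.js"]}}}
    print(json.dumps(add_server(existing, SELENIUM_ENTRY), indent=2))
```

Working on a copy rather than mutating the input keeps the original configuration intact if you decide not to write the result back.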
Use an MCP client to connect to the Selenium MCP server and perform browser automation tasks. You can start a browser session, navigate to pages, interact with elements, capture screenshots, manage windows and frames, and access page data such as content and local storage. Workflows typically involve starting a session, performing a sequence of actions, and then closing the session.
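The start, act, close workflow described above maps onto a familiar try/finally pattern. The sketch below is an assumption about how such a workflow looks in plain Python, not the server's actual code: `run_session` and `FakeDriver` are hypothetical names, and the stand-in driver only mimics the `get` and `quit` methods of a real Selenium WebDriver so the example runs without a browser.

```python
from typing import Callable, Iterable, List


class FakeDriver:
    """Stand-in for a Selenium WebDriver (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.visited: List[str] = []
        self.closed = False

    def get(self, url: str) -> None:
        # A real driver would load the page here.
        self.visited.append(url)

    def quit(self) -> None:
        # A real driver would close the browser and free its resources.
        self.closed = True


def run_session(make_driver: Callable[[], FakeDriver],
                actions: Iterable[Callable[[FakeDriver], None]]) -> FakeDriver:
    """Start a session, run each action in order, and always close the session."""
    driver = make_driver()
    try:
        for action in actions:
            action(driver)
    finally:
        driver.quit()  # runs even if an action raises
    return driver


if __name__ == "__main__":
    driver = run_session(FakeDriver, [lambda d: d.get("https://example.com")])
    print(driver.visited, driver.closed)
```

Closing the session in a `finally` block means a failed action does not leave a browser process running, which is the main reason to treat "close the session" as a fixed last step.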
Follow these steps to set up the Selenium MCP Server locally and prepare it for use with your MCP client.
# Clone the project
git clone https://github.com/Jyothishkumarav/selenium-mcp-server-python.git
cd selenium-mcp-server-python
# Install dependencies
pip install -r requirements.txt
# Install the server in Claude Desktop
mcp install server.py
# Run the server locally (this starts the MCP server)
python server.py
The server provides tools for the following operations:
Open a new browser session to run automation tasks.
Close an active browser session and free resources.
Switch focus between browser windows within a session.
Navigate the active browser to a specified URL.
Refresh the current page.
Wait for the page to complete loading before proceeding.
Locate an element on the page for interaction.
Click on a located element.
Type text into a focused element.
Clear text from an input field.
Double-click a targeted element.
Open the context menu on a targeted element.
Retrieve visible text from an element.
Fetch attributes of a located element.
Check whether an element exists in the DOM.
Check whether an element is visible to the user.
Check whether an element is selected (e.g., checkbox, option).
Capture a screenshot of the current page or element.
Retrieve the full HTML content of the current page.
Scroll the page in the viewport.
Manage local storage for the active page.
Switch context to a specified iframe.
Return to the top-level browsing context.
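An MCP client invokes operations like these by sending JSON-RPC 2.0 `tools/call` requests to the server. The sketch below builds such requests for a start, navigate, close sequence. The tool names (`start_browser`, `navigate`, `close_browser`) and argument keys are placeholders, since the server's actual tool names are not listed here; ask the server for its tool list to get the real ones.

```python
import itertools
import json

_ids = itertools.count(1)  # each request in this sketch gets its own id


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical tool names and arguments, for illustration only.
requests = [
    tool_call("start_browser", {"browser": "chrome"}),
    tool_call("navigate", {"url": "https://example.com"}),
    tool_call("close_browser", {}),
]

for request in requests:
    print(json.dumps(request))
```

In normal use a client such as Claude's desktop application constructs and sends these messages for you; seeing the payload shape is mainly useful when debugging or writing your own client.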