Provides 38 browser automation tools for web scraping, testing, and automation using Playwright via MCP commands.
Configuration
{
  "mcpServers": {
    "jomon003-playmcp": {
      "command": "node",
      "args": [
        "./dist/server.js"
      ]
    }
  }
}
PlayMCP Browser Automation Server is a comprehensive MCP (Model Context Protocol) server that uses Playwright to automate browser actions. It enables you to control a browser programmatically for scraping, testing, and automation tasks, backed by a wide set of tools that cover navigation, interaction, data extraction, file handling, and advanced browser management.
You run the PlayMCP Browser Automation Server locally and connect to it with an MCP client. Start the server, then issue commands to open a browser, navigate to pages, interact with elements, extract data, and capture screenshots or analyze the DOM. Combine the tools to build automated flows for scraping, testing, form-filling, and page inspection. With them you can navigate pages, click elements, type text, move the mouse, capture screenshots, read page content and metadata, inspect DOM structure, monitor console and network activity, and execute custom JavaScript.
To begin, install dependencies, start the server, and then drive it from your MCP client by invoking the available tools in sequence to perform your automation tasks.
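As a rough illustration of what "driving the server from an MCP client" means on the wire, the sketch below builds the JSON-RPC 2.0 messages a client sends over stdio before it can call any tool. The method names `initialize`, `notifications/initialized`, and `tools/list` come from the MCP specification. The client name, client version, and the protocol revision string are placeholder assumptions, not values PlayMCP requires. Most clients and SDKs produce these messages for you.

```typescript
// Minimal JSON-RPC 2.0 message shapes used by MCP over stdio.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
}

type JsonRpcMessage = JsonRpcRequest | JsonRpcNotification;

// Build the startup messages in the order a client sends them.
// A real client waits for the "initialize" response before sending the rest.
function startupMessages(): JsonRpcMessage[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed revision; use what your client supports
        capabilities: {},
        clientInfo: { name: "example-client", version: "0.1.0" }, // placeholders
      },
    },
    { jsonrpc: "2.0", method: "notifications/initialized" },
    { jsonrpc: "2.0", id: 2, method: "tools/list" }, // ask the server for its tools
  ];
}

// The stdio transport carries one JSON message per line.
function toWire(messages: JsonRpcMessage[]): string {
  return messages.map((m) => JSON.stringify(m)).join("\n") + "\n";
}

const wire = toWire(startupMessages());
```

The response to `tools/list` is the authoritative source for tool names and argument schemas, so it is worth inspecting before scripting any flow.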
# Prerequisites
# - Node.js 16+
# - Git
# Clone the project
git clone https://github.com/jomon003/PlayMCP.git
cd PlayMCP
# Install dependencies
npm install
# Build the project
npm run build
# Install browser binaries for Playwright
npx playwright install
# Start the server (example: provide your MCP client with the appropriate config to connect)
npm run start
The commands above build and start the server; an MCP client can then connect and query the available tools. The server is designed to be used with an MCP configuration that points to a local stdio process, typically launched with node and the built server file.
You can configure the server as an MCP stdio backend using a JSON configuration. The configuration launches the server with Node and the built distribution script. You may provide a working directory and an explicit startup command.
{
  "servers": {
    "playmcp_browser": {
      "type": "stdio",
      "command": "node",
      "args": ["./dist/server.js"],
      "cwd": "/path/to/PlayMCP",
      "description": "Browser automation server using Playwright"
    }
  }
}
If you encounter issues, ensure the server is started before issuing browser commands, verify Node.js is version 16 or higher, and confirm that Playwright browsers are installed. For debugging, run with headless mode off to observe interactions, and allocate sufficient memory if running multiple browser instances.
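The Node.js version requirement can be checked programmatically before launching the server. The helper below is a small sketch, not part of PlayMCP; the function name `meetsMinimumNode` is hypothetical, and the default of 16 is taken from the prerequisites in this document.

```typescript
// Return true when a Node.js version string such as "18.17.1" or "v18.17.1"
// has a major version at or above minMajor.
function meetsMinimumNode(version: string, minMajor: number = 16): boolean {
  const match = /^v?(\d+)/.exec(version);
  if (!match) {
    return false; // not a recognizable version string
  }
  return parseInt(match[1], 10) >= minMajor;
}
```

In a launcher script you would typically pass `process.versions.node` as the `version` argument and stop with a clear message when the check fails.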
Typical automation sequences include opening a browser, navigating to a page, extracting data such as links or forms, performing actions like clicks and form submissions, and finally capturing screenshots or the page source.
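One way to picture such a sequence is as an ordered list of `tools/call` requests, the MCP method for invoking a tool. The sketch below only builds the request objects. The tool names (`openBrowser`, `navigate`, `getLinks`, `screenshot`, `closeBrowser`) and their argument keys are assumptions for illustration; take the real names and schemas from the server's `tools/list` response.

```typescript
// One step of a flow: which tool to call and with what arguments.
interface ToolStep {
  name: string; // tool name as reported by tools/list
  arguments: Record<string, unknown>; // tool-specific arguments
}

// JSON-RPC 2.0 request that invokes a single MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolStep;
}

// Hypothetical flow: open, navigate, extract, capture, close.
const flow: ToolStep[] = [
  { name: "openBrowser", arguments: { headless: true } },
  { name: "navigate", arguments: { url: "https://example.com" } },
  { name: "getLinks", arguments: {} },
  { name: "screenshot", arguments: { path: "example.png" } },
  { name: "closeBrowser", arguments: {} },
];

// Turn the steps into requests with consecutive ids, preserving order.
function toRequests(steps: ToolStep[], firstId: number = 1): ToolCallRequest[] {
  return steps.map((step, index) => ({
    jsonrpc: "2.0" as const,
    id: firstId + index,
    method: "tools/call" as const,
    params: step,
  }));
}

const requests = toRequests(flow, 3);
```

Because each step depends on the browser state left by the previous one, a client should send the next request only after the previous response has arrived.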
The server provides core browser control (open/close browser, navigate, click, type, move mouse, scroll) and page content access (HTML source, visible text, title, URL). It also supports deep DOM analysis, form extraction, and access to page resources such as scripts, stylesheets, and meta tags.
The server expects a Node.js environment and uses Playwright to drive browsers. Build steps include installing dependencies, compiling TypeScript sources, and ensuring browser binaries are installed via Playwright. Use npm scripts to build and start the server.
Run the server in a controlled environment, limit accessible endpoints to trusted clients, and monitor resource usage when running multiple browser instances. When testing, prefer headless:false during debugging to visualize actions, then switch to headless:true for production runs.
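Switching between a visible browser for debugging and a headless one for production can be reduced to a single flag. The sketch below assumes the browser-launch tool accepts a boolean `headless` argument, as the text suggests; the environment variable name `PLAYMCP_DEBUG` is hypothetical and not something the server reads itself.

```typescript
// Decide the launch arguments from an environment-like map.
// Debugging (PLAYMCP_DEBUG=1) shows the browser window; anything else runs headless.
function launchArguments(
  env: Record<string, string | undefined>
): { headless: boolean } {
  const debugging = env["PLAYMCP_DEBUG"] === "1";
  return { headless: !debugging };
}
```

A client script would typically call `launchArguments(process.env)` and pass the result as the arguments of the browser-launch tool, so that production runs stay headless unless the flag is set explicitly.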
This server exposes a range of tools for navigation, interaction, data extraction, forms handling, and JavaScript execution. You can script complex automation flows by combining multiple tools and handling wait conditions, timeouts, and errors gracefully.
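Handling errors gracefully often comes down to retrying a failed tool call a few times with increasing pauses. The following is a minimal sketch of retry with capped exponential backoff; it is a general client-side technique, not something PlayMCP provides. The `operation` parameter stands in for whatever function your MCP client uses to call a tool, and the helper names are hypothetical.

```typescript
// Delays in milliseconds before each retry: base, 2*base, 4*base, ... capped at maxMs.
function backoffDelays(retries: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(baseMs * Math.pow(2, i), maxMs));
  }
  return delays;
}

// Run an operation once, then retry after each delay in turn while it keeps failing.
// Rethrows the last error when every attempt has failed.
async function withRetry<T>(
  operation: () => Promise<T>,
  delays: number[]
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < delays.length) {
        const waitMs = delays[attempt];
        await new Promise<void>((resolve) => setTimeout(() => resolve(), waitMs));
      }
    }
  }
  throw lastError;
}
```

For example, `withRetry(() => callTool("navigate", { url }), backoffDelays(3, 500, 5000))` would make up to four attempts in total, where `callTool` is your client's own call function. Retries suit transient failures such as slow page loads; a selector that never matches should fail fast instead.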
Launch a new browser instance with optional headless mode to begin automation.
Terminate the current browser session and free resources.
Navigate to a specified URL in the active page.
Click a DOM element identified by a selector with smart resolution.
Type text into an input field or editable element.
Move the mouse to specified coordinates within the page.
Scroll the page by given x and y offsets with optional smooth behavior.
Capture a screenshot of the full page, viewport, or a chosen element.
Retrieve the complete HTML source of the current page.
Extract the visible text content from the current page.
Return the title of the current page.
Get the current URL of the active page.
Extract all JavaScript code blocks from the page.
Extract all linked CSS stylesheets from the page.
Collect all meta tags and their attributes from the head section.
Retrieve all page links with href, text, and title metadata.
Collect information about images including src and attributes.
Analyze forms and their fields on the page.
Get HTML and text content for a specific element.
Analyze the DOM hierarchy starting from a selector with configurable depth.
Monitor browser console messages for debugging and validation.
Track HTTP requests and responses during page interaction.
Run arbitrary JavaScript on the page and return results.
Execute JS and return a structured value from the evaluation.
Handle file input uploads by simulating user file selection.
Manage JavaScript dialogs such as alerts, confirms, and prompts.