PlayMCP Browser Automation Server

Provides 38 browser automation tools for web scraping, testing, and automation using Playwright via MCP commands.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "jomon003-playmcp": {
      "command": "node",
      "args": [
        "./dist/server.js"
      ]
    }
  }
}

PlayMCP Browser Automation Server is an MCP (Model Context Protocol) server that uses Playwright to automate browser actions. It lets you control a browser programmatically for scraping, testing, and automation tasks, with tools that cover navigation, interaction, data extraction, file handling, and advanced browser management.

How to use

Run the PlayMCP Browser Automation Server locally and connect to it with an MCP client. Once connected, issue commands to open a browser, navigate to pages, interact with elements, extract data, and capture screenshots or analyze the DOM. Combine the tools to build automated flows for scraping, testing, form-filling, and page inspection. The tools let you navigate pages, click elements, type text, move the mouse, capture screenshots, read page content and metadata, inspect DOM structure, monitor console and network activity, and execute custom JavaScript.

To begin, install dependencies, start the server, and then drive it from your MCP client by invoking the available tools in sequence to perform your automation tasks.
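As a rough illustration of what "invoking the available tools in sequence" looks like on the wire, the sketch below builds MCP `tools/call` requests (JSON-RPC 2.0) for a short flow. The tool names come from the list of available tools; the argument keys `headless` and `url` are assumptions, so check the schemas the server reports before relying on them. A real MCP client also performs the `initialize` handshake first, which is omitted here.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build one tools/call request. The argument keys passed in are
// assumptions for illustration, not the server's documented schema.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// A short flow: open a browser, load a page, read its title, clean up.
const requests: ToolCallRequest[] = [
  toolCall("openBrowser", { headless: true }), // `headless` is hypothetical
  toolCall("navigate", { url: "https://example.com" }), // `url` is hypothetical
  toolCall("getPageTitle"),
  toolCall("closeBrowser"),
];

// Over the stdio transport, each message is one line of JSON.
const wire: string[] = requests.map((r) => JSON.stringify(r));
```

In practice an MCP client library handles this framing for you; the sketch only shows the order and shape of the calls.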

How to install

# Prerequisites
# - Node.js 16+ 
# - Git

# Clone the project
git clone https://github.com/jomon003/PlayMCP.git
cd PlayMCP

# Install dependencies
npm install

# Build the project
npm run build

# Install browser binaries for Playwright
npx playwright install

# Start the server (an MCP client can also launch it from its own configuration)
npm run start

The commands above build and start the server; an MCP client can then connect and query the available tools. The server is designed to run as a local stdio process, typically launched with node and the built server file.

Configuration and usage notes

You can configure the server as an MCP stdio backend using a JSON configuration that launches the built distribution script with Node. You may provide a working directory and an explicit startup command. Because the script path ./dist/server.js is relative, the working directory should be the cloned repository so that the path resolves.

{
  "servers": {
    "playmcp_browser": {
      "type": "stdio",
      "command": "node",
      "args": ["./dist/server.js"],
      "cwd": "/path/to/PlayMCP",
      "description": "Browser automation server using Playwright"
    }
  }
}
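If your MCP client does not honor a working directory, one alternative is to put an absolute script path in the configuration. The sketch below generates such a configuration from the repository location. The helper name `stdioConfig` is hypothetical, it assumes POSIX-style paths, and the key names simply mirror the JSON above.

```typescript
interface StdioServer {
  type: "stdio";
  command: string;
  args: string[];
  description: string;
}

// Build a configuration whose script path is absolute, so it does not
// depend on the directory the client starts the process in.
function stdioConfig(repoDir: string): { servers: Record<string, StdioServer> } {
  const root = repoDir.replace(/\/+$/, ""); // drop trailing slashes
  return {
    servers: {
      playmcp_browser: {
        type: "stdio",
        command: "node",
        args: [`${root}/dist/server.js`],
        description: "Browser automation server using Playwright",
      },
    },
  };
}

// Serialize for pasting into the client's configuration file.
const configJson: string = JSON.stringify(stdioConfig("/path/to/PlayMCP/"), null, 2);
```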

Troubleshooting and tips

If you encounter issues, ensure the server is started before issuing browser commands, verify Node.js is version 16 or higher, and confirm that Playwright browsers are installed. For debugging, run with headless mode off to observe interactions, and allocate sufficient memory if running multiple browser instances.
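The Node.js version check can be automated. This is a minimal sketch with hypothetical helper names (`nodeMajor`, `meetsMinimum`); it only parses the major version from a string such as the one `node --version` prints.

```typescript
// Extract the major version from strings like "v16.14.0" or "18.19.1".
function nodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version: ${version}`);
  }
  return Number(match[1]);
}

// True when the version satisfies the documented minimum (Node.js 16+).
function meetsMinimum(version: string, minimumMajor: number = 16): boolean {
  return nodeMajor(version) >= minimumMajor;
}
```

Inside a Node.js script you could pass `process.version` to `meetsMinimum` and exit early with a clear message when it returns false.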

Example workflows

Typical automation sequences include opening a browser, navigating to a page, extracting data such as links or forms, performing actions like clicks and form submissions, and finally capturing screenshots or the page source.
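The sequence described above can be written down as data and sanity-checked before it is sent to the server. In this sketch the tool names are taken from the list of available tools, while the argument keys (`headless`, `url`, `selector`, `fullPage`) and the ordering rules in `checkFlow` are assumptions for illustration; the server itself may not enforce them.

```typescript
interface Step {
  tool: string;
  args?: Record<string, unknown>; // argument keys are hypothetical
}

// Open, navigate, extract, act, capture, close.
const scrapeFlow: Step[] = [
  { tool: "openBrowser", args: { headless: true } },
  { tool: "navigate", args: { url: "https://example.com" } },
  { tool: "getLinks" },
  { tool: "getForms" },
  { tool: "click", args: { selector: "a.more" } },
  { tool: "screenshot", args: { fullPage: true } },
  { tool: "getPageSource" },
  { tool: "closeBrowser" },
];

// Return a list of ordering problems; an empty list means the flow looks sane.
function checkFlow(steps: Step[]): string[] {
  if (steps.length === 0) return ["flow is empty"];
  const problems: string[] = [];
  if (steps[0].tool !== "openBrowser") {
    problems.push("flow should start with openBrowser");
  }
  if (steps[steps.length - 1].tool !== "closeBrowser") {
    problems.push("flow should end with closeBrowser");
  }
  const firstNavigate = steps.findIndex((s) => s.tool === "navigate");
  steps.forEach((s, i) => {
    // Extraction tools (get*) need a loaded page to read from.
    if (s.tool.startsWith("get") && (firstNavigate === -1 || i < firstNavigate)) {
      problems.push(`${s.tool} at step ${i + 1} runs before any navigate`);
    }
  });
  return problems;
}
```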

Core capabilities and data extraction

The server provides core browser control (open/close browser, navigate, click, type, move mouse, scroll) and page content access (HTML source, visible text, title, URL). It also supports deep DOM analysis, form extraction, and access to page resources such as scripts, stylesheets, and meta tags.
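One way to keep these groups straight in client code is a small lookup table. The grouping below follows the paragraph above and uses tool names from the list of available tools; the category labels and the `toolsIn` helper are illustrative, not part of the server.

```typescript
type Category = "control" | "content" | "analysis";

// Tool names as listed under "Available tools"; categories are our own labels.
const toolCategories: Record<string, Category> = {
  openBrowser: "control",
  closeBrowser: "control",
  navigate: "control",
  click: "control",
  type: "control",
  moveMouse: "control",
  scroll: "control",
  getPageSource: "content",
  getPageText: "content",
  getPageTitle: "content",
  getPageUrl: "content",
  getElementHierarchy: "analysis",
  getForms: "analysis",
  getScripts: "analysis",
  getStylesheets: "analysis",
  getMetaTags: "analysis",
};

// List the tool names that belong to one category.
function toolsIn(category: Category): string[] {
  return Object.keys(toolCategories).filter((name) => toolCategories[name] === category);
}
```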

Supported environments and build notes

The server expects a Node.js environment and uses Playwright to drive browsers. Build steps include installing dependencies, compiling TypeScript sources, and ensuring browser binaries are installed via Playwright. Use npm scripts to build and start the server.

Security and best practices

Run the server in a controlled environment, limit access to trusted clients, and monitor resource usage when running multiple browser instances. While debugging, prefer headless: false so you can watch the actions, then switch to headless: true for production runs.

Notes on development and tooling

This server exposes a range of tools for navigation, interaction, data extraction, forms handling, and JavaScript execution. You can script complex automation flows by combining multiple tools and handling wait conditions, timeouts, and errors gracefully.
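Handling timeouts and transient errors usually happens on the client side. The helpers below are a generic sketch, not part of PlayMCP: `withTimeout` rejects if a call takes too long, and `retry` repeats a failing call a fixed number of times. Both names are hypothetical, and any promise-returning tool call can be wrapped with them.

```typescript
// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Run `attempt` up to `tries` times, returning the first success
// or rethrowing the last error.
async function retry<T>(attempt: () => Promise<T>, tries: number): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < tries; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

A call such as `retry(() => withTimeout(callTool("navigate"), 10000, "navigate"), 3)` combines both, where `callTool` stands in for whatever function your MCP client uses to invoke a tool.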

Available tools

openBrowser

Launch a new browser instance with optional headless mode to begin automation.

closeBrowser

Terminate the current browser session and free resources.

navigate

Navigate to a specified URL in the active page.

click

Click a DOM element identified by a selector with smart resolution.

type

Type text into an input field or editable element.

moveMouse

Move the mouse to specified coordinates within the page.

scroll

Scroll the page by given x and y offsets with optional smooth behavior.

screenshot

Capture a screenshot of the full page, viewport, or a chosen element.

getPageSource

Retrieve the complete HTML source of the current page.

getPageText

Extract the visible text content from the current page.

getPageTitle

Return the title of the current page.

getPageUrl

Get the current URL of the active page.

getScripts

Extract all JavaScript code blocks from the page.

getStylesheets

Extract all linked CSS stylesheets from the page.

getMetaTags

Collect all meta tags and their attributes from the head section.

getLinks

Retrieve all page links with href, text, and title metadata.

getImages

Collect information about images including src and attributes.

getForms

Analyze forms and their fields on the page.

getElementContent

Get HTML and text content for a specific element.

getElementHierarchy

Analyze the DOM hierarchy starting from a selector with configurable depth.

getConsoleMessages

Monitor browser console messages for debugging and validation.

getNetworkRequests

Track HTTP requests and responses during page interaction.

executeJavaScript

Run arbitrary JavaScript on the page and return results.

evaluateWithReturn

Execute JS and return a structured value from the evaluation.

uploadFiles

Handle file input uploads by simulating user file selection.

handleDialog

Manage JavaScript dialogs such as alerts, confirms, and prompts.