Playwright MCP Server
Exposes core Playwright browser automation capabilities via a simple, scalable MCP API.
Configuration
{
  "mcpServers": {
    "alexrwilliam-playwright-mcp-server": {
      "command": "playwright-mcp",
      "args": [
        "stdio"
      ]
    }
  }
}
You can run a lightweight Playwright MCP (Model Context Protocol) server to automate browser tasks through a simple, scalable API. It exposes core browsing capabilities (navigation, DOM interactions, screenshots, and network inspection) so that you can build higher-level automation and QA workflows without embedding Playwright directly in your client.
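If your MCP client reads its server list from a JSON file, the configuration entry can be merged in programmatically. The following is a minimal sketch: the server name, command, and args come from the configuration shown earlier, while the helper names and the config file location are assumptions, since the actual path depends on your client.

```python
import copy
import json
from pathlib import Path

# Server name and entry copied from the configuration in this document.
SERVER_NAME = "alexrwilliam-playwright-mcp-server"
SERVER_ENTRY = {"command": "playwright-mcp", "args": ["stdio"]}


def merge_server_entry(config: dict) -> dict:
    """Return a copy of `config` with the Playwright MCP entry added."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[SERVER_NAME] = copy.deepcopy(SERVER_ENTRY)
    return merged


def update_config_file(path: Path) -> dict:
    """Read a JSON config if it exists, add the entry, and write it back.

    `path` is hypothetical: use whatever file your MCP client reads.
    """
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = merge_server_entry(existing)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged
```

Existing servers in the file are preserved; only the Playwright entry is added or overwritten.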
You connect to the MCP server from your client and issue commands that operate on a persistent browser context. Start with a local stdio server to keep everything self-contained, or run the HTTP transport if you prefer remote control. The server tracks open pages and tabs, including popups, lets you switch between them, and supports plugins that extend interactions across tabs.
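To make the stdio connection concrete, here is a rough sketch of the JSON-RPC 2.0 messages an MCP client sends over stdin: an initialize request, the initialized notification, a tool listing, and one tool call. The method names follow the MCP specification. The tool name and arguments passed in are placeholders, because this page does not list the server's tool names, and the client name and protocol version are example values.

```python
import json
from typing import Any, Dict, List, Optional


def request(msg_id: int, method: str,
            params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request (a response with the same id is expected)."""
    msg: Dict[str, Any] = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method: str) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 notification (no id, no response expected)."""
    return {"jsonrpc": "2.0", "method": method}


def frame(msg: Dict[str, Any]) -> bytes:
    """Encode one message for the stdio transport: a single line of JSON."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    return (json.dumps(msg) + "\n").encode("utf-8")


def session_frames(tool_name: str, arguments: Dict[str, Any]) -> List[bytes]:
    """Return the client messages for a minimal session, in send order.

    A real client writes each frame to the server's stdin and reads the
    matching response from stdout before sending the next request.
    """
    messages = [
        request(1, "initialize", {
            "protocolVersion": "2024-11-05",  # example protocol revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        }),
        notification("notifications/initialized"),
        request(2, "tools/list"),
        # tool_name and arguments are hypothetical; check tools/list output.
        request(3, "tools/call", {"name": tool_name, "arguments": arguments}),
    ]
    return [frame(m) for m in messages]
```

In practice you would usually let an MCP client library handle this framing. The sketch only shows what travels over the pipe when the server is started as `playwright-mcp stdio`.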
Key capabilities you’ll use include navigating to URLs, interacting with page elements, querying for elements, collecting snapshots (HTML, screenshots, accessibility trees), and monitoring network activity. Outputs are returned in a raw, unmodified form from Playwright so you can post-process or feed them directly into tools and workflows.
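Because outputs arrive unmodified, post-processing is left to the client. The sketch below assumes the server returns its payload as MCP text content blocks (the `content` list and `isError` flag defined by the MCP specification) and that the text may or may not be JSON. Both assumptions should be checked against real responses.

```python
import json
from typing import Any, Dict


def extract_text(result: Dict[str, Any]) -> str:
    """Concatenate the text blocks of an MCP tools/call result."""
    text = "".join(
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(f"tool call failed: {text}")
    return text


def parse_payload(result: Dict[str, Any]) -> Any:
    """Decode the payload as JSON when possible, else return the raw text."""
    text = extract_text(result)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text
```

Returning the raw string on a decode failure keeps HTML and other non-JSON outputs usable without special casing.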
Prerequisites: you need Python and a Python package manager such as pip, plus the Playwright browsers installed.
# Install directly from source and install Playwright browsers
pip install git+https://github.com/alexrwilliam/playwright-mcp-server.git
playwright install
Run the MCP server in stdio mode for local control or in http mode for remote control. In headed mode you can visually observe the browser.
If you prefer to run the server with a real browser channel, you can specify the channel and optionally a user data directory to preserve cookies and profile data.
The server exposes a rich set of configuration options for response budgets, artifacts, and timeouts. You can tune how much inline data is returned, where large artifacts are stored, and how long artifacts are retained. You can also control how many elements are returned by element queries and how many nodes are included in accessibility snapshots.
Security and reliability considerations include limiting the size of inline responses, trimming overflow data, and using artifact previews to keep the main response compact while preserving access to full payloads when needed.
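The inline-budget idea can be illustrated with a small sketch: payloads under a limit are returned inline, and larger ones are written to an artifact file, with only a preview returned. This is not the server's implementation. The function name, parameter names, and result keys are all hypothetical and chosen for illustration.

```python
import hashlib
from pathlib import Path
from typing import Any, Dict


def budget_response(payload: str, max_inline_chars: int,
                    artifact_dir: Path, preview_chars: int = 200) -> Dict[str, Any]:
    """Return the payload inline if it fits, otherwise store it as an artifact."""
    if len(payload) <= max_inline_chars:
        return {"inline": payload, "truncated": False}

    # Overflow: persist the full payload and return only a short preview.
    artifact_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    path = artifact_dir / f"{digest}.txt"
    path.write_text(payload, encoding="utf-8")
    return {
        "preview": payload[:preview_chars],
        "truncated": True,
        "artifact_path": str(path),
        "total_chars": len(payload),
    }
```

Naming the artifact after a content hash means identical payloads reuse one file. A retention policy would then delete files older than a configured age.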
If you encounter connectivity issues, verify the transport type you started with (stdio or http) and ensure the server process is running. For large responses, use artifact accessors to retrieve full payloads without overwhelming the client.
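For the stdio transport, a quick way to check whether the server process starts and stays up is to launch it and see if it exits within a short grace period. The helper below is a sketch with a hypothetical name. It uses the `playwright-mcp stdio` command from the configuration and does not cover the http transport.

```python
import shutil
import subprocess
from typing import List, Sequence


def check_stdio_server(command: str = "playwright-mcp",
                       args: Sequence[str] = ("stdio",),
                       grace_seconds: float = 1.0) -> List[str]:
    """Return a list of problems found; an empty list means the server looks alive."""
    if shutil.which(command) is None:
        return [f"'{command}' was not found on PATH"]

    problems: List[str] = []
    proc = subprocess.Popen([command, *args], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        # A healthy stdio server keeps running while it waits for input.
        code = proc.wait(timeout=grace_seconds)
        stderr = proc.stderr.read().decode("utf-8", errors="replace").strip()
        problems.append(f"server exited early with code {code}: {stderr}")
    except subprocess.TimeoutExpired:
        pass  # still running after the grace period
    finally:
        if proc.poll() is None:
            proc.kill()
        proc.communicate()  # reap the process and close the pipes
    return problems
```

An empty result only shows that the process launches and keeps running. It does not confirm that the browser itself started correctly.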
Available tools
Navigate to a URL. Returns the current page reference and state.
Reload the current page to refresh its state.
Go back in the browser history.
Go forward in the browser history.
Return the current page URL with parsed components.
Wait for the URL to match a pattern within a timeout.
Wait for the page to reach a specific load state such as domcontentloaded, load, or networkidle.
Set the browser viewport dimensions for the active page.
List all open pages or tabs with identifiers, URLs, and titles.
Switch the active context to a specific page by its identifier.
Close a specific page or tab.
Wait for a new popup or tab to open and capture its reference.
Switch to the most recently opened page.
Click an element matched by a selector.
Type text into an element matched by a selector.
Fill an input field with a value.
Clear the text from an input field.
Select an option in a dropdown or select element.
Hover the mouse over an element.
Scroll within a page or element to specified coordinates.
Press a keyboard key in the page context.
Check a checkbox element.
Uncheck a checkbox element.
Upload a file to a file input element.
Query for a single element using CSS, XPath, or Playwright locators, optionally capping the length of the returned text.
Query for all matching elements with optional caps.
Provide a quick metadata preview for an element.
Check if an element is visible on the page.
Check if an element is enabled for interaction.
Wait for an element to appear within a timeout.
Get an element's position and size.
Get all attributes of an element.
Get a CSS computed style property for an element.
Execute JavaScript in the page context and return results.
Wait for network activity to settle.
Retrieve JavaScript errors from the page.
Retrieve console logs from the page.
Retrieve captured network requests with optional filtering.
Retrieve captured network responses with optional filtering.
Clear all captured network logs.
Intercept and handle network requests with a custom action.
Remove all route interceptors.
Wait for a specific network response to occur.
Extract the body of a network response.
Retrieve the page HTML content.
Generate an accessibility tree snapshot with optional filtering and node limits.
Capture a screenshot of the page or a specific element and store as an artifact.
Generate a PDF of the current page and return artifact metadata.
Add custom HTTP headers to all requests.
Change the browser User-Agent string.
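As an illustration of what "parsed components" of a page URL can look like, the sketch below splits a URL with Python's standard library. The exact fields the server returns are not shown on this page, so the dictionary keys here are assumptions.

```python
from typing import Any, Dict
from urllib.parse import parse_qs, urlsplit


def url_components(url: str) -> Dict[str, Any]:
    """Split a URL into commonly used parts (key names are illustrative)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,              # None when the URL has no explicit port
        "path": parts.path,
        "query": parse_qs(parts.query),  # each value is a list of strings
        "fragment": parts.fragment,
    }
```

A client can run the same split locally on the URL returned by a navigation tool if it needs fields the server does not provide.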