MCP Desktop Tools MCP Server
Provides desktop automation capabilities for Claude, including browser control, screenshots, and input actions via an MCP server.
Configuration
{
  "mcpServers": {
    "k1ta141k-mcp-desktop-tools": {
      "command": "node",
      "args": [
        "C:/Users/<you>/mcp-desktop-tools/dist/index.js"
      ]
    }
  }
}

MCP Desktop Tools provides a local MCP server that enables Claude to automate desktop tasks, control browsers, capture screenshots, simulate mouse and keyboard input, manage windows, and access the clipboard. This makes it possible to automate both browser-based workflows and native desktop actions from Claude.
You run a local MCP server and connect Claude Code to it as an MCP endpoint. The server exposes a set of tools that let you automate browser actions, take screenshots, and control desktop windows and input. Use these tools by sending requests through the MCP client in Claude Code, composing actions in a sequence to automate complex tasks.
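To illustrate what composing actions in a sequence looks like on the wire: MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such requests for a three-step browser workflow. Claude Code constructs these messages for you, so this is only a picture of the request shape. The tool names (`browser_launch`, `browser_click`, `browser_screenshot`) and argument keys are hypothetical placeholders, not names confirmed by this server; the real ones are whatever the server reports through `tools/list`.

```typescript
// Shape of a JSON-RPC 2.0 request that invokes an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// One step in an automation sequence: a tool name plus its arguments.
interface Step {
  tool: string;
  args: Record<string, unknown>;
}

// Turn an ordered list of steps into tools/call requests with sequential ids.
function buildToolCalls(steps: Step[], firstId = 1): ToolCallRequest[] {
  return steps.map((step, index): ToolCallRequest => ({
    jsonrpc: "2.0",
    id: firstId + index,
    method: "tools/call",
    params: { name: step.tool, arguments: step.args },
  }));
}

// Hypothetical tool names and argument keys, for illustration only.
const requests = buildToolCalls([
  { tool: "browser_launch", args: { url: "https://example.com" } },
  { tool: "browser_click", args: { selector: "#login" } },
  { tool: "browser_screenshot", args: { fullPage: true } },
]);
```

Each request would be sent only after the previous one returns, since later steps (a click, a screenshot) depend on the state left by earlier ones.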
Prerequisites: Node.js 18 or later, plus a Windows environment if you plan to use the native window, mouse, and keyboard operations.
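The prerequisites can be checked before installing. The following is a minimal sketch, not part of the project itself; the function names `meetsNodeRequirement` and `supportsNativeTools` are made up for illustration.

```typescript
// Minimum major version stated in the prerequisites.
const MIN_NODE_MAJOR = 18;

// Parse a version string such as "v18.17.0" or "20.11.1" and compare its major part.
function meetsNodeRequirement(version: string, minMajor = MIN_NODE_MAJOR): boolean {
  const match = /^v?(\d+)\./.exec(version.trim());
  if (match === null) {
    return false; // unrecognized format: treat as unsupported
  }
  return Number(match[1]) >= minMajor;
}

// Native window, mouse, and keyboard tools need Windows,
// which Node reports as the platform string "win32".
function supportsNativeTools(platform: string): boolean {
  return platform === "win32";
}
```

In a Node script you would pass `process.version` and `process.platform` to these functions.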
Install dependencies and build the server, then install the required browser binaries.
npm install
npm run build
npx playwright install chromium

Configure Claude Code to connect to the MCP server by adding a server entry that points to the locally built entry file. Use the following example snippet, with the path adjusted to your installation.
{
  "mcpServers": {
    "desktop-tools": {
      "command": "node",
      "args": ["C:/Users/<you>/mcp-desktop-tools/dist/index.js"]
    }
  }
}

Launch Chromium and navigate to a URL.
Navigate to a URL with configurable wait conditions.
Click elements by CSS selector.
Type into input fields, optionally clear or press Enter.
Read page content (text, HTML, title, URL, or specific elements).
Capture viewport or full-page screenshots.
Close the browser.
Capture the entire screen across multiple monitors.
Capture a rectangular region by coordinates.
Capture a specific window by title (partial match).
Click at screen coordinates.
Move the cursor, either instantly or with smooth animation.
Type text using simulated keystrokes.
Press keyboard shortcuts like ctrl+c or alt+tab.
List all visible windows with positions and sizes.
Focus a window by its title.
Move and resize a window.
Launch applications by path, name, or URI.
Read text from the clipboard.
Write text to the clipboard.
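The keyboard shortcut tool above takes strings like ctrl+c or alt+tab. The sketch below shows one way to split such a string into modifiers and a final key. It is an assumption about how this notation could be interpreted, not the server's actual parser. The recognized modifier list and the `parseShortcut` name are hypothetical.

```typescript
// Modifier names recognized by this sketch (an assumption, not the server's list).
const MODIFIERS = ["ctrl", "alt", "shift", "win"] as const;
type Modifier = (typeof MODIFIERS)[number];

interface Shortcut {
  modifiers: Modifier[];
  key: string;
}

// Type guard: is this lowercased token one of the known modifiers?
function isModifier(part: string): part is Modifier {
  return (MODIFIERS as readonly string[]).indexOf(part) !== -1;
}

// Split "ctrl+shift+s" into modifiers ["ctrl", "shift"] and key "s".
// Input is case-insensitive; every token before the last must be a modifier.
function parseShortcut(input: string): Shortcut {
  const parts = input.toLowerCase().split("+").map((p) => p.trim());
  if (parts.some((p) => p.length === 0)) {
    throw new Error(`Invalid shortcut: "${input}"`);
  }
  const modifiers: Modifier[] = [];
  for (const part of parts.slice(0, -1)) {
    if (!isModifier(part)) {
      throw new Error(`Unknown modifier "${part}" in "${input}"`);
    }
    modifiers.push(part);
  }
  return { modifiers, key: parts[parts.length - 1] };
}
```

Rejecting unknown modifiers up front is a deliberate choice here: a typo such as `crtl+c` fails with a clear error instead of being sent to the desktop as unintended keystrokes.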