Windows-MCP MCP server
Provides Windows automation and UI interaction via MCP clients, enabling file navigation, app control, UI automation, and QA testing.
Configuration
{
  "mcpServers": {
    "cursortouch-windows-mcp": {
      "command": "uvx",
      "args": [
        "windows-mcp"
      ],
      "env": {
        "ANONYMIZED_TELEMETRY": "false"
      }
    }
  }
}
Windows-MCP is a lightweight, open-source MCP server that bridges large language models with the Windows operating system. It enables AI agents to navigate files, control applications, interact with the UI, test interfaces, and automate a wide range of Windows tasks by exposing native Windows automation capabilities to MCP clients.
To use Windows-MCP, run it on your Windows machine and connect your MCP client to it. The server exposes tools that let your agent click, type, move the mouse, launch applications, read the clipboard, capture UI state, and work with the Windows Registry and running processes. Use the provided client configurations to connect from Claude Desktop, Perplexity Desktop, Gemini CLI, Qwen Code, or Codex CLI. Choose local mode for direct access, or remote mode to route requests through a cloud VM.
Prerequisites you need before installation:
The server can be run via multiple MCP client integrations. Use the exact commands shown in the connection snippets for each client to ensure proper startup.
Local mode runs Windows-MCP directly on your Windows machine and exposes its tools to your connected MCP client. This is the standard setup for personal use.
Remote mode lets Windows-MCP act as a proxy that connects to a cloud-hosted Windows automation service. This is useful when your MCP client is remote and requests are routed to a Windows VM running Windows-MCP.
Direct connections use stdio transport by default. You can also enable network-accessible options such as Server-Sent Events (SSE) or streamable HTTP for production deployments.
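To make the default stdio transport concrete, here is a minimal sketch of a client that starts the server as a subprocess and lists its tools. It assumes the MCP Python SDK (the `mcp` package) is installed and that `uvx windows-mcp` works on the machine; the helper names `server_command` and `list_tool_names` are made up for this example and are not part of Windows-MCP.

```python
def server_command() -> dict:
    """Command, args, and env copied from the configuration in this document."""
    return {
        "command": "uvx",
        "args": ["windows-mcp"],
        "env": {"ANONYMIZED_TELEMETRY": "false"},
    }


async def list_tool_names() -> list[str]:
    """Start Windows-MCP over stdio and return the names of the tools it reports."""
    # Imported lazily so this module still loads where the SDK is not installed.
    # Assumption: the MCP Python SDK ("mcp" package) is available.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(**server_command())
    # stdio_client launches the server process and yields its read/write streams.
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]
```

On a Windows machine with the prerequisites in place, calling `asyncio.run(list_tool_names())` should print the tool names the server actually exposes, which is a quick way to check that the connection works before wiring up a full client.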
Windows-MCP offers a rich toolset to automate Windows interactions: click at coordinates, type text, scroll regions, move the mouse or drag, press keyboard shortcuts, wait for a duration, capture a comprehensive UI snapshot with optional DOM and vision enhancements, launch apps, execute PowerShell commands, scrape web pages, select or edit multiple items, access the clipboard, manage processes, display notifications, and read or modify Windows Registry values.
Windows-MCP operates with full system access and can perform irreversible actions. Review the security guidelines before deployment. To opt out of telemetry, set ANONYMIZED_TELEMETRY to "false" in the env block of the client configuration.
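Because the server can perform irreversible actions, one possible precaution is a confirmation gate on the client side. The sketch below is an illustration only: the action labels and the `guarded_call` helper are hypothetical, are not Windows-MCP tool names, and are not part of any security guideline shipped with the project.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical action labels chosen for this sketch; NOT Windows-MCP tool names.
IRREVERSIBLE_ACTIONS = {
    "run_powershell",
    "registry_write",
    "registry_delete",
    "kill_process",
}


def guarded_call(action: str, call: Callable[[], T], confirm: Callable[[str], bool]) -> T:
    """Run `call` only if the action is not flagged, or if `confirm` approves it."""
    if action in IRREVERSIBLE_ACTIONS and not confirm(action):
        raise PermissionError(f"action {action!r} was not confirmed")
    return call()
```

In an interactive client, `confirm` could prompt the user; in an unattended deployment it could simply return `False` so that flagged actions never run.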
Some operations rely on accessibility trees and may have limitations when selecting text inside individual paragraphs. Typing with the Type-Tool is designed for simple text input rather than full IDE coding. This MCP server is not intended for gaming.
The server can be started with the stdio transport or with a network transport. If you run from source, make sure the start command includes the path to the source directory.
Claude Desktop and other MCP clients can connect by adding an MCP server configuration that specifies the command and arguments to start Windows-MCP. The examples below show how to wire these connections in common clients.
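For clients that read a JSON file with an `mcpServers` map, the entry can also be added programmatically. The following is a minimal sketch that merges the entry from this document into an existing config file without touching other servers. The function name `add_server` is made up for this example, and the location of the config file depends on the client, so it is passed in rather than assumed.

```python
import copy
import json
from pathlib import Path

# Entry copied from the configuration shown in this document.
WINDOWS_MCP_ENTRY = {
    "command": "uvx",
    "args": ["windows-mcp"],
    "env": {"ANONYMIZED_TELEMETRY": "false"},
}


def add_server(config_path: Path, name: str = "cursortouch-windows-mcp") -> dict:
    """Merge the Windows-MCP entry into a client config file, keeping other servers."""
    config = {}
    if config_path.exists():
        # Treat an empty file as an empty JSON object.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    config.setdefault("mcpServers", {})[name] = copy.deepcopy(WINDOWS_MCP_ENTRY)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Restart the client after editing its config so that it picks up the new server entry.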
In local mode, run the server directly on your machine and connect your MCP client to it over the default stdio transport. In remote mode, set the environment variables that point to the cloud-hosted Windows VM and supply your API key.
The following connection configurations illustrate how to start Windows-MCP with common MCP clients. Use these exact commands in your client setup to establish a stable connection.
Click on the screen at the given coordinates.
Type text on an element, with optional clearing of existing text.
Scroll vertically or horizontally within the window or a region.
Move the mouse or perform drag actions to coordinates.
Press keyboard shortcuts such as Ctrl+C or Alt+Tab.
Pause execution for a defined duration.
Capture a comprehensive state snapshot including UI elements and a screenshot; supports DOM mode for web content and vision mode for screenshots.
Launch an application from the Start Menu, resize/move windows, and switch between apps.
Execute PowerShell commands.
Scrape information from a webpage.
Select multiple items with optional Ctrl key.
Enter text into multiple input fields at specified coordinates.
Read or set Windows clipboard contents.
List running processes or terminate them by PID or name.
Send Windows toast notifications.
Read, write, delete, or list Windows Registry values and keys.
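Under the hood, an MCP client invokes any of the tools above by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request. The tool name `example-click-tool` and its `x`/`y` arguments are placeholders invented for illustration; the real names and argument schemas should be taken from the server's `tools/list` response.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request for the given tool and arguments."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Placeholder tool name and arguments; not taken from Windows-MCP.
example = tool_call_request("example-click-tool", {"x": 100, "y": 200})
```

Most clients and SDKs build this message for you, so the sketch is mainly useful for reading protocol logs or debugging a transport.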