
Linux Desktop MCP Server

Provides keyboard/mouse automation for Linux desktop applications via AT-SPI2, enabling semantic element targeting and natural language searches.

Installation
Add the following to your MCP client configuration file.

{
  "mcpServers": {
    "beckhamlabsllc-linux-desktop-mcp": {
      "command": "linux-desktop-mcp",
      "args": []
    }
  }
}

The server runs locally and brings Chrome-extension-like semantic element targeting to native Linux desktop applications. It uses AT-SPI2 to discover UI elements, detect their roles and states, and perform actions such as clicking and typing across X11, Wayland, and XWayland. This lets you automate GTK, Qt, and Electron apps and search for UI components using natural language.

How to use

Install and run a local MCP server to automate desktop applications. You will connect an MCP client to the server and use high-level actions such as taking a snapshot of the UI tree, finding elements by natural language, and performing click or type operations on specific elements.
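As a rough illustration of that workflow, the sketch below builds the JSON-RPC 2.0 `tools/call` requests an MCP client would send for a snapshot, a search, a click, and a type action. It only constructs the messages; a real client also performs the MCP `initialize` handshake first and writes each message to the server's standard input. The argument names (`query`, `ref`, `text`) are assumptions made for illustration and are not confirmed by this page, so check the schemas the server reports through `tools/list`.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_ids = itertools.count(1)


def tool_call(name, arguments=None):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Tool names come from this page; the argument names are hypothetical.
requests = [
    tool_call("desktop_snapshot"),
    tool_call("desktop_find", {"query": "save button"}),
    tool_call("desktop_click", {"ref": "ref_1"}),
    tool_call("desktop_type", {"text": "hello"}),
]

for request in requests:
    # One JSON object per line, ready to be written to the server's stdin.
    print(json.dumps(request))
```

In practice an MCP client library handles this framing for you; the sketch is only meant to show the order of operations: snapshot or find first, then act on the returned reference.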

How to install

Before you start, you need Python and the system packages for accessibility and input simulation.

# System dependencies (Ubuntu/Debian)
sudo apt install python3-pyatspi gir1.2-atspi-2.0 at-spi2-core

# For X11 input simulation
sudo apt install xdotool

# For Wayland input simulation (recommended)
# Install ydotool from source or your package manager
# Then start the daemon:
sudo ydotoold &

Configuration

In your MCP client settings, register the server by specifying the command that starts it.

{
  "mcpServers": {
    "linux_desktop": {
      "command": "linux-desktop-mcp"
    }
  }
}

Additional configuration for source installations

If you installed from source, use the Python module start option in your MCP client settings.

{
  "mcpServers": {
    "linux_desktop": {
      "command": "python",
      "args": ["-m", "linux_desktop_mcp"]
    }
  }
}
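
If you prefer to script the change rather than edit the file by hand, the following is a minimal sketch that adds or updates one entry under `mcpServers` while leaving other entries untouched. The helper name `add_server`, the entry name `linux_desktop_src`, and the demo file name are hypothetical; the location of the real configuration file depends on your MCP client.

```python
import json
import tempfile
from pathlib import Path


def add_server(config_path, name="linux_desktop",
               command="linux-desktop-mcp", args=None):
    """Add or update one entry under "mcpServers", keeping other entries."""
    path = Path(config_path)
    # Start from the existing config if the file is already there.
    config = json.loads(path.read_text()) if path.exists() else {}
    entry = {"command": command}
    if args:
        entry["args"] = list(args)
    config.setdefault("mcpServers", {})[name] = entry
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file; point config_path at your client's file instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp_config.json"
    add_server(demo_path)  # packaged install
    result = add_server(   # source install, registered under a second name
        demo_path,
        name="linux_desktop_src",
        command="python",
        args=["-m", "linux_desktop_mcp"],
    )
    print(json.dumps(result, indent=2))
```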

Available tools and common actions

The core tools are described under Available tools below. The tools return references to UI elements and their current states so you can perform precise actions.

Troubleshooting

If AT-SPI2 cannot be found or the accessibility registry is not running, install the system packages listed under How to install and make sure accessibility is enabled in your desktop environment. On Wayland, make sure the ydotool daemon (ydotoold) is running in the background before the server tries to send input.
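
A quick way to narrow down such problems is to check what is actually installed. The sketch below is a diagnostic helper under stated assumptions, not part of the server: it reads the standard `XDG_SESSION_TYPE` environment variable, looks for the input tools on `PATH`, and checks whether the `pyatspi` module is importable. The preference order in `pick_input_backend` is an assumption that mirrors the recommendation on this page; the server's own selection logic may differ.

```python
import importlib.util
import os
import shutil


def pick_input_backend(session_type, available):
    """Return the input tool to prefer, or None if none is installed.

    Assumed order: ydotool first on Wayland, xdotool first elsewhere.
    """
    if session_type == "wayland":
        order = ["ydotool", "xdotool"]
    else:
        order = ["xdotool", "ydotool"]
    for tool in order:
        if tool in available:
            return tool
    return None


def detect():
    """Collect the session type, installed input tools, and pyatspi status."""
    session = os.environ.get("XDG_SESSION_TYPE", "").lower()
    tools = ("xdotool", "ydotool", "ydotoold")
    available = {tool for tool in tools if shutil.which(tool)}
    has_pyatspi = importlib.util.find_spec("pyatspi") is not None
    return session, available, has_pyatspi


session, available, has_pyatspi = detect()
print("session type:", session or "unknown")
print("input tools found:", sorted(available) or "none")
print("pyatspi importable:", has_pyatspi)
print("suggested backend:", pick_input_backend(session, available))
```

Note that this only checks whether `ydotoold` is installed, not whether the daemon is currently running.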

Platform notes

The MCP server works with AT-SPI2-capable applications across X11, Wayland, and XWayland. It can detect roles like buttons and text fields, track focus and editability, and perform clicks or keystrokes through supported input backends.

Architecture

The system is designed with a protocol layer that communicates via JSON-RPC over standard input/output, a reference manager that maps element references (ref_1, ref_2, …), an AT-SPI2 backend for discovery, and input backends such as ydotool and xdotool to simulate user actions.
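
To make the reference manager's role concrete, here is a minimal sketch of how such a component could hand out `ref_1`, `ref_2`, … and resolve them later. The class and method names are hypothetical, and the behaviour on `clear` (dropping old references and restarting the numbering) is an assumption; the server's actual implementation may differ.

```python
class ReferenceManager:
    """Map short string references (ref_1, ref_2, ...) to element objects."""

    def __init__(self):
        self._elements = {}
        self._next_id = 1

    def register(self, element):
        """Store an element and return the reference assigned to it."""
        ref = f"ref_{self._next_id}"
        self._next_id += 1
        self._elements[ref] = element
        return ref

    def resolve(self, ref):
        """Return the element for a reference, or raise if it is unknown."""
        try:
            return self._elements[ref]
        except KeyError:
            raise KeyError(f"unknown or stale reference: {ref}") from None

    def clear(self):
        """Forget all references, for example before taking a new snapshot."""
        self._elements.clear()
        self._next_id = 1


# Plain dicts stand in for AT-SPI2 accessible objects in this demo.
manager = ReferenceManager()
save_ref = manager.register({"role": "push button", "name": "Save"})
search_ref = manager.register({"role": "text", "name": "Search"})
print(save_ref, search_ref)
print(manager.resolve(save_ref)["name"])
```

Keeping the mapping on the server side means the client only ever passes short strings back and forth, while the server retains the live accessibility objects.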

Privacy and security

All automation runs locally on your machine and does not transmit data externally. Access is limited to the UI elements you target, and credentials are not stored by the MCP server.

Notes

Ensure accessibility is enabled in your desktop environment and that applications expose their UI structure to AT-SPI2 for best results.

Available tools

desktop_snapshot

Capture the accessibility tree with semantic element references for an application, returning a hierarchical view of elements with refs like ref_1, ref_2, etc.

desktop_find

Find elements by natural language queries such as "save button" or "search field" and receive matching elements with references and actions.
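
This page does not describe how queries are matched, so the following is only an illustrative sketch of one simple approach: score each element by the fraction of query words that appear in its role or name, then return the references in descending score order. The function names, the threshold, and the sample elements are all made up for the example.

```python
def score(query, element):
    """Fraction of query words found among the element's role and name words."""
    words = query.lower().split()
    if not words:
        return 0.0
    haystack = f"{element['role']} {element['name']}".lower().split()
    return sum(word in haystack for word in words) / len(words)


def find(query, elements, threshold=0.5):
    """Return refs of elements scoring at least `threshold`, best first."""
    scored = [(score(query, element), element) for element in elements]
    hits = [(s, element) for s, element in scored if s >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [element["ref"] for _, element in hits]


# Made-up elements in the shape a snapshot might provide.
elements = [
    {"ref": "ref_1", "role": "push button", "name": "Save"},
    {"ref": "ref_2", "role": "text", "name": "Search"},
    {"ref": "ref_3", "role": "push button", "name": "Cancel"},
]

print(find("save button", elements))
print(find("search field", elements))
```

With the default threshold, "save button" ranks the Save button first and still lists the Cancel button as a weaker match, because both are buttons; raising the threshold would keep only the exact match.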

desktop_click

Click an element by reference or coordinates, with options for mouse button, click type, and modifiers.

desktop_type

Type text into a focused element, with options to clear first and optionally submit (press Enter) after typing.

desktop_key

Press keyboard keys or shortcuts with optional modifier keys.

desktop_capabilities

Check available automation capabilities of the MCP server.