
AutoMac MCP Server

Provides experimental macOS UI automation via an MCP server with input control and screen comprehension.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "digithree-automac-mcp": {
      "command": "/path/to/automac-mcp/.venv/bin/python",
      "args": [
        "/path/to/automac-mcp/automac_mcp.py"
      ]
    }
  }
}

AutoMac MCP is a Python-based MCP server that lets an AI assistant control and automate the macOS UI on your local machine, subject to the accessibility permissions you grant. It exposes a standardized interface for UI actions and screen understanding, enabling hands-free automation of macOS for controlled testing and experimentation.

How to use

Connect a compatible MCP client (such as Claude Desktop) to the AutoMac MCP server to start automating the macOS UI. The server accepts tool requests such as mouse movements, clicks, keyboard input, scrolling, and screen queries, then returns results that your AI can use to decide its next steps. Be mindful of permissions and confirmation prompts to keep automation safe.
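To make the request flow concrete, here is a minimal sketch of the message an MCP client sends when it invokes a tool. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The helper name `make_tool_call` is hypothetical, and the argument names `x` and `y` are assumptions, since this page does not list the tools' parameter schemas.

```python
import json
from itertools import count

# Simple incrementing request ids, as JSON-RPC requires an id per request.
_request_ids = count(1)


def make_tool_call(name, arguments=None):
    """Build a JSON-RPC 2.0 `tools/call` request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Argument names "x" and "y" are assumed for illustration only.
request = make_tool_call("mouse_single_click", {"x": 640, "y": 400})
print(json.dumps(request, indent=2))
```

In practice your MCP client builds and sends these messages for you; the sketch only shows what travels over the wire.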

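How to install

Add the server entry below to your MCP client configuration, replacing /path/to/automac-mcp with the directory where AutoMac MCP is installed. The entry name (automac_mcp here) is a label for the server in your client.
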
{
  "mcpServers": {
    "automac_mcp": {
      "command": "/path/to/automac-mcp/.venv/bin/python",
      "args": ["/path/to/automac-mcp/automac_mcp.py"]
    }
  }
}
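If your client configuration already lists other servers, the entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dict. The function name `add_automac_server` and the sample `other` server are hypothetical, and reading or writing the actual config file is left out because its location depends on your client.

```python
import json
from pathlib import PurePosixPath


def add_automac_server(config, install_dir, server_name="automac_mcp"):
    """Return a copy of `config` with an AutoMac entry under "mcpServers"."""
    root = PurePosixPath(install_dir)
    updated = dict(config)
    # Copy the existing servers so the caller's dict is not modified.
    servers = dict(updated.get("mcpServers", {}))
    servers[server_name] = {
        "command": str(root / ".venv" / "bin" / "python"),
        "args": [str(root / "automac_mcp.py")],
    }
    updated["mcpServers"] = servers
    return updated


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": []}}}
merged = add_automac_server(existing, "/path/to/automac-mcp")
print(json.dumps(merged, indent=2))
```

Returning a copy keeps the original configuration intact, so you can compare the two before saving anything.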

Additional sections

Setup follows four steps: install the required tooling, add the MCP server to your client configuration, grant macOS accessibility permissions, and restart your client to begin automation.

Security and safety notes

This experimental setup requires explicit macOS accessibility permissions and relies on your AI client's confirmation prompts to prevent unintended actions. Use it in controlled environments for research, and monitor automations closely.

Prompting tips

Be explicit about targets, such as which application to focus. After acting in another app, ask the assistant to switch back to your MCP client so you can verify the results.

Development status and roadmap

Core MCP server features include input control, screen comprehension via accessibility APIs, and OCR-driven text reading. Ongoing work aims at more granular UI detection, advanced interactions, multi-monitor support, and improved visual feedback.

Case study overview

A full example describes opening a Steam wishlist, selecting affordable items, adding to cart, and completing a purchase. This demonstrates end-to-end automation capabilities and how an AI agent can drive UI actions in a real-world workflow.

Available tools

get_screen_size

Return the current screen width and height to help plan coordinates for input actions.
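As an illustration of planning coordinates from the screen size, the sketch below maps a fractional position (0.0 to 1.0 on each axis) to pixel coordinates and clamps the result so it stays on screen. The helper `to_pixels` is hypothetical and not part of the server; the width and height values stand in for whatever get_screen_size reports.

```python
def to_pixels(fx, fy, width, height):
    """Map fractional screen positions to pixel coordinates, clamped to the screen."""
    x = min(max(round(fx * width), 0), width - 1)
    y = min(max(round(fy * height), 0), height - 1)
    return x, y


# Example with an assumed 1440x900 screen: the center of the display.
print(to_pixels(0.5, 0.5, 1440, 900))  # (720, 450)
```

Working in fractions keeps a plan portable across displays with different resolutions.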

mouse_move

Move the mouse pointer to the specified (x, y) coordinates.

mouse_single_click

Perform a single left-click at the given coordinates.

mouse_double_click

Perform a double-click at the given coordinates.

type_text

Type a string of text into the currently focused input area.

scroll

Scroll the screen by a pixel delta along the x and y axes.

keyboard_shortcut_return_key

Press the Return/Enter key.

keyboard_shortcut_escape_key

Press the Escape key.

keyboard_shortcut_tab_key

Press the Tab key.

keyboard_shortcut_space_key

Press the Space key.

keyboard_shortcut_delete_key

Press the Delete/Backspace key.

keyboard_shortcut_forward_delete_key

Press the Forward Delete key.

keyboard_shortcut_arrow_up

Press the Up Arrow key.

keyboard_shortcut_arrow_down

Press the Down Arrow key.

keyboard_shortcut_arrow_left

Press the Left Arrow key.

keyboard_shortcut_arrow_right

Press the Right Arrow key.

keyboard_shortcut_select_all

Select all text (Cmd+A).

keyboard_shortcut_copy

Copy selected content (Cmd+C).

keyboard_shortcut_paste

Paste from clipboard (Cmd+V).

keyboard_shortcut_cut

Cut selected content (Cmd+X).

keyboard_shortcut_undo

Undo last action (Cmd+Z).

keyboard_shortcut_redo

Redo last undone action (Cmd+Shift+Z).

keyboard_shortcut_save

Save current document (Cmd+S).

keyboard_shortcut_new

Create new document (Cmd+N).

keyboard_shortcut_open

Open document (Cmd+O).

keyboard_shortcut_find

Find in document (Cmd+F).

keyboard_shortcut_close_window

Close current window (Cmd+W).

keyboard_shortcut_quit_app

Quit current application (Cmd+Q).

keyboard_shortcut_minimize_window

Minimize current window (Cmd+M).

keyboard_shortcut_hide_app

Hide current application (Cmd+H).

keyboard_shortcut_switch_app_forward

Switch to next application (Cmd+Tab).

keyboard_shortcut_switch_app_backward

Switch to previous application (Cmd+Shift+Tab).

keyboard_shortcut_spotlight_search

Open Spotlight search (Cmd+Space).

keyboard_shortcut_force_quit

Open Force Quit dialog (Cmd+Option+Esc).

keyboard_shortcut_refresh

Refresh/Reload (Cmd+R).
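Because each shortcut is its own tool, an agent or script may want a lookup from a key combination to the matching tool name. The sketch below covers a subset of the shortcuts listed above; the table `SHORTCUT_TOOLS` and the helper `tool_for_shortcut` are hypothetical conveniences, not part of the server.

```python
# Key combinations (order-insensitive) mapped to the tool names listed above.
SHORTCUT_TOOLS = {
    frozenset({"cmd", "a"}): "keyboard_shortcut_select_all",
    frozenset({"cmd", "c"}): "keyboard_shortcut_copy",
    frozenset({"cmd", "v"}): "keyboard_shortcut_paste",
    frozenset({"cmd", "x"}): "keyboard_shortcut_cut",
    frozenset({"cmd", "z"}): "keyboard_shortcut_undo",
    frozenset({"cmd", "shift", "z"}): "keyboard_shortcut_redo",
    frozenset({"cmd", "s"}): "keyboard_shortcut_save",
    frozenset({"cmd", "tab"}): "keyboard_shortcut_switch_app_forward",
    frozenset({"cmd", "shift", "tab"}): "keyboard_shortcut_switch_app_backward",
}


def tool_for_shortcut(*keys):
    """Return the tool name for a key combination, or None if not in the table."""
    return SHORTCUT_TOOLS.get(frozenset(k.lower() for k in keys))


print(tool_for_shortcut("Cmd", "Shift", "Z"))  # keyboard_shortcut_redo
```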

get_screen_layout

Get window and app layout information via macOS accessibility APIs.

get_screen_text

Read on-screen text using OCR with positioning data.
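One common use of OCR positioning data is to find a label on screen and click its center. The sketch below assumes each OCR entry is a dict with `text`, `x`, `y`, `width`, and `height` keys in pixels; that shape is an assumption for illustration, as the actual output format of get_screen_text is not documented here. The helper `find_text_center` is hypothetical.

```python
def find_text_center(entries, needle):
    """Return the center (x, y) of the first entry containing `needle`, else None."""
    needle = needle.lower()
    for entry in entries:
        if needle in entry["text"].lower():
            cx = entry["x"] + entry["width"] // 2
            cy = entry["y"] + entry["height"] // 2
            return cx, cy
    return None


# Made-up sample data in the assumed shape.
sample = [
    {"text": "File", "x": 10, "y": 0, "width": 40, "height": 20},
    {"text": "Add to Cart", "x": 600, "y": 420, "width": 120, "height": 30},
]
print(find_text_center(sample, "add to cart"))  # (660, 435)
```

The returned point could then be passed to mouse_single_click, after checking that the match is the element you intended.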

focus_app

Bring a specific application to the foreground, with optional timeout.
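A timeout on a focus request usually means polling until the app is frontmost or a deadline passes. The sketch below shows that generic pattern; it is not the server's implementation, and the helper `wait_until` and its parameters are hypothetical.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated check that succeeds on the third poll.
results = iter([False, False, True])
print(wait_until(lambda: next(results), timeout=1.0, interval=0.01))  # True
```

Using `time.monotonic()` avoids surprises if the system clock changes while waiting.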

get_available_apps

List all currently running applications.

play_sound_for_user_prompt

Play a system alert sound to signal that a prompt needs the user's attention.