Mobile MCP Server
Provides a unified API to automate native iOS/Android apps and devices via structured accessibility data or coordinate-based interactions.
Configuration
{
  "mcpServers": {
    "mobile_mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@mobilenext/mobile-mcp@latest"
      ]
    }
  }
}

Mobile Next MCP Server provides a unified interface to automate native iOS and Android apps and devices, enabling scalable mobile automation across simulators, emulators, and real devices with a platform-agnostic API.
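If your MCP client already has a configuration file with other servers in it, the entry can be merged in rather than pasted over the whole file. The following is a minimal sketch under stated assumptions: the helper names `with_mobile_mcp` and `update_config_file` are hypothetical, and the location of the config file depends on your MCP client, so no path is assumed here.

```python
import copy
import json
from pathlib import Path

# Server entry taken from the configuration shown above.
MOBILE_MCP_ENTRY = {
    "command": "npx",
    "args": ["-y", "@mobilenext/mobile-mcp@latest"],
}


def with_mobile_mcp(config: dict, name: str = "mobile-mcp") -> dict:
    """Return a copy of `config` with the mobile-mcp entry added.

    Existing entries under "mcpServers" are preserved; an entry that
    already uses `name` is overwritten.
    """
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})[name] = copy.deepcopy(MOBILE_MCP_ENTRY)
    return merged


def update_config_file(path: Path, name: str = "mobile-mcp") -> None:
    """Read a client config file (hypothetical location), merge, write back."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    path.write_text(
        json.dumps(with_mobile_mcp(config, name), indent=2) + "\n",
        encoding="utf-8",
    )
```

Keeping the merge in a pure function makes it easy to check the result before anything is written to disk.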
Connect your MCP client or AI assistant to the Mobile Next MCP Server to automate native mobile apps and device interactions. Use the available MCP tools to list devices, launch and manage apps, take and save screenshots, interact with on-screen elements by coordinates, and simulate user actions such as taps, swipes, and text entry. The server works across iOS and Android, providing a consistent workflow for testing, data entry, and complex multi-step journeys driven by an LLM.
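For readers curious what "connecting" involves on the wire, MCP clients talk to a server using JSON-RPC 2.0 messages. The sketch below only builds the opening messages a client would send over the stdio transport; it does not launch the server. The client name, the version strings, and the helper names are illustrative assumptions, and in practice an MCP client or SDK handles this exchange for you.

```python
import json

# One published MCP protocol revision; the server may negotiate another.
PROTOCOL_VERSION = "2024-11-05"


def request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request (the server is expected to reply)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def notification(method, params=None):
    """Build a JSON-RPC 2.0 notification (no id, so no reply)."""
    message = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        message["params"] = params
    return message


def handshake_messages(client_name="example-client", client_version="0.1.0"):
    """Messages a client sends to initialize a session and list tools.

    A real client waits for the initialize response before sending the
    "notifications/initialized" notification.
    """
    return [
        request(1, "initialize", {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        }),
        notification("notifications/initialized"),
        request(2, "tools/list"),
    ]


def encode_line(message):
    """Serialize one message for stdio: a single JSON object per line."""
    return json.dumps(message, separators=(",", ":")) + "\n"
```

The response to `tools/list` is where a client learns the exact tool names and argument schemas the server exposes.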
{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}

Prerequisites include development toolchains and runtimes to connect to iOS and Android devices. Ensure you have Xcode command line tools, Android Platform Tools, and a recent Node.js installation. The server can run in headless mode on simulators or emulators when no real device is connected.
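A quick way to confirm the prerequisites are installed is to look for their command-line entry points on `PATH`. This is a rough sketch, not an official check: it assumes `node` and `npx` indicate Node.js, `adb` indicates Android Platform Tools, and `xcrun` indicates the Xcode command line tools (expected only on macOS). The function name `check_prerequisites` is hypothetical.

```python
import shutil

# Executables assumed to signal that each prerequisite is installed.
PREREQUISITES = {
    "Node.js": ["node", "npx"],
    "Android Platform Tools": ["adb"],
    "Xcode command line tools": ["xcrun"],
}


def check_prerequisites(which=shutil.which) -> dict[str, list[str]]:
    """Return, per prerequisite, the executables not found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    An empty dict means every executable was found.
    """
    missing = {}
    for label, executables in PREREQUISITES.items():
        not_found = [exe for exe in executables if which(exe) is None]
        if not_found:
            missing[label] = not_found
    return missing


if __name__ == "__main__":
    for label, executables in check_prerequisites().items():
        print(f"{label}: missing {', '.join(executables)}")
```

Finding an executable on `PATH` does not verify its version, so a "recent Node.js installation" still needs to be checked separately.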
Tools
List all available devices, including simulators, emulators, and real devices.
Return the screen size in pixels for the connected device.
Return the current screen orientation (portrait or landscape).
Set the device orientation to portrait or landscape.
List all installed apps on the target device.
Launch an app by its package name or bundle ID.
Terminate a running app on the device.
Install an app from a file (apk/ipa/app/zip).
Uninstall an app by bundle ID or package name.
Capture a screenshot of the current screen.
Save a screenshot to a file for later inspection.
Enumerate UI elements on screen with coordinates and properties.
Click at a specific x,y coordinate on the screen.
Perform a double-tap at given coordinates.
Long-press at specific coordinates.
Swipe in a direction (up, down, left, right) across the screen.
Type text into the focused element with optional submit behavior.
Press device buttons such as HOME, BACK, VOLUME_UP/DOWN, ENTER, etc.
Open a URL in the device browser.
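Several of the tools above combine naturally: enumerate the on-screen elements, pick one, and tap its center. The sketch below illustrates that flow under loud assumptions. The element shape (`label` plus a `rect` with `x`, `y`, `width`, `height`), the tool name `example_tap_tool`, and its `x`/`y` arguments are all placeholders; the real names and schemas come from the server's tool listing. Only the `tools/call` envelope follows the MCP JSON-RPC convention.

```python
# Placeholder tool name: substitute the name reported by the server.
TAP_TOOL = "example_tap_tool"


def find_element(elements, label):
    """Return the first element whose (assumed) "label" matches, else None."""
    for element in elements:
        if element.get("label") == label:
            return element
    return None


def center_of(rect):
    """Center point of a rectangle given as x, y, width, height in pixels."""
    return rect["x"] + rect["width"] // 2, rect["y"] + rect["height"] // 2


def tools_call(request_id, tool_name, arguments):
    """Build an MCP "tools/call" request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def tap_request(request_id, elements, label):
    """Build a tap request aimed at the center of the labeled element."""
    element = find_element(elements, label)
    if element is None:
        raise LookupError(f"no element labeled {label!r}")
    x, y = center_of(element["rect"])
    return tools_call(request_id, TAP_TOOL, {"x": x, "y": y})
```

Tapping the center of an element's bounds is a common heuristic because it tolerates small layout differences between devices better than hard-coded coordinates.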