Provides a cloud-based agentic MCP server to delegate desktop tasks, manage sandboxes, and obtain text-based summaries and screen descriptions.
Configuration
{
  "mcpServers": {
    "taskcrew-cua-mcp-server": {
      "url": "https://cua-mcp-server.vercel.app/mcp",
      "headers": {
        "CUA_MODEL": "claude-opus-4-5",
        "CUA_API_KEY": "<YOUR_CUA_API_KEY>",
        "CUA_API_BASE": "https://api.cua.ai",
        "ANTHROPIC_API_KEY": "sk-...",
        "BLOB_READ_WRITE_TOKEN": "<BLOB_READ_WRITE_TOKEN>"
      }
    }
  }
}
You can delegate desktop automation tasks to an autonomous agent that runs inside cloud-provisioned sandboxes. This MCP server handles task delegation, screen description, and progressive task execution while ensuring images stay in the sandbox and only text summaries are returned. It enables creating and managing sandboxes, describing current screens, and running complex workflows end-to-end.
Connect to the MCP server from your MCP client or integration. You can run tasks that automate desktop actions inside a sandbox, request a text-based description of the current screen, and poll task progress to see what the agent is doing. A typical flow is to start a sandbox, issue a task such as opening an application and navigating to a URL, and then query the screen state to verify the result. All actions are performed inside the sandbox, and the final output is a concise summary of what was achieved.
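As a rough sketch of what a client sends, MCP tools are invoked with JSON-RPC 2.0 "tools/call" requests against the server's /mcp endpoint. The tool name "run_task" and its argument names below are illustrative assumptions, not documented names; query the server's tool listing for the real schema.

```python
import json

def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 "tools/call" request body, as used by MCP.

    The tool name and argument keys are caller-supplied; nothing here is
    specific to this server.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)

# Hypothetical delegation of a desktop task to the agent.
payload = build_tool_call(
    "run_task",
    {"task": "Open the browser and navigate to https://example.com"},
)
print(payload)
```

The resulting string would be POSTed to the server URL with the headers from the configuration above.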
Prerequisites: Node.js and npm installed on your system, plus terminal or browser access for verification. You also need a CUA API key, and an Anthropic API key if you plan to run screen images through the vision model.
Step-by-step setup for using an MCP client with this server:
1) Obtain a CUA API key from your account and an Anthropic API key for vision processing.
2) Add your MCP server URL to your client configuration as shown in the configuration example above.
The quick-start configuration points your client at the MCP server for remote operation. You can also run the MCP client locally via standard package tooling if you are integrating into an automated workflow.
Key concepts include: starting and listing sandboxes, describing the current screen state with vision, and running autonomous tasks that report back a task_id and a summary.
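The task lifecycle above (submit a task, receive a task_id, poll until done) can be sketched as follows. The functions submit_task and poll_task are stand-ins for the real tool calls, stubbed here to simulate a three-step task so the control flow is visible without a live server.

```python
import itertools

# Simulated step counter standing in for server-side task progress.
_steps = itertools.count(1)

def submit_task(task: str) -> str:
    """Stub: the real server returns a task_id for polling."""
    return "task-123"

def poll_task(task_id: str) -> dict:
    """Stub: report "running" with a step number twice, then "done"."""
    step = next(_steps)
    if step < 3:
        return {"status": "running", "current_step": step}
    return {"status": "done", "summary": "Opened the app and navigated to the URL."}

def run_to_completion(task: str) -> str:
    """Submit a task, then poll until it completes and return the summary."""
    task_id = submit_task(task)
    while True:
        progress = poll_task(task_id)
        if progress["status"] == "done":
            return progress["summary"]
        # A real client would sleep between polls to avoid hammering the server.

summary = run_to_completion("Open the browser and go to example.com")
print(summary)
```

Only the text summary crosses the boundary back to the client; screenshots stay in the sandbox.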
If you encounter a missing API key error, set CUA_API_KEY in your environment or pass it via the X-CUA-API-Key header in requests.
If the vision component prompt returns an error, ensure your ANTHROPIC_API_KEY is configured and that you have network access to the vision service.
Available tools:
- List all CUA cloud sandboxes with their current status
- Get details of a specific sandbox including API URLs
- Start a stopped sandbox
- Stop a running sandbox
- Restart a sandbox
- Get a text description of the current screen state using vision AI; no actions taken
- Execute a computer task autonomously and return a task_id for polling
- Poll progress of a running task, including current step and reasoning
- Retrieve results of a previously executed task by ID