Screen Text MCP Server

Provides screen capture and OCR capabilities: capture the full screen or a single application window, and extract text from the captured image via OCR.

Installation

Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "ikhide-screen-capture-mcp": {
      "command": "node",
      "args": [
        "path/to/mcp-screen-text/dist/index.js"
      ]
    }
  }
}

MCP Screen Text provides screen capture and optical character recognition (OCR) capabilities, enabling you to take screenshots of your desktop or specific applications and extract text from images. This makes it easy to automate text capture from your workspace and integrate it with other MCP clients.

How to use

You can access the Screen Text server through any MCP client that supports stdio transport. It exposes four core tools for capturing and reading text from your screen or from application windows: capture_screen takes a full-screen or display-specific capture, capture_application_screen grabs a single application's window, extract_text runs OCR on an existing image, and capture_screen_and_extract_text performs capture and OCR in one step. To extract text immediately after capturing, prefer capture_screen_and_extract_text and specify the OCR language (for example, eng for English or spa for Spanish). To see which applications are available for capture, use list_applications.
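As a rough illustration of what an MCP client sends over stdio, the sketch below builds JSON-RPC 2.0 `tools/call` requests for the tools named above. The `tools/call` method and its `name`/`arguments` params come from the MCP specification. The `language` argument key is an assumption made for illustration, since the server's actual input schema is not shown here, and `buildToolCall` is a hypothetical helper.

```typescript
// Tool names taken from the description above.
type ScreenTextTool =
  | "capture_screen"
  | "capture_application_screen"
  | "list_applications"
  | "extract_text"
  | "capture_screen_and_extract_text";

// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ScreenTextTool; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  id: number,
  name: ScreenTextTool,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// One-step capture plus OCR. The `language` key is an ASSUMED argument name.
const oneStep = buildToolCall(1, "capture_screen_and_extract_text", { language: "eng" });

// Discover capturable applications; this tool is assumed to take no arguments.
const listApps = buildToolCall(2, "list_applications");

// With stdio transport, each message is written as a single line of JSON.
const wireLines = [oneStep, listApps].map((req) => JSON.stringify(req));
```

A real client also performs the `initialize` handshake before calling tools; an MCP client library normally handles that step, so this sketch only shows the request payloads.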

How to install

Prerequisites: ensure you have Node.js and npm installed on your machine.

```
npm install
```

Build the server, then either run it in development mode or start the built server.

```
npm run build
```

```
npm run dev
```

```
npm start
```

Usage with an MCP client

Configure your MCP client to connect to the Screen Text server using stdio transport. The following example configuration runs the server with Node and points to the built entry point; replace path/to with the location of your own checkout.

{
  "mcpServers": {
    "screen-text": {
      "command": "node",
      "args": ["path/to/mcp-screen-text/dist/index.js"]
    }
  }
}
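If you generate the client configuration programmatically, a sketch along these lines may help. It reproduces the structure of the example above; `buildServerConfig` is a hypothetical helper, the path is the same placeholder used in the example, and the default entry name `screen-text` is simply the key that example uses.

```typescript
// Shape of one entry under "mcpServers" in the example configuration.
interface StdioServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers: Record<string, StdioServerEntry>;
}

// Hypothetical helper: builds the config for a given path to the built dist/index.js.
function buildServerConfig(entryPoint: string, name = "screen-text"): McpClientConfig {
  if (entryPoint.trim() === "") {
    throw new Error("entryPoint must be a non-empty path to dist/index.js");
  }
  return { mcpServers: { [name]: { command: "node", args: [entryPoint] } } };
}

// Placeholder path, as in the example; substitute your own checkout location.
const config = buildServerConfig("path/to/mcp-screen-text/dist/index.js");

// Pretty-printed JSON, ready to paste into a client configuration file.
const configJson = JSON.stringify(config, null, 2);
```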

Available tools

capture_screen

Captures a screenshot of the entire screen or a specific display. Optional display index and image format (png or jpg) control the output.
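The optional parameters described above can be checked on the client side before a call is made. This is a minimal sketch: the argument names `display` and `format` are assumptions, since only their meaning (a display index and a png or jpg format) is stated here, and `validateCaptureArgs` is a hypothetical helper.

```typescript
// Hypothetical helper: returns a list of problems, empty when the arguments look valid.
// The keys `display` and `format` are ASSUMED names; the real schema may differ.
function validateCaptureArgs(args: { display?: unknown; format?: unknown }): string[] {
  const problems: string[] = [];

  // The display index is optional; when present it should be a non-negative integer.
  if (args.display !== undefined) {
    const d = args.display;
    if (typeof d !== "number" || !Number.isInteger(d) || d < 0) {
      problems.push("display must be a non-negative integer index");
    }
  }

  // The image format is optional; the description names png and jpg only.
  if (args.format !== undefined && args.format !== "png" && args.format !== "jpg") {
    problems.push('format must be "png" or "jpg"');
  }

  return problems;
}

// Valid arguments produce no problems.
const okProblems = validateCaptureArgs({ display: 0, format: "png" });

// A negative index and an unsupported format produce one problem each.
const badProblems = validateCaptureArgs({ display: -1, format: "gif" });
```

Returning a list of problems rather than throwing on the first one lets a caller report every invalid argument at once.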

capture_application_screen

Captures a screenshot of a specific application window by name. You can specify the target application and image format.

list_applications

Lists all running applications that can be captured, enabling you to choose targets for application-specific screenshots.

extract_text

Runs OCR on a provided image file to extract readable text. Supports multiple OCR languages, specified by codes such as eng, spa, or fra.

capture_screen_and_extract_text

Captures a screenshot (full screen or a specific application window) and performs OCR in one operation, returning both the image and the extracted text.
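Since this tool is described as returning both an image and text, a client typically needs to separate the two parts of the result. The sketch below assumes the server returns them as standard MCP `text` and `image` content items, whose shapes come from the MCP specification. Whether this server actually does so, rather than returning a file path for instance, is an assumption, and `splitResult` is a hypothetical helper.

```typescript
// Content item shapes as defined by the MCP specification for tool results.
type TextContent = { type: "text"; text: string };
type ImageContent = { type: "image"; data: string; mimeType: string }; // data is base64
type ContentItem = TextContent | ImageContent;

interface ToolResult {
  content: ContentItem[];
  isError?: boolean;
}

// Hypothetical helper: joins all text parts and picks the first image, if any.
function splitResult(result: ToolResult): { text: string; image: ImageContent | undefined } {
  if (result.isError) {
    throw new Error("Tool call reported an error");
  }
  const text = result.content
    .filter((item): item is TextContent => item.type === "text")
    .map((item) => item.text)
    .join("\n");
  const image = result.content.find(
    (item): item is ImageContent => item.type === "image",
  );
  return { text, image };
}

// Made-up sample result, for illustration only.
const sampleResult: ToolResult = {
  content: [
    { type: "text", text: "Hello from the screen" },
    { type: "image", data: "aGVsbG8=", mimeType: "image/png" },
  ],
};

const { text: extractedText, image: screenshot } = splitResult(sampleResult);
```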