OwlOCR MCP Server
MCP server enabling PDF and image OCR on macOS with OwlOCR CLI or Vision Framework backends.
Configuration
{
  "mcpServers": {
    "jangisaac-dev-owlocr-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/owlocr-mcp",
        "owlocr-mcp"
      ],
      "env": {
        "UV_DIRECTORY": "/path/to/owlocr-mcp"
      }
    }
  }
}
You can run an MCP server that provides OCR capabilities for PDFs and images on macOS, combining OwlOCR for higher accuracy with the Vision Framework for a zero-dependency option. This server lets you process documents asynchronously from MCP clients, automatically selecting the best available backend and delivering OCR results page by page or image by image.
To use the MCP server, install it on your macOS system and run it so your MCP clients can send OCR tasks. The server exposes two backends for OCR: OwlOCR CLI for higher accuracy and the built-in Vision Framework for no external dependencies. Your MCP clients can request PDF OCR or image OCR and specify languages as needed. The server handles page-by-page extraction and returns structured text with separators between pages.
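The page-by-page output described above can be pictured with a small sketch. The function name `join_pages` and the `--- Page N ---` separator format are assumptions made for illustration; the server's actual separator may look different.

```python
def join_pages(pages: list[str], first_page: int = 1) -> str:
    """Join per-page OCR text into one string, one labeled section per page."""
    parts = []
    for offset, text in enumerate(pages):
        # Assumed separator format; the real server may use another marker.
        header = f"--- Page {first_page + offset} ---"
        parts.append(f"{header}\n{text.strip()}")
    # A blank line between sections keeps page boundaries easy to spot.
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(join_pages(["First page text", "Second page text"]))
```

Keeping a separator per page lets a client split the result back into pages without a second OCR pass.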
Prerequisites you need before installing:
- macOS
- Python 3.11 or newer
- Optional: OwlOCR.app for improved accuracy
Follow these steps to install using the recommended method. If you want to use a local development workflow, you can run the server directly from your checkout using Python.
Configuration and workflow tips:
- You can verify which OCR backends are present on your system to choose the best option for accuracy and speed.
- The server supports two backends and auto-selects OwlOCR when it is available, otherwise falling back to the Vision Framework.
- You can test backends and compare their performance with the built-in benchmark tooling to decide which backend to rely on in production.
- If OwlOCR cannot access files outside its sandbox, the MCP server automatically copies them to a sandbox temp directory to avoid file picker prompts.
Troubleshooting commonly reported issues:
- If OwlOCR.app is not found, install OwlOCR from the official site or fall back to the Vision backend.
- If a file picker dialog appears, this is caused by sandbox restrictions; the MCP server copies files to the sandbox automatically.
- For language-specific accuracy issues with Vision, explicitly specify language codes such as ["ko-KR", "en-US"] or other needed languages when OCR-ing PDFs.
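The sandbox workaround mentioned in the file picker item amounts to staging a copy of the input where the sandboxed app can read it. The sketch below shows that idea under stated assumptions: `stage_for_sandbox` is a hypothetical helper, and the sandbox temp directory is passed in by the caller because its real location is not documented here.

```python
import shutil
import tempfile
from pathlib import Path


def stage_for_sandbox(src: Path, sandbox_tmp: Path) -> Path:
    """Copy src into a fresh subdirectory of sandbox_tmp and return the copy's path."""
    sandbox_tmp.mkdir(parents=True, exist_ok=True)
    # A unique staging directory avoids clashes between files with the same name.
    staging_dir = Path(tempfile.mkdtemp(dir=sandbox_tmp))
    dest = staging_dir / src.name
    shutil.copy2(src, dest)  # copy2 also preserves timestamps where possible
    return dest
```

The caller would hand the returned path to the OCR backend and delete the staging directory once the result is back.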
# Option 1: Use uv to run the MCP server from a directory you provide
uv run --directory /path/to/owlocr-mcp owlocr-mcp
# Option 2: Run the MCP server via Python from a virtual environment
/path/to/owlocr-mcp/.venv/bin/python -m owlocr_mcp.server
Available tools:
- Extract text from a PDF file, with options for page selection, DPI, backend, and languages.
- Extract text from a single image file using the configured OCR backend.
- Check which OCR backends are available on the system and report their status.
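MCP clients invoke a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for the PDF tool. The tool name `ocr_pdf` and the argument names `path`, `languages`, and `dpi` are hypothetical placeholders, since this page does not list the real identifiers; check the server's tool listing before relying on them.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool and argument names, used only to show the message shape.
request = build_tool_call(
    1,
    "ocr_pdf",
    {"path": "/path/to/document.pdf", "languages": ["ko-KR", "en-US"], "dpi": 200},
)

if __name__ == "__main__":
    print(request)
```

In practice an MCP client library builds and sends this message for you; the raw form is mainly useful when debugging what reaches the server.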