MokuPDF MCP Server

MokuPDF MCP Server provides PDF text/image extraction, OCR, page-by-page processing, and smart search for AI workflows.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "jameslovespancakes-mokupdf": {
      "command": "python",
      "args": [
        "-m",
        "mokupdf",
        "--base-dir",
        "./documents",
        "--max-file-size",
        "200"
      ]
    }
  }
}

MokuPDF is an MCP-compatible server that lets AI applications read, search, and process PDF files, with text extraction, image extraction, and OCR. It supports intelligent file discovery and page-by-page processing for large documents, and it integrates with any MCP client.

How to use

Connect to MokuPDF from your MCP client, then use the MCP tools to open a PDF, read pages with their text and images, search within the current document, and fetch the text of a specific page or the document metadata. Close the document when you are done to free memory.
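The workflow above can be sketched as a sequence of MCP `tools/call` requests, which is the JSON-RPC method MCP clients use to invoke a tool. This is a minimal sketch, not MokuPDF's documented interface: the tool names come from this page, but every argument name (`file_path`, `query`, `case_sensitive`, `page_number`) and the file `report.pdf` are assumptions made for illustration. Check the server's tool schemas for the real parameters.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(name, arguments=None):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Argument names below are hypothetical; only the tool names are from the page.
workflow = [
    tool_call("open_pdf", {"file_path": "report.pdf"}),
    tool_call("search_text", {"query": "revenue", "case_sensitive": False}),
    tool_call("get_page_text", {"page_number": 3}),
    tool_call("get_metadata"),
    tool_call("close_pdf"),  # release the document when finished
]

for request in workflow:
    print(json.dumps(request))
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows the order of calls and the shape of each request.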

How to install

Prerequisites: Python installed on your system and a working MCP client to communicate with the server.

# Clone the project repository
git clone https://github.com/jameslovespancakes/mokupdf.git
cd mokupdf

# Install the package
pip install .

# Or install in development mode
pip install -e .

# Optional: install with OCR support (requires Tesseract)
# Quote the extra so shells such as zsh do not expand the brackets
pip install "mokupdf[ocr]"

MCP configuration

Add MokuPDF to your MCP configuration so that your MCP client can launch and use it. The example below runs MokuPDF as a local (stdio) server using Python.

{
  "mcpServers": {
    "mokupdf": {
      "command": "python",
      "args": ["-m", "mokupdf"]
    }
  }
}
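If you would rather add this entry to an existing configuration file than edit it by hand, the following sketch merges a `mokupdf` server entry into the `mcpServers` object. The helper name `add_mokupdf` and the file name `mcp.json` are hypothetical, and the location of the real configuration file depends on your MCP client. The `--base-dir` and `--max-file-size` flags are the ones shown in the configuration example earlier on this page.

```python
import json
import tempfile
from pathlib import Path


def add_mokupdf(config_path, base_dir=None, max_file_size=None):
    """Add or replace the "mokupdf" entry in an MCP client config file."""
    path = Path(config_path)
    # Start from the existing config so other servers are preserved.
    config = json.loads(path.read_text()) if path.exists() else {}

    args = ["-m", "mokupdf"]
    if base_dir is not None:
        args += ["--base-dir", str(base_dir)]
    if max_file_size is not None:
        args += ["--max-file-size", str(max_file_size)]

    config.setdefault("mcpServers", {})["mokupdf"] = {
        "command": "python",
        "args": args,
    }
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a temporary file so no real client config is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"  # hypothetical file name
    result = add_mokupdf(demo_path, base_dir="./documents", max_file_size=200)
    written = json.loads(demo_path.read_text())

print(json.dumps(result, indent=2))
```

Reading the file first and using `setdefault` keeps any other servers already listed under `mcpServers`.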

Available tools

open_pdf

Open a PDF file for processing by the server.

read_pdf

Read PDF pages, extracting text and images with optional page ranges and limits.

search_text

Search for text inside the currently opened PDF, with optional case-sensitive matching.

get_page_text

Extract text from a specific page in the active PDF.

get_metadata

Retrieve metadata from the active PDF.

close_pdf

Close the currently opened PDF to free resources.

find_pdf_files

Find PDF files using intelligent search across common locations.
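For large documents, `read_pdf` is meant to be called on a limited range of pages at a time. The sketch below shows one way to split a document into page ranges before issuing those calls. It assumes 1-based, inclusive page numbers, and the argument names `start_page` and `end_page` are hypothetical placeholders rather than the tool's confirmed parameters.

```python
def page_batches(total_pages, batch_size):
    """Yield inclusive, 1-based (start, end) page ranges covering the document."""
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    for start in range(1, total_pages + 1, batch_size):
        yield start, min(start + batch_size - 1, total_pages)


def read_pdf_arguments(total_pages, batch_size=10):
    """Build one argument dict per read_pdf call (argument names are assumed)."""
    return [
        {"start_page": start, "end_page": end}
        for start, end in page_batches(total_pages, batch_size)
    ]


print(list(page_batches(25, 10)))  # [(1, 10), (11, 20), (21, 25)]
print(read_pdf_arguments(3, batch_size=2))
```

Smaller batches keep each response short, at the cost of more round trips to the server.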