PDF Processor MCP Server

MCP PDF Processor fetches PDFs, processes them to llm.txt, and loads the llm.txt into your AI.

Installation

Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "michaellevinson-mcp_pdf_processor": {
      "command": "python",
      "args": [
        "pdf_tool_server.py"
      ],
      "env": {
        "OUTPUT_DIR": "llm_output",
        "PYTHONPATH": "/path/to/mcp_pdf_processor"
      }
    }
  }
}

This dedicated MCP server fetches PDFs, extracts their text, and recognizes LaTeX equations. It is designed to work with Claude via MCP, so a document can be fetched, processed, and read within a single conversation.

How to use

Connect an MCP client to the server to run the workflow end to end: fetch a PDF from a URL, process it (optionally with LaTeX extraction), then read or summarize the processed content. In Claude, you typically ask in plain language, for example to fetch a PDF from a given URL, process it with LaTeX extraction, and then read or analyze the result.

How to install

Prerequisites: Make sure you have Python 3.9 or higher and pip installed. You will also need the MCP tooling if you plan to register or manage the server via the MCP CLI.

For a standard Python installation, install the server in editable mode from the project directory:

pip install -e .

To use the server with Claude Desktop or Claude Code, install the MCP CLI, then use it to register the server with the package installed in editable mode:

pip install "mcp[cli]"

mcp install /path/to/pdf_tool_server.py --with-editable /path/to/mcp_pdf_processor

For development, start a session for the server in the MCP Inspector:

mcp dev /path/to/pdf_tool_server.py --with-editable /path/to/mcp_pdf_processor

To run the server standalone for testing or local use, start the Python script directly.

python pdf_tool_server.py

Configuration and usage notes

Environment variables control where processed PDFs are stored and how Python paths are resolved. The following variables are used by the server if you choose to customize them:

- OUTPUT_DIR: Directory to store processed PDFs (default: llm_output)

- PYTHONPATH: Set to the directory containing the mcp_pdf_processor package
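As a minimal sketch of how these two variables might be consumed, the snippet below resolves the output directory with the documented default and builds the env mapping used in a client configuration entry. The helper names resolve_output_dir and build_server_env are hypothetical and are not taken from the server's source.

```python
import os
from pathlib import Path


def resolve_output_dir(env=None):
    """Return the output directory, falling back to the documented default."""
    env = os.environ if env is None else env
    # OUTPUT_DIR defaults to "llm_output" when unset or empty.
    return Path(env.get("OUTPUT_DIR") or "llm_output")


def build_server_env(package_dir, output_dir="llm_output"):
    """Build the "env" mapping for an MCP client configuration entry.

    Hypothetical helper: it only mirrors the two variables listed above.
    """
    return {"OUTPUT_DIR": str(output_dir), "PYTHONPATH": str(package_dir)}


if __name__ == "__main__":
    print(resolve_output_dir())
    print(build_server_env("/path/to/mcp_pdf_processor"))
```

Passing a plain dictionary instead of os.environ keeps the lookup easy to test without touching the real environment.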

When the server is registered with Claude, you can invoke its tools to fetch, process, and read PDFs. A typical workflow is: fetch a PDF, process it with LaTeX extraction, then read the processed output. The server's stdio configuration and environment variables are summarized below.

{
  "mcpServers": {
    "pdf_tool": {
      "type": "stdio",
      "name": "pdf_tool",
      "command": "python",
      "args": ["pdf_tool_server.py"]
    }
  },
  "envVars": [
    {"name": "OUTPUT_DIR", "description": "Directory to store processed PDFs", "example": "llm_output"},
    {"name": "PYTHONPATH", "description": "Path to mcp_pdf_processor package", "example": "/path/to/mcp_pdf_processor"}
  ]
}

Notes on usage with Claude

Once you register the server, you can direct Claude to perform tasks like fetching and analyzing a PDF, extracting LaTeX equations, or summarizing content from a PDF. Use clear prompts to specify the URL and the desired processing (for example, enabling LaTeX extraction) so Claude can coordinate the MCP interactions for you.

Available tools

fetch_pdf

Fetches a PDF from a URL and returns a hash identifier for subsequent processing.
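The description does not say how the hash identifier is computed. One plausible scheme, shown here purely as an assumption, is a truncated SHA-256 digest of the downloaded bytes, so that the same file always maps to the same identifier. The function name pdf_hash_id and the 16-character length are hypothetical.

```python
import hashlib


def pdf_hash_id(pdf_bytes: bytes, length: int = 16) -> str:
    """Return a short, stable identifier derived from the PDF's content.

    Assumption: SHA-256 over the raw bytes, truncated to `length` hex
    characters. The actual server may use a different scheme.
    """
    return hashlib.sha256(pdf_bytes).hexdigest()[:length]


if __name__ == "__main__":
    sample = b"%PDF-1.7 example bytes"
    # Identical content always yields the same identifier.
    print(pdf_hash_id(sample) == pdf_hash_id(sample))
```

A content-derived identifier of this kind would let later tool calls refer to a document without resending its URL.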

process_pdf

Processes a fetched PDF, with options for LaTeX equation extraction and other analyses.

read_processed_pdf

Reads the content of a previously processed PDF, returning processed output for review or summarization.
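To make the three-step workflow concrete, the sketch below builds the JSON-RPC 2.0 requests an MCP client would send to call each tool in order, using the protocol's tools/call method. The argument names url, pdf_hash, and extract_latex are assumptions for illustration, since the tool schemas are not documented here. A real session also begins with an initialization handshake, which an MCP client library normally handles.

```python
import json


def tool_call(request_id, name, arguments):
    """Build one MCP JSON-RPC 2.0 request that invokes a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def workflow_requests(url, pdf_hash):
    """Return the fetch, process, and read requests in order.

    Argument names are hypothetical. In practice, pdf_hash would be taken
    from the fetch_pdf response before the later calls are sent.
    """
    steps = [
        ("fetch_pdf", {"url": url}),
        ("process_pdf", {"pdf_hash": pdf_hash, "extract_latex": True}),
        ("read_processed_pdf", {"pdf_hash": pdf_hash}),
    ]
    return [
        tool_call(request_id, name, arguments)
        for request_id, (name, arguments) in enumerate(steps, start=1)
    ]


if __name__ == "__main__":
    requests = workflow_requests("https://example.com/paper.pdf", "<hash from fetch_pdf>")
    for request in requests:
        print(json.dumps(request))
```

When working through Claude, these requests are issued on your behalf, so the sketch is mainly useful for debugging or for scripting the server directly.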