MCP server that provides doc forge capabilities
Configuration
{
  "mcpServers": {
    "cablate-mcp-doc-forge": {
      "command": "npx",
      "args": [
        "-y",
        "@cablate/mcp-doc-forge"
      ]
    }
  }
}
The Simple Document Processing MCP Server reads, converts, and cleans up documents and extracts structural information through a lightweight MCP (Model Context Protocol) interface. It reads DOCX, PDF, TXT, HTML, and CSV content, converts between formats, performs text processing, and cleans HTML or extracts resources from it, so MCP clients can use these operations in automated workflows.
You interact with the server through an MCP client: configure an MCP connection that launches the server as a local process, then send requests to read documents, convert between formats, clean up or diff text, or extract HTML resources. The client can chain these actions, for example reading a PDF, converting it to HTML for web publishing, cleaning up the HTML, and then summarizing or diffing the resulting text.
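At the protocol level, the requests described above are JSON-RPC 2.0 messages, and a tool invocation uses the `tools/call` method. The sketch below only builds such messages; it does not start the server or perform the `initialize` handshake that an MCP client library normally handles. The tool name `pdf_reader` and its `input` argument are hypothetical placeholders, since the real names and schemas come from the server's `tools/list` response.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def make_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def call_tool(name: str, arguments: dict) -> str:
    """Build a `tools/call` request for the named tool."""
    return make_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    # Ask the server which tools it exposes.
    print(make_request("tools/list"))
    # Hypothetical tool name and argument; replace with names from tools/list.
    print(call_tool("pdf_reader", {"input": "/path/to/report.pdf"}))
```

Each message is written to the server's standard input as one line when the stdio transport is used, which is why the helper never emits embedded newlines.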
Prerequisites: ensure you have Node.js installed on your system. You can verify by running node -v and npm -v in your terminal.
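The prerequisite check can also be scripted. The following is a minimal sketch that looks up `node`, `npm`, and `npx` on the PATH and prints the version each one reports; the helper names are illustrative, and no minimum version is assumed because the source does not state one.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_major(version_text: str) -> Optional[int]:
    """Extract the major version from output such as 'v20.11.1' or '10.2.4'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def tool_version(tool: str) -> Optional[str]:
    """Return the output of `<tool> -v`, or None if the tool is not on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True, check=False)
    return result.stdout.strip() or None


if __name__ == "__main__":
    for tool in ("node", "npm", "npx"):
        version = tool_version(tool)
        if version is None:
            print(f"{tool}: not found")
        else:
            print(f"{tool}: {version} (major {parse_major(version)})")
```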
Option A: Install via Smithery (recommended for automated client integration):
npx -y @smithery/cli install @cablate/mcp-doc-forge --client claude
Option B: Manual installation (global install):
npm install -g @cablate/mcp-doc-forge
After installation, you can start the server or use it via MCP tooling as described in the usage section.
To make the server available to your MCP client, you can add it as an MCP server. The following example shows how to configure the server to run locally via an MCP client like Dive Desktop. This configuration uses the local runtime (stdio) method.
{
  "mcpServers": {
    "cablate-mcp-doc-forge": {
      "command": "npx",
      "args": [
        "-y",
        "@cablate/mcp-doc-forge"
      ],
      "enabled": true
    }
  }
}
Limit access to the local MCP server to trusted clients. Use network firewalls or client authentication when exposing the server beyond your development machine. Keep dependencies up to date and monitor for security advisories related to the Node.js packages you install.
If you encounter issues starting the MCP server, verify Node.js and npm versions, ensure the installation completed without errors, and confirm that your MCP client configuration points to the correct command and arguments. Restart the client after any installation or configuration change.
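The configuration and troubleshooting steps above can be sketched as a small helper that registers the server entry in a client configuration and then checks it for common mistakes. This is an illustration under assumptions: the function names are hypothetical, the entry label is arbitrary, and where the configuration file lives depends on your MCP client, so the sketch works on an in-memory dictionary only.

```python
import json
import shutil
from typing import List

PACKAGE = "@cablate/mcp-doc-forge"
SERVER_ENTRY = {"command": "npx", "args": ["-y", PACKAGE], "enabled": True}


def add_server(config: dict, name: str) -> dict:
    """Return a copy of `config` with the server entry under mcpServers[name]."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = dict(SERVER_ENTRY)
    return {**config, "mcpServers": servers}


def find_problems(config: dict) -> List[str]:
    """List likely misconfigurations for entries that launch the package."""
    servers = config.get("mcpServers", {})
    matches = {n: e for n, e in servers.items() if PACKAGE in e.get("args", [])}
    if not matches:
        return [f"no mcpServers entry has {PACKAGE} in its args"]
    problems = []
    for name, entry in matches.items():
        command = entry.get("command")
        if not command:
            problems.append(f"{name}: missing 'command'")
        elif shutil.which(command) is None:
            problems.append(f"{name}: '{command}' not found on PATH")
        if entry.get("enabled") is False:
            problems.append(f"{name}: entry is disabled")
    return problems


if __name__ == "__main__":
    # Start from an empty config; a real client config would be loaded from disk.
    config = add_server({}, "cablate-mcp-doc-forge")
    print(json.dumps(config, indent=2))
    print(find_problems(config) or "configuration looks consistent")
```

The check matches entries by package name instead of by label, so it still works if the entry was registered under a different key.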
Read documents in DOCX, PDF, TXT, HTML, and CSV formats to extract content for processing.
Convert between document formats such as DOCX to HTML or PDF, and HTML to TXT or Markdown.
Merge and split PDF files.
Convert text between encodings (UTF-8, Big5, GBK), format and clean text, and generate diffs.
Split large texts by lines or specified delimiters to enable parallel or incremental processing.
Clean, format, and extract resources (images, links, videos) from HTML while preserving structure.
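To make the text-processing capabilities listed above concrete, here is a minimal sketch of encoding conversion, delimiter-based splitting, and diffing using only the Python standard library. It illustrates the operations themselves and is not the server's implementation, which runs on Node.js; all function names are hypothetical.

```python
import difflib
from typing import List


def convert_encoding(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Re-encode bytes, for example from 'big5' or 'gbk' to 'utf-8'."""
    return data.decode(source).encode(target)


def split_text(text: str, delimiter: str = "\n") -> List[str]:
    """Split on a delimiter and drop empty or whitespace-only chunks."""
    return [chunk for chunk in text.split(delimiter) if chunk.strip()]


def diff_text(before: str, after: str) -> str:
    """Return a unified diff of two texts, compared line by line."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
    return "\n".join(lines)


if __name__ == "__main__":
    big5_bytes = "文件處理".encode("big5")
    print(convert_encoding(big5_bytes, "big5").decode("utf-8"))
    print(split_text("intro\n---\nbody\n---\nappendix", "\n---\n"))
    print(diff_text("a\nb\nc", "a\nB\nc"))
```

Splitting first and processing each chunk separately is what makes the parallel or incremental processing mentioned above possible, since each chunk can be handled on its own.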