
This skill helps you extract text, fill forms, and merge PDFs with precision, streamlining document processing tasks.

npx playbooks add skill louloulin/claude-agent-sdk --skill pdf-processor

---
name: pdf-processor
description: Extract text, fill forms, and merge PDF files. Use when working with PDF documents, forms, or when users mention PDF processing.
version: "2.0.0"
author: "Doc Team <[email protected]>"
tags:
  - pdf
  - documents
  - forms
  - data-extraction
dependencies: []

allowed_tools:
  - Read
  - "Bash(python:*)"
  - Grep

model: claude-sonnet-4-20250514
---

# PDF Processing

Expert PDF document processing specialist. Extract text, fill forms, merge documents, and manipulate PDFs with precision.

## Quick Start

Extract text from PDF:

```python
import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    page = pdf.pages[0]
    text = page.extract_text()
    print(text)
```

## Capabilities

### Text Extraction
- Extract plain text from PDF pages
- Preserve layout and formatting
- Handle multi-page documents
- Extract tables from PDFs

### Form Operations
- Fill PDF forms programmatically
- Extract form field data
- Validate form fields
- Flatten filled forms

### Document Manipulation
- Merge multiple PDFs
- Split PDFs into pages
- Rotate pages
- Add watermarks
- Compress PDFs

### OCR Integration
- Process scanned PDFs
- Extract text from images
- Improve OCR accuracy
- Handle multiple languages
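
One way to combine text extraction with OCR is to run OCR only on pages where ordinary extraction returns almost nothing. The sketch below assumes a `pdfplumber` page object plus the `pytesseract` package and a Tesseract installation, neither of which appears in the Requirements section. The 20-character threshold and both function names are illustrative assumptions.

```python
def needs_ocr(extracted_text, min_chars: int = 20) -> bool:
    """Heuristic: treat a page as scanned when extraction yields almost no text."""
    return len((extracted_text or "").strip()) < min_chars


def page_text_with_ocr_fallback(page, lang: str = "eng") -> str:
    """Return extracted text, falling back to OCR for pages that look scanned.

    `page` is expected to be a pdfplumber page.
    """
    text = page.extract_text() or ""
    if not needs_ocr(text):
        return text
    # Imported lazily so born-digital PDFs do not need Tesseract at all.
    # Assumption: pytesseract is installed separately.
    import pytesseract

    image = page.to_image(resolution=300).original  # PIL image of the page
    return pytesseract.image_to_string(image, lang=lang)
```

Tesseract accepts combined language codes such as `"eng+fra"` in the `lang` argument, provided the corresponding language data is installed.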

## Additional Resources

### Form Field Mappings
For detailed form field mappings and instructions, see [forms.md](forms.md).

### API Reference
For complete API documentation, see [reference.md](reference.md).

### Usage Examples
See [examples.md](examples.md) for more usage examples.

## Utility Scripts

Validate PDF files:
```bash
python scripts/validate.py document.pdf
```

Extract form data:
```bash
python scripts/extract_forms.py document.pdf
```

Merge PDFs:
```bash
python scripts/merge.py output.pdf input1.pdf input2.pdf
```

## Requirements

Ensure required packages are installed:
```bash
pip install pypdf pdfplumber pillow reportlab
```

## Troubleshooting

### Common Issues

**Problem**: Script not found
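**Solution**: Run the commands from the skill's root directory so that the `scripts/` path resolves. If you execute a script directly instead of through `python`, make it executable first: `chmod +x scripts/*.py`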

**Problem**: Package not installed
**Solution**: Install the packages listed under Requirements: `pip install pypdf pdfplumber pillow reportlab`

**Problem**: PDF is encrypted
**Solution**: Unlock the PDF first or provide the password

**Problem**: OCR not working
**Solution**: Install Tesseract OCR. On Debian or Ubuntu: `apt-get install tesseract-ocr`

## Best Practices

### DO (Recommended)

1. **Validation**
   - Always validate PDF files before processing
   - Check for encryption and permissions
   - Verify file integrity

2. **Error Handling**
   - Handle corrupted PDFs gracefully
   - Provide meaningful error messages
   - Log processing steps

3. **Performance**
   - Process pages in batches for large PDFs
   - Use multiprocessing when possible
   - Cache extracted data
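
The validation step recommended above can start with cheap byte-level checks before any parser touches the file. The sketch below uses only the standard library and is deliberately a heuristic: it does not replace a full parse, and it is not a description of what `scripts/validate.py` does. The function name and messages are hypothetical.

```python
from typing import List


def quick_pdf_check(data: bytes) -> List[str]:
    """Return the problems found by cheap byte-level checks (empty if none)."""
    problems = []
    # The file header is expected near the start of the file.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header")
    # The end-of-file marker is expected near the end of the file.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF marker (file may be truncated)")
    # An /Encrypt entry in the trailer indicates an encrypted document.
    if b"/Encrypt" in data:
        problems.append("document appears to be encrypted")
    return problems
```

A caller might read the file with `Path("document.pdf").read_bytes()`, log every returned problem, and skip the file when the list is non-empty.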

### DON'T (Avoid)

1. **Security Issues**
   - ❌ Process PDFs from untrusted sources without validation
   - ❌ Execute embedded scripts in PDFs
   - ❌ Ignore encryption warnings

2. **Performance Issues**
   - ❌ Load entire PDF into memory unnecessarily
   - ❌ Process pages sequentially when parallel is possible
   - ❌ Ignore memory limits

3. **Quality Issues**
   - ❌ Skip OCR for scanned documents
   - ❌ Ignore layout and formatting
   - ❌ Assume all PDFs have the same structure

---

**Version**: 2.0.0
**Last Updated**: 2025-01-10
**Maintainer**: Doc Team

Overview

This skill is a PDF processing toolkit implemented for Rust-based Claude agents that extracts text, fills and flattens forms, and merges or manipulates PDF files. It focuses on reliable extraction, form automation, and document composition for workflows that handle digital or scanned PDFs. Use it to automate repetitive PDF tasks, prepare documents for archival, or integrate PDF handling into pipelines.

How this skill works

The skill inspects PDF structure and pages to extract plain text and preserve basic layout. It reads and writes AcroForm fields for programmatic form filling, validates field values, and can flatten forms to create final read-only documents. For document manipulation it merges, splits, rotates, watermarks, and compresses files, and it integrates OCR when pages are scanned images to recover text.

When to use it

  • Extracting searchable text or tables from multi-page PDFs for indexing or analysis
  • Automatically filling, validating, and flattening PDF forms for batch processing
  • Merging multiple reports, invoices, or forms into a single deliverable
  • Splitting large PDFs into smaller chunks or rotating/annotating pages
  • Processing scanned documents that require OCR to extract text

Best practices

  • Validate PDF integrity and check for encryption or permissions before processing
  • Process large files in paged or batched mode and avoid loading entire documents into memory
  • Use OCR only for scanned images; preserve original text extraction for born-digital PDFs
  • Log processing steps and handle corrupted pages gracefully with retries or fallbacks
  • Flatten filled forms when you need a fixed, non-editable output for distribution
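
The batched mode mentioned above can be sketched as a small generator that yields page-index ranges, so that only one batch of pages is handled at a time. This is a standard-library illustration; the name `page_batches` and the zero-based indexing are assumptions.

```python
from typing import Iterator


def page_batches(total_pages: int, batch_size: int) -> Iterator[range]:
    """Yield consecutive ranges of zero-based page indexes, batch by batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, total_pages, batch_size):
        # The last batch may be shorter than batch_size.
        yield range(start, min(start + batch_size, total_pages))
```

Each yielded range can be used to index into a page sequence, and results can be written out after every batch instead of being accumulated for the whole document.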

Example use cases

  • Batch-fill tax or enrollment forms from a CSV of user data, validate fields, then flatten and store PDFs
  • Extract tables and text from a set of research reports for downstream data analysis
  • Merge monthly statements into a single annual PDF and add a watermark or page numbers
  • Split a large scanned contract into sections, run OCR, and index extracted clauses for search
  • Compress archived PDFs to reduce storage while preserving readable text and searchable OCR output

FAQ

Can this skill process scanned PDFs?

Yes. It integrates OCR to extract text from images and can improve accuracy by tuning OCR settings and languages.

How do you handle encrypted PDFs?

Encrypted PDFs must be unlocked by providing a password; the skill validates permissions and fails gracefully if access is restricted.