tesseract-ocr skill

/skills/whalefell/tesseract-ocr

This skill extracts text from images using the Tesseract CLI, supporting multiple languages without Python dependencies.

npx playbooks add skill openclaw/skills --skill tesseract-ocr

SKILL.md
---
name: tesseract-ocr
description: |
  Extract text from images using the Tesseract OCR engine directly via command line.
  Supports multiple languages including Chinese, English, and more. Use this skill 
  when users need to extract text from images, recognize text content in images, 
  or perform OCR tasks without Python dependencies.
---

# Tesseract OCR Skill

Extract text content from images using the Tesseract engine directly via command line.

## Features

- Extract text from image files using the native tesseract CLI
- Support multi-language recognition (Chinese, English, etc.)
- No Python dependencies required
- Simple and fast

## Dependencies

Install the Tesseract OCR system package and the language data you need:

```bash
# Ubuntu/Debian:
sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim

# macOS:
brew install tesseract tesseract-lang
```

## Usage

### Basic Usage

```bash
# Use default language (English)
tesseract /path/to/image.png stdout

# Specify language (Chinese + English)
tesseract /path/to/image.png stdout -l chi_sim+eng

# Save to file (Tesseract appends .txt to the output base, so this writes output.txt)
tesseract /path/to/image.png output -l chi_sim+eng

# Multiple languages
tesseract /path/to/image.png stdout -l chi_sim+eng+jpn
```

### Common Language Codes

| Language | Code |
|----------|------|
| Simplified Chinese | chi_sim |
| Traditional Chinese | chi_tra |
| English | eng |
| Japanese | jpn |
| Korean | kor |
| Chinese + English | chi_sim+eng |

### Quick Examples

```bash
# OCR with Chinese support
tesseract image.jpg stdout -l chi_sim

# OCR with mixed Chinese and English
tesseract image.png stdout -l chi_sim+eng

# Save to file instead of stdout
tesseract document.png result -l chi_sim+eng
# Creates result.txt
```

## Notes

1. OCR accuracy depends on image quality; use clear images for best results
2. Complex layouts (tables, multi-column) may require post-processing
3. Simplified Chinese recognition requires the chi_sim language data (the tesseract-ocr-chi-sim package on Ubuntu/Debian; included in tesseract-lang on Homebrew)
4. Language packs must be installed separately on your system
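
For the complex layouts mentioned in note 2, one option to try before post-processing is Tesseract's page segmentation mode. The sketch below is illustrative only: `ocr_psm` is a hypothetical helper name, and it assumes Tesseract 4 or later, where the flag is spelled `--psm`.

```bash
# Hypothetical helper: run OCR with an explicit page segmentation mode.
# Assumes Tesseract 4+ (older 3.x releases spelled the flag -psm).
ocr_psm() (
  img=$1
  psm=${2:-3}        # 3 = fully automatic page segmentation (the default)
  langs=${3:-eng}
  tesseract "$img" stdout --psm "$psm" -l "$langs"
)

# Modes worth trying: 4 = single column of variably sized text,
# 6 = single uniform block of text, 11 = sparse text.
# ocr_psm table.png 6 chi_sim+eng
```

Run `tesseract --help-psm` to list the modes your installed version supports.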

Overview

This skill extracts text from image files using the native Tesseract OCR command-line engine. It supports multiple languages (including Chinese and English) and runs without Python dependencies. It is lightweight and suitable for quick OCR tasks directly from the shell.

How this skill works

The skill invokes the tesseract CLI on an image file and either prints the recognized text to stdout or saves it to a file. You can pass one or more language codes with -l, joined by + (for example chi_sim+eng), to improve recognition of mixed-language images. Results depend on the installed Tesseract language data and on input image quality.
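
Because results depend on which language data is installed, it can help to check before running OCR. The following is a minimal sketch, not part of the skill itself: `have_langs` is a hypothetical helper that compares a language spec against the output of `tesseract --list-langs`.

```bash
# Hypothetical helper: succeed only if every language in a spec such as
# "chi_sim+eng" is reported by `tesseract --list-langs`.
have_langs() (
  spec=$1
  # Some older versions print the list to stderr, so capture both streams.
  installed=$(tesseract --list-langs 2>&1) || exit 2
  IFS='+'            # split the spec on "+"; local to this subshell
  for lang in $spec; do
    printf '%s\n' "$installed" | grep -qxF -- "$lang" || {
      echo "missing language data: $lang" >&2
      exit 1
    }
  done
)
```

A possible use: `have_langs chi_sim+eng && tesseract page.png stdout -l chi_sim+eng`.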

When to use it

  • Quickly convert scanned images or photos to searchable text without installing Python libraries
  • Batch OCR operations in shell scripts or server-side pipelines
  • Recognizing Chinese, English, or mixed-language text when corresponding language packs are installed
  • Lightweight preprocessing before more advanced layout or NLP post-processing
  • Extracting text on systems where only CLI tools are allowed or preferred

Best practices

  • Install the Tesseract system package and the required language packs (e.g., tesseract-ocr-chi-sim) before running OCR
  • Use high-resolution, well-lit images with minimal skew for best accuracy
  • Preprocess images (deskew, denoise, binarize) to improve recognition on noisy inputs
  • Specify exact language codes to guide the engine for mixed-language documents
  • Post-process OCR output for layout, table extraction, or error correction when needed
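
As one way to apply the preprocessing advice above, the sketch below cleans an image before handing it to Tesseract. It assumes ImageMagick is installed and provides `convert` (ImageMagick 7 also ships the same tool as `magick`); the skill itself does not name a preprocessing tool. `preprocess_and_ocr` is a hypothetical helper, and the deskew and threshold values are starting points to tune per image set.

```bash
# Hypothetical helper: deskew, denoise, and binarize an image, then OCR it.
# Assumption: ImageMagick's `convert` is on PATH.
preprocess_and_ocr() (
  src=$1
  langs=${2:-eng}
  tmpdir=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmpdir"' EXIT      # remove the intermediate image on exit
  clean="$tmpdir/clean.png"
  # Grayscale, straighten skew, remove speckle noise, then binarize.
  convert "$src" -colorspace Gray -deskew 40% -despeckle -threshold 50% "$clean" &&
    tesseract "$clean" stdout -l "$langs"
)

# preprocess_and_ocr scan.jpg chi_sim+eng > scan.txt
```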

Example use cases

  • Extract text from a photographed receipt using: tesseract receipt.jpg stdout -l eng
  • Process a batch of scanned invoices in a shell loop and save outputs to .txt files
  • Recognize mixed Chinese and English text: tesseract page.png stdout -l chi_sim+eng
  • Automate OCR in a CI job or server script where Python packages are not available
  • Preprocess images with open-source tools, then run tesseract to obtain raw text for indexing
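
The batch use case above could look roughly like this. It is a sketch under stated assumptions: `ocr_dir` is a hypothetical helper, and it handles only `.png` files in a single directory.

```bash
# Hypothetical helper: OCR every PNG in a directory, writing NAME.txt
# next to each NAME.png.
ocr_dir() (
  dir=$1
  langs=${2:-eng}
  for img in "$dir"/*.png; do
    [ -e "$img" ] || continue       # the glob matched nothing
    # Tesseract appends .txt to the output base on its own.
    tesseract "$img" "${img%.png}" -l "$langs" || echo "failed: $img" >&2
  done
)

# ocr_dir ./invoices chi_sim+eng
```

A failure on one image is reported on stderr and the loop continues, which tends to suit unattended server-side runs.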

FAQ

Do I need Python to use this skill?

No. The skill uses the tesseract command-line tool directly and has no Python runtime dependency.

How do I enable Chinese recognition?

Install the appropriate Tesseract language pack (for example tesseract-ocr-chi-sim) and specify -l chi_sim when running tesseract.

How can I improve OCR accuracy?

Use clear, high-resolution images, apply preprocessing (deskew, denoise), and specify the correct language codes.