image-ocr skill

This skill extracts text from images using Tesseract OCR, supporting multiple languages and common formats to enable quick content digitization.

npx playbooks add skill openclaw/skills --skill image-ocr

Review the files below or copy the command above to add this skill to your agents.

SKILL.md

---
name: image-ocr
description: "Extract text from images using Tesseract OCR"
metadata:
  {
    "openclaw":
      {
        "emoji": "👁️",
        "requires": { "bins": ["tesseract"] },
        "install":
          [
            {
              "id": "dnf",
              "kind": "dnf",
              "package": "tesseract",
              "bins": ["tesseract"],
              "label": "Install via dnf",
            },
          ],
      },
  }
---

# Image OCR

Extract text from images using Tesseract OCR. Supports multiple languages and image formats including PNG, JPEG, TIFF, and BMP.

## Commands

```bash
# Extract text from an image (default: English)
image-ocr "screenshot.png"

# Extract text with a specific language
image-ocr "document.jpg" --lang eng
```

## Install

```bash
sudo dnf install tesseract
```

Overview

This skill extracts text from images using Tesseract OCR. It supports multiple image formats (PNG, JPEG, TIFF, BMP) and can run OCR in different languages. The tool is lightweight and designed for batch or single-image text extraction in automation pipelines.

How this skill works

The skill calls Tesseract OCR on supplied image files and returns the recognized text. You can pass a language code to improve accuracy for non-English content. It accepts common image formats and can be invoked for single files or scripted to process folders in bulk.
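
The wrapper's internals are not shown here, so the following is only a sketch of how bulk processing of a folder might be scripted. It calls the `tesseract` CLI directly (`tesseract <image> stdout -l <lang>`) rather than `image-ocr`. The function name `ocr_dir` and the choice to write a `.txt` file next to each image are assumptions made for illustration.

```bash
# Hypothetical helper (not part of the skill): OCR every supported image in a directory.
# Usage: ocr_dir <dir> [lang]    (lang defaults to eng)
ocr_dir() {
  local dir="$1" lang="${2:-eng}" img out
  for img in "$dir"/*.png "$dir"/*.jpg "$dir"/*.jpeg "$dir"/*.tif "$dir"/*.tiff "$dir"/*.bmp; do
    [ -e "$img" ] || continue            # skip glob patterns that matched nothing
    out="${img%.*}.txt"                  # e.g. scan.png -> scan.txt
    # "stdout" as the output base makes tesseract print the text instead of writing a file
    if tesseract "$img" stdout -l "$lang" > "$out" 2>/dev/null; then
      echo "ok: $img -> $out"
    else
      rm -f "$out"                       # do not leave an empty result behind
      echo "failed: $img" >&2
    fi
  done
}
```

Calling `ocr_dir ./scans deu` would then process every matching image in `./scans` with German language data, assuming that data is installed.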

When to use it

Digitizing screenshots, scanned documents, or photos to searchable text
Automating data extraction from receipts, forms, or invoices
Preprocessing images for natural language processing or indexing
Quickly converting single images into editable text for editing or translation
Running offline OCR where cloud services are not an option

Best practices

Use clean, high-resolution images and crop to the region of interest to improve accuracy
Specify the correct language code (e.g., deu for German) when text is not English
Preprocess images (deskew, enhance contrast, remove noise) before OCR for better results
Batch-process images with consistent naming and output paths for reliable automation
Validate and post-process extracted text (spellcheck, regex extraction) for critical workflows
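
As one illustration of the post-processing step, the sketch below pulls currency-like amounts out of OCR text with a regular expression. The function name `extract_amounts` and the pattern are assumptions chosen with receipts in mind; other documents will likely need a different pattern.

```bash
# Hypothetical helper: print every amount such as 3.50 or 1,299.00 found on stdin,
# one per line. Matches digits, optional ",ddd" groups, then exactly two decimals.
extract_amounts() {
  grep -Eo '[0-9]+(,[0-9]{3})*\.[0-9]{2}' || true   # finding no match is not an error
}

# Possible use, assuming image-ocr writes the recognized text to stdout:
# image-ocr "receipt.jpg" | extract_amounts
```

Keeping the extraction in a separate filter means the pattern can be adjusted without re-running OCR on the source images.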

Example use cases

Extracting text from a scanned contract folder to create a searchable archive
Converting screenshots of error messages into text for bug reports
Pulling line items from photographed receipts for expense tracking
Preparing OCRed text for translation or content analysis
Integrating with a pipeline that indexes documents for full-text search

FAQ

What image formats are supported?

Common formats are supported: PNG, JPEG, TIFF, and BMP.

How do I improve OCR accuracy for non-English text?

Install the appropriate Tesseract language data and pass the corresponding language code (for example, --lang deu for German).
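
To check whether a language is available before running OCR, Tesseract's `--list-langs` flag prints the installed language data. The sketch below wraps that check. `has_lang` is a hypothetical helper name, and the `tesseract-langpack-deu` package name assumes Fedora's packaging, so it may differ on other distributions.

```bash
# Hypothetical helper: succeed only if the given Tesseract language code is installed.
# Some tesseract versions print the list on stderr, so both streams are searched.
has_lang() {
  tesseract --list-langs 2>&1 | grep -qx -- "$1"   # -x: the whole line must equal the code
}

# Possible use (commented out so the snippet can be sourced safely):
# if has_lang deu; then
#   image-ocr "document.jpg" --lang deu
# else
#   echo "German data missing; on Fedora try: sudo dnf install tesseract-langpack-deu" >&2
# fi
```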