This skill performs high-precision OCR on images, supports multiple languages and formats, and returns text, coordinates, and confidence scores for document digitization and image content analysis.
npx playbooks add skill lin-a1/skills-agent --skill ocr_service
---
name: ocr-service
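description: High-precision optical character recognition (OCR) service. Detects and extracts text from images in multiple languages and formats, and returns text-region coordinates and confidence scores. Suited to document digitization and image content analysis.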
---
## Features
Extracts text from images, with support for multiple image formats and languages.
## Usage
```python
from services.ocr_service.client import OCRServiceClient
client = OCRServiceClient()
# Health check
status = client.health_check()
# Run OCR on a base64-encoded image
image_base64 = client.image_to_base64("/path/to/image.jpg")
result = client.ocr(image_base64)
# Read the recognition results
texts = result["rec_texts"]  # ["recognized text 1", "recognized text 2", ...]
scores = result["rec_scores"]  # [0.98, 0.95, ...]
```
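The `image_to_base64` helper is not shown in this document. If it simply reads the file and base64-encodes its bytes (an assumption, not confirmed by the source), a standard-library equivalent might look like the sketch below. `encode_image_file` is a hypothetical name, not part of the client.

```python
import base64
from pathlib import Path


def encode_image_file(path: str) -> str:
    """Read an image file and return its bytes as a base64 ASCII string.

    Hypothetical stand-in for client.image_to_base64. The real helper may
    behave differently (for example, add a data-URI prefix); this sketch
    only encodes the raw file bytes.
    """
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

This can be handy when the image bytes come from somewhere other than the client helper, as long as the service accepts plain base64 without a prefix.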
## Response format
```json
{
"doc_preprocessor_res": {"angle": 0},
"dt_polys": [[x1,y1], [x2,y2], ...],
"rec_texts": ["recognized text 1", "recognized text 2"],
"rec_scores": [0.98, 0.95]
}
```
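## Fields
- `doc_preprocessor_res`: preprocessing information, such as the detected rotation `angle`
- `rec_texts`: list of recognized text strings
- `rec_scores`: confidence score for each text block
- `dt_polys`: coordinates of the detected text regions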
This skill provides a high-precision optical character recognition (OCR) service for extracting text from images. It supports multiple languages and image formats, returns detected text with bounding polygon coordinates and confidence scores, and is designed for document digitization and image content analysis. The service includes a health-check endpoint and simple client functions for image conversion and OCR invocation.
The client sends base64-encoded images to the OCR service, which performs text detection and recognition. The service returns a structured result containing preprocessor info (e.g., rotation), detected text polygons, recognized text strings, and per-item confidence scores. Consumers can use the polygon coordinates to map text back onto the original image or to crop regions for downstream processing.
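As a minimal sketch of the cropping idea above, a polygon can be reduced to an axis-aligned bounding box. This assumes each polygon is a sequence of `[x, y]` points in pixel coordinates, which the source does not spell out. `polygon_to_bbox` is a hypothetical helper.

```python
import math
from typing import Iterable, Sequence, Tuple


def polygon_to_bbox(polygon: Iterable[Sequence[float]]) -> Tuple[int, int, int, int]:
    """Return (left, upper, right, lower) enclosing all points of a polygon.

    Assumes each point is an [x, y] pair. Coordinates are rounded outward
    so the box never cuts into the polygon.
    """
    points = list(polygon)
    if not points:
        raise ValueError("polygon has no points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (
        math.floor(min(xs)),
        math.floor(min(ys)),
        math.ceil(max(xs)),
        math.ceil(max(ys)),
    )
```

The `(left, upper, right, lower)` ordering matches the box tuple that Pillow's `Image.crop` accepts, so the result can be passed to a crop call directly if Pillow is the imaging library in use.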
What does rec_scores represent and how should I use it?
rec_scores is the per-text confidence value (0–1). Use a threshold to filter low-confidence results or surface them for manual review.
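A minimal sketch of threshold filtering, assuming `rec_texts` and `rec_scores` are parallel lists as in the response format. The function name and the default threshold of 0.9 are illustrative choices, not values from the service.

```python
from typing import Any, Dict, List, Tuple

TextScore = Tuple[str, float]


def split_by_confidence(
    result: Dict[str, Any], threshold: float = 0.9
) -> Tuple[List[TextScore], List[TextScore]]:
    """Split OCR results into (accepted, needs_review) by confidence.

    Items scoring at or above the threshold are accepted; the rest are
    returned separately so they can be surfaced for manual review.
    """
    accepted: List[TextScore] = []
    needs_review: List[TextScore] = []
    for text, score in zip(result["rec_texts"], result["rec_scores"]):
        if score >= threshold:
            accepted.append((text, score))
        else:
            needs_review.append((text, score))
    return accepted, needs_review
```

A suitable threshold depends on the image quality and on how costly a misread is, so it is worth tuning against a sample of real documents.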
How do I handle rotated or skewed images?
Check doc_preprocessor_res.angle to detect rotation. Deskew using that angle or reorient the image before downstream layout-sensitive processing.
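A minimal sketch of the rotation check, assuming `angle` is a number of degrees and that a missing field means no rotation was reported. The source does not state the direction convention (clockwise or counterclockwise), so verify it on a known rotated sample before applying a correction. Both function names are hypothetical.

```python
from typing import Any, Dict


def detected_rotation(result: Dict[str, Any]) -> int:
    """Return the reported rotation in degrees, normalized to 0..359.

    Falls back to 0 when doc_preprocessor_res or angle is absent.
    """
    preprocessor = result.get("doc_preprocessor_res") or {}
    angle = preprocessor.get("angle") or 0
    return int(angle) % 360


def needs_reorientation(result: Dict[str, Any]) -> bool:
    """True when the service reported a non-zero rotation."""
    return detected_rotation(result) != 0
```

When `needs_reorientation` returns true, the image can be rotated with whatever imaging library is already in use before any layout-sensitive step runs.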