
describe-image skill

/.opencode/skill/describe-image

This skill analyzes an image using a local tool, describing details based on a prompt without overloading the GPU.

npx playbooks add skill richardanaya/agent-skills --skill describe-image


Files (1)
SKILL.md
387 B
---
name: describe-image
description: Uses a local model to describe something about an image
license: MIT
compatibility: opencode
metadata:
  audience: tools
---

There is a local CLI tool, describe_image, that uses a local AI model to describe an image. Don't use this tool in parallel, or we might overwhelm our GPU.

```
describe_image <disk path> "<prompt to ask about details in image>"
```

Overview

This skill describes images using a local AI model to extract visible details, objects, and context. It runs a command-line tool against image files and returns concise, human-readable descriptions for accessibility, indexing, or analysis. Because all processing happens locally, it is well suited to privacy-sensitive and on-premises use.

How this skill works

The skill calls a local CLI tool that accepts a disk path to the image and a short prompt asking what to describe. The tool runs on a local GPU-backed model and returns descriptive text about the scene, objects, attributes, and any requested focus details. Avoid running multiple instances in parallel to prevent GPU overload.
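As a sketch of that call pattern, a thin wrapper could validate the disk path before waking the model; the describe_safe name and error message below are illustrative, not part of the skill, and the wrapper assumes describe_image is on PATH.

```shell
# describe_safe: check the image path exists, then invoke the local tool.
# Assumes describe_image is on PATH; name and messages are illustrative.
describe_safe() {
  img="$1"
  prompt="$2"
  if [ ! -f "$img" ]; then
    echo "error: image not found: $img" >&2
    return 1
  fi
  describe_image "$img" "$prompt"
}
```

Called as describe_safe /photos/cat.jpg "What animal is this?", it fails fast on a bad path instead of starting a GPU job that cannot succeed.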

When to use it

  • Generate alt text or accessibility descriptions for images
  • Extract scene or object details for indexing and search
  • Quickly summarize photo contents for content moderation or triage
  • Produce image annotations for datasets or QA checks
  • Obtain focused details by asking specific prompts about the image

Best practices

  • Provide a concise, specific prompt about what you want described (e.g., "Describe people and emotions").
  • Use local absolute disk paths to images to ensure the tool can access files reliably.
  • Run requests serially rather than in parallel to avoid overwhelming the GPU.
  • Pre-check image formats and sizes; downscale very large images if needed to speed processing.
  • Validate outputs for sensitive or ambiguous content before automated use.
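The serial-execution and size pre-check practices above can be combined in one small batch loop. This is a sketch under stated assumptions: the serial_describe name, the MAX_IMG_BYTES variable, and the 20 MB default threshold are all illustrative, and describe_image is assumed to be on PATH.

```shell
# serial_describe: describe every file in a directory, one at a time.
# Files larger than MAX_IMG_BYTES (default 20 MB, an assumed threshold)
# are skipped rather than sent to the model.
serial_describe() {
  dir="$1"
  prompt="$2"
  max="${MAX_IMG_BYTES:-20971520}"
  for img in "$dir"/*; do
    [ -f "$img" ] || continue
    size=$(wc -c < "$img")
    if [ "$size" -gt "$max" ]; then
      echo "skip (too large): $img" >&2
      continue
    fi
    describe_image "$img" "$prompt"  # serial: one GPU job at a time
  done
}
```

Because the loop body runs to completion before the next iteration starts, only one GPU job is ever in flight.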

Example use cases

  • Create accessible alt text for web images by asking for short descriptions.
  • Index photo libraries by extracting objects, locations, and visible tags.
  • Assist content moderators by highlighting potentially problematic content in images.
  • Annotate training datasets with object labels and scene summaries.
  • Perform rapid QA on product photos to confirm presence and orientation of items.

FAQ

How do I invoke the tool?

Run the local CLI with the image's disk path and a short prompt, for example: describe_image /path/to/image "What is visible in this photo?"

Can I run multiple requests at once?

No. Avoid parallel runs: the model uses the local GPU, and concurrent jobs can overwhelm it, slowing or failing processing.
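If several independent processes might call the tool, a cross-process lock is one way to enforce the serial rule. This sketch uses flock(1) to queue callers behind a single lock file; the with_gpu_lock name and the /tmp/describe_image.lock path are assumptions, not part of the skill.

```shell
# with_gpu_lock: run any command while holding a shared lock file, so at
# most one GPU-backed job runs at a time. Lock path is an assumption.
with_gpu_lock() {
  flock /tmp/describe_image.lock "$@"
}

# e.g. with_gpu_lock describe_image /photos/cat.jpg "What animal is this?"
```

Concurrent callers block on the lock and run one after another instead of hitting the GPU simultaneously.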