This skill identifies the contents of an image and provides a kid-friendly narration with optional bilingual output.
npx playbooks add skill hmbown/minimax-cli --skill photo-learning
Review the files below or copy the command above to add this skill to your agents.
---
name: photo-learning
description: Recognize a photo and narrate a kid-friendly explanation using image understanding + TTS.
allowed-tools: analyze_image, tts
---
You are running the Photo Learning skill.
Goal
- Identify what's in a photo and produce a short, kid-friendly explanation plus narration.
Ask for
- Image path.
- Age range and language(s).
- Preferred tone (gentle, playful, curious).
Workflow
1) Call analyze_image with a prompt that asks for a simple, child-friendly explanation and (optionally) bilingual output.
2) Use the returned text as the narration script.
3) Call tts with output_format "mp3" unless the user requests wav.
4) Return the explanation text and audio path.
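The four workflow steps can be sketched in Python as follows. This is only an illustration: the skill names the tools analyze_image and tts but does not document their signatures, so the callables below (and their argument order), the PhotoLearningResult type, and the prompt wording are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PhotoLearningResult:
    """Hypothetical container for what step 4 returns."""
    explanation: str
    audio_path: str


def run_photo_learning(
    image_path: str,
    analyze_image: Callable[[str, str], str],  # assumed: (image_path, prompt) -> text
    tts: Callable[[str, str], str],            # assumed: (text, output_format) -> audio path
    output_format: str = "mp3",
) -> PhotoLearningResult:
    # Step 3 allows only mp3 (the default) or wav on request.
    if output_format not in ("mp3", "wav"):
        raise ValueError("output_format must be 'mp3' or 'wav'")

    # Step 1: ask for a simple, child-friendly explanation (wording is illustrative).
    prompt = "Explain what is in this photo in short, simple sentences for a young child."
    script = analyze_image(image_path, prompt)

    # Steps 2 and 3: the returned text is used unchanged as the narration script.
    audio_path = tts(script, output_format)

    # Step 4: return the explanation text and the audio path.
    return PhotoLearningResult(explanation=script, audio_path=audio_path)
```

Passing the tools in as parameters keeps the sketch independent of any particular client library; in an agent run, the real tool calls would take their place.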
Response style
- Keep it short and clear.
- Provide a clean output summary.
This skill identifies objects, scenes, and actions in a photo and generates a short, kid-friendly explanation plus a narrated audio file. It supports age-tailored language, optional bilingual output, and selectable tones like gentle, playful, or curious. The output includes a clean text summary and a ready-to-play audio file (MP3 by default).
Provide an image path, target age range, language(s), and preferred tone. The skill analyzes the image to produce a simple, concrete script formatted for children, optionally in two languages. That script is fed to a TTS engine to produce an MP3 (or WAV if requested). The final response returns the explanation text and the audio file path.
Can I get the narration in two languages?
Yes. Request bilingual output and list the two languages; the script will include both languages and the TTS will produce one audio file with the combined narration.
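One way the bilingual request might be folded into the analyze_image prompt is sketched below. The build_prompt helper, its parameters, and the prompt wording are hypothetical and are not part of the skill itself; the skill only says the prompt should ask for child-friendly and optionally bilingual output.

```python
from typing import Sequence


def build_prompt(age_range: str, languages: Sequence[str], tone: str) -> str:
    """Hypothetical helper: compose the image-analysis prompt from the user's answers."""
    if not 1 <= len(languages) <= 2:
        raise ValueError("expected one language, or two for bilingual output")

    parts = [
        f"Describe this photo for a child aged {age_range} in a {tone} tone.",
        "Use short, simple sentences.",
    ]
    if len(languages) == 2:
        # Bilingual: both versions go into one script, so one audio file covers both.
        parts.append(
            f"Give the explanation first in {languages[0]}, then in {languages[1]}."
        )
    else:
        parts.append(f"Write the explanation in {languages[0]}.")
    return " ".join(parts)
```

For example, build_prompt("4-6", ["English", "Spanish"], "playful") yields a single prompt that requests the English explanation followed by the Spanish one.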
What audio format is produced?
MP3 is the default for compatibility. Request WAV if you need uncompressed audio.