home / skills / heygen-com / skills / heygen
This skill helps you generate AI avatar videos and manage prompts, scenes, and translations via HeyGen API for Remotion workflows.
npx playbooks add skill heygen-com/skills --skill heygenReview the files below or copy the command above to add this skill to your agents.
---
name: heygen
description: |
HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generate, (3) Working with HeyGen avatars, voices, backgrounds, or captions, (4) Creating transparent WebM videos for compositing, (5) Polling video status or handling webhooks, (6) Integrating HeyGen with Remotion for programmatic video, (7) Translating or dubbing existing videos, (8) Generating standalone TTS audio with the Starfish model via /v1/audio.
homepage: https://docs.heygen.com/reference/generate-video-agent
allowed-tools: mcp__heygen__*
metadata:
openclaw:
requires:
env:
- HEYGEN_API_KEY
primaryEnv: HEYGEN_API_KEY
---
# HeyGen API
AI avatar video creation API for generating talking-head videos, explainers, and presentations.
## Tool Selection
If HeyGen MCP tools are available (`mcp__heygen__*`), **prefer them** over direct HTTP API calls — they handle authentication and request formatting automatically.
| Task | MCP Tool | Fallback (Direct API) |
|------|----------|----------------------|
| Generate video from prompt | `mcp__heygen__generate_video_agent` | `POST /v1/video_agent/generate` |
| Check video status / get URL | `mcp__heygen__get_video` | `GET /v1/video_status.get` |
| List account videos | `mcp__heygen__list_videos` | `GET /v1/video.list` |
| Generate TTS audio | `mcp__heygen__text_to_speech` | `POST /v1/audio/text_to_speech` |
| List TTS voices | `mcp__heygen__list_audio_voices` | `GET /v1/audio/voices` |
| Delete a video | `mcp__heygen__delete_video` | `DELETE /v1/video.delete` |
If no HeyGen MCP tools are available, use direct HTTP API calls with `X-Api-Key: $HEYGEN_API_KEY` header as documented in the reference files.
## Default Workflow
**Prefer Video Agent** for most video requests.
Always use [prompt-optimizer.md](references/prompt-optimizer.md) guidelines to structure prompts with scenes, timing, and visual styles.
**With MCP tools:**
1. Write an optimized prompt using [prompt-optimizer.md](references/prompt-optimizer.md) → [visual-styles.md](references/visual-styles.md)
2. Call `mcp__heygen__generate_video_agent` with prompt and config (duration_sec, orientation, avatar_id)
3. Call `mcp__heygen__get_video` with the returned video_id to poll status and get the download URL
**Without MCP tools (direct API):**
1. Write an optimized prompt using [prompt-optimizer.md](references/prompt-optimizer.md) → [visual-styles.md](references/visual-styles.md)
2. `POST /v1/video_agent/generate` — see [video-agent.md](references/video-agent.md)
3. `GET /v1/video_status.get?video_id=<id>` — see [video-status.md](references/video-status.md)
Only use v2/video/generate when user explicitly needs:
- Exact script without AI modification
- Specific voice_id selection
- Different avatars/backgrounds per scene
- Precise per-scene timing control
- Programmatic/batch generation with exact specs
## Quick Reference
| Task | MCP Tool | Read |
|------|----------|------|
| Generate video from prompt (easy) | `mcp__heygen__generate_video_agent` | [prompt-optimizer.md](references/prompt-optimizer.md) → [visual-styles.md](references/visual-styles.md) → [video-agent.md](references/video-agent.md) |
| Generate video with precise control | — | [video-generation.md](references/video-generation.md), [avatars.md](references/avatars.md), [voices.md](references/voices.md) |
| Check video status / get download URL | `mcp__heygen__get_video` | [video-status.md](references/video-status.md) |
| Add captions or text overlays | — | [captions.md](references/captions.md), [text-overlays.md](references/text-overlays.md) |
| Transparent video for compositing | — | [video-generation.md](references/video-generation.md) (WebM section) |
| Generate standalone TTS audio | `mcp__heygen__text_to_speech` | [text-to-speech.md](references/text-to-speech.md) |
| List TTS voices | `mcp__heygen__list_audio_voices` | [voices.md](references/voices.md) |
| Translate/dub existing video | — | [video-translation.md](references/video-translation.md) |
| Use with Remotion | — | [remotion-integration.md](references/remotion-integration.md) |
## Reference Files
### Foundation
- [references/authentication.md](references/authentication.md) - API key setup and X-Api-Key header
- [references/quota.md](references/quota.md) - Credit system and usage limits
- [references/video-status.md](references/video-status.md) - Polling patterns and download URLs
- [references/assets.md](references/assets.md) - Uploading images, videos, audio
### Core Video Creation
- [references/avatars.md](references/avatars.md) - Listing avatars, styles, avatar_id selection
- [references/voices.md](references/voices.md) - Listing voices, locales, speed/pitch
- [references/scripts.md](references/scripts.md) - Writing scripts, pauses, pacing
- [references/video-generation.md](references/video-generation.md) - POST /v2/video/generate and multi-scene videos
- [references/video-agent.md](references/video-agent.md) - One-shot prompt video generation
- [references/prompt-optimizer.md](references/prompt-optimizer.md) - Writing effective Video Agent prompts (core workflow + rules)
- [references/visual-styles.md](references/visual-styles.md) - 20 named visual styles with full specs
- [references/prompt-examples.md](references/prompt-examples.md) - Full production prompt example + ready-to-use templates
- [references/dimensions.md](references/dimensions.md) - Resolution and aspect ratios
### Video Customization
- [references/backgrounds.md](references/backgrounds.md) - Solid colors, images, video backgrounds
- [references/text-overlays.md](references/text-overlays.md) - Adding text with fonts and positioning
- [references/captions.md](references/captions.md) - Auto-generated captions and subtitles
### Advanced Features
- [references/templates.md](references/templates.md) - Template listing and variable replacement
- [references/video-translation.md](references/video-translation.md) - Translating videos and dubbing
- [references/text-to-speech.md](references/text-to-speech.md) - Standalone TTS audio with Starfish model
- [references/photo-avatars.md](references/photo-avatars.md) - Creating avatars from photos
- [references/webhooks.md](references/webhooks.md) - Webhook endpoints and events
### Integration
- [references/remotion-integration.md](references/remotion-integration.md) - Using HeyGen in Remotion compositions
This skill provides a concise interface and best practices for generating AI avatar videos with HeyGen. It covers using the Video Agent for one-shot prompt-to-video generation and the v2/video/generate endpoint when you need exact per-scene control. The skill also explains creating transparent WebM outputs, standalone TTS audio, and integrating HeyGen output into Remotion pipelines.
Use the Video Agent API (POST /v1/video_agent/generate) for most prompt-driven video requests; structure prompts with scenes, timing, and visual style using prompt-optimizer guidelines. Use POST /v2/video/generate when you require strict script fidelity, specific voice_id or avatar/background per scene, or programmatic batch generation. Poll video status endpoints or use webhooks to detect completion and retrieve download URLs. For audio-only, call the Starfish TTS model via /v1/audio to produce standalone speech.
When should I choose Video Agent vs v2/video/generate?
Use Video Agent for quick, AI-curated prompt-to-video creation. Choose v2/video/generate when you need exact script fidelity, specific voice/avatar per scene, or strict per-scene timing.
How do I get notified when a video is ready?
Either poll the video status endpoint following recommended patterns or configure webhooks to receive completion events and download URLs.