home / skills / openclaw / skills / dream-avatar

dream-avatar skill

safe

/skills/hy-1990/dream-avatar

This skill generates talking avatar videos from images and audio via DreamAvatar API, enabling quick creation of digital human videos.

npx playbooks add skill openclaw/skills --skill dream-avatar

Review the files below or copy the command above to add this skill to your agents.

Files (3)

SKILL.md

5.4 KB

---
name: dream-avatar
description: "Generate digital human talking avatar videos from images and audio using DreamAvatar 3.0 Fast API. Powered by Dreamface - AI tools for everyone. Visit https://tools.dreamfaceapp.com/home for more AI products including mobile app and web/PC versions. Supports both URL input and local file upload. Use when user wants to create a talking avatar video. API resources: https://api.newportai.com/api-reference/get-started"
metadata: {"openclaw": {"emoji": "🎬", "requires": {"env": ["DREAM_API_KEY"]}, "primaryEnv": "DREAM_API_KEY"}}
---

# DreamAvatar - Digital Human Video Generator

Generate talking avatar videos from images and audio using DreamAvatar 3.0 Fast API.

## Quick Start

### 1. Get API Key

Before using this skill, you need a DreamAPI API key. Visit **https://api.newportai.com/api-reference/get-started** to sign up and get your API key.

For more AI tools, please visit: **https://tools.dreamfaceapp.com/home**

### 2. Configure API Key

Users must set their DreamAPI key in OpenClaw config:

```bash
openclaw config patch --json '{"skills": {"entries": {"dream-avatar": {"env": {"DREAM_API_KEY": "your-api-key-here"}}}}}'
```

Or add to `~/.openclaw/openclaw.json`:
```json
{
  "skills": {
    "entries": {
      "dream-avatar": {
        "env": {
          "DREAM_API_KEY": "your-api-key"
        }
      }
    }
  }
}
```

### 3. Generate Video

You can provide input in two ways:

#### Option A: Local Files (Recommended)
- **image**: Path to a local image file (jpg, jpeg, png, webp, gif)
- **audio**: Path to a local audio file (mp3, wav, mp4, max 3 minutes)
- **prompt**: Description of the expression/behavior
- **resolution** (optional): "480p" or "720p", default "480p"

The skill will automatically upload your local files and generate the video.

#### Option B: Public URLs
- **image**: URL to a publicly accessible image (jpg, jpeg, png, webp, gif)
- **audio**: URL to a publicly accessible audio (mp3, wav, mp4, max 3 minutes)
- **prompt**: Description of the expression/behavior
- **resolution** (optional): "480p" or "720p", default "480p"

## API Details

### 1. Get Upload Policy
```
POST https://api.newportai.com/api/file/v1/get_policy
```

**Headers:**
```
Authorization: Bearer {DREAM_API_KEY}
Content-Type: application/json
```

**Request Body:**
```json
{
  "scene": "Dream-CN"
}
```

**Response:**
```json
{
  "code": 0,
  "message": "success",
  "data": {
    "accessId": "LTAI5tF1QzxoHGvEcziVACyc",
    "policy": "eyJ0...",
    "signature": "G2TzrhlybemHbfFakysY4j2EI2I=",
    "dir": "tmp/dream/2024-11-19/5369207820989002/",
    "host": "https://...",
    "expire": "1732005888",
    "callback": "eyJ0..."
  }
}
```

### 2. Upload File to OSS

**Form Data:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| policy | string | Yes | Upload policy from get_policy API |
| OSSAccessKeyId | string | Yes | Access ID from get_policy API |
| success_action_status | string | Yes | Fixed value: "200" |
| signature | string | Yes | Signature from get_policy API |
| key | string | Yes | Full path: dir + filename |
| callback | string | Yes | Callback from get_policy API |
| file | binary | Yes | File content (must be last parameter) |

**Response:**
```json
{
  "code": 0,
  "message": "success",
  "data": {
    "verifyStatus": false,
    "reqId": "732c9caa-0a2e-4aa1-87d9-52430a8f0314"
  }
}
```

The uploaded file URL will be: `{host}/{key}`

### 3. Generate Video (Image to Video)
```
POST https://api.newportai.com/api/async/dreamavatar/image_to_video/3.0fast
```

**Headers:**
```
Authorization: Bearer {DREAM_API_KEY}
Content-Type: application/json
```

**Request Body:**
```json
{
  "image": "https://.../photo.jpg",
  "audio": "https://.../speech.mp3",
  "prompt": "a person smiling and speaking",
  "resolution": "480p"
}
```

**Response (Task Submission):**
```json
{
  "code": 0,
  "message": "success",
  "data": {
    "taskId": "aa4bf173ffd84c2f8734d536d6a7b5a7"
  }
}
```

### 4. Polling for Result
Use the taskId to poll for results:
```
POST https://api.newportai.com/api/getAsyncResult
```

Request body:
```json
{
  "taskId": "your-task-id"
}
```

**Response When Complete:**
```json
{
  "code": 0,
  "message": "success",
  "data": {
    "task": {
      "taskId": "aa4bf173ffd84c2f8734d536d6a7b5a7",
      "status": 3,
      "taskType": "dreamavatar/image_to_video/3.0fast"
    },
    "videos": [
      {
        "videoType": "mp4",
        "videoUrl": "https://...video.mp4"
      }
    ]
  }
}
```

## Implementation Notes

- **Local files**: The skill handles upload automatically. Files are uploaded to OSS and URLs are generated.
- **URL input**: Image and audio URLs must be publicly accessible (not behind authentication)
- **Audio duration**: Cannot exceed 3 minutes
- **The API is async**: You must poll for results
- **Poll every 2-3 seconds**, with a timeout of ~60 seconds
- **Status codes**: 0 = pending, 1 = processing, 2 = processing (not failed!), 3 = completed, 4 = timeout
- **Uploaded file URLs**: Only valid for 1 day (Storage service is free but for AI program support only)

## Upload Process Flow

```
Local File → Get Upload Policy → Upload to OSS → Get Public URL → Generate Video
```

The skill implements this flow automatically when local files are provided:
1. Call `get_policy` to receive OSS credentials
2. Upload image/audio to OSS
3. Construct public URL from host + key
4. Call DreamAvatar API with the uploaded URLs
5. Poll for video completion
6. Return the final video URL

Overview

This skill generates digital human talking avatar videos from a single image and an audio file using the DreamAvatar 3.0 Fast API. It creates short, lip-synced MP4 videos by combining a publicly hosted image, an audio track (up to 3 minutes), and a textual prompt describing expression or behavior. The workflow is asynchronous: you submit a job and poll for the finished video URL.

How this skill works

You provide a public image URL, a public audio URL, and an optional prompt and resolution. The skill posts to the DreamAvatar async endpoint and returns a taskId. You then poll the API using the taskId until the job completes and the service returns an MP4 video URL. The API enforces public assets and a maximum audio length of three minutes.

When to use it

Create a short talking avatar from a headshot and a recorded script
Generate marketing or social clips where a digital human delivers a message
Prototype avatar-driven voiceovers for apps or presentations
Produce demo videos for customer support, onboarding, or training
Quickly turn podcast or narration snippets into visual content

Best practices

Use high-resolution, well-lit headshots with a clear face and neutral background for the best results
Host image and audio files on publicly accessible URLs (no authentication or expiring links)
Keep audio under 3 minutes to avoid rejection and to ensure fast processing
Provide a concise prompt describing expression, mood, and speaking style (e.g., “warm smile, conversational tone”)
Poll the async result endpoint every 2–3 seconds and implement a sensible timeout (about 60 seconds) to handle processing delays

Example use cases

Marketing clip: turn a product spokesperson photo plus a 30s script into an MP4 for ads
Onboarding: convert short training voiceovers into an avatar-led explainer video
Social content: make quick Instagram/TikTok talking-head videos from voice notes
Support: generate a personalized troubleshooting clip where an avatar reads instructions
Prototyping: test different expressions/prompts to iterate on avatar personality

FAQ

What file types are supported for image and audio?

Images: jpg, jpeg, png, webp, gif. Audio: mp3, wav, mp4. Both must be accessible via public URLs.

How long will it take to get the final video?

The API is asynchronous. Typical turnaround is seconds to a minute; poll every 2–3 seconds and timeout around 60 seconds. Longer queues can extend this.

What do status codes mean when polling?

Status 0 = pending, 1 = processing, 2 = failed, 3 = completed. When completed you receive an MP4 videoUrl in the response.