veo-video-generator skill

/skills/kghamilton89/veo-video-generator

This skill generates high-fidelity 1080p videos with synced audio from text prompts using Veo 3.1, delivering ready-to-use cinematic clips.

npx playbooks add skill openclaw/skills --skill veo-video-generator

SKILL.md

---
name: veo-video-generator
description: Generates high-fidelity 1080p videos with synced audio using Google Veo 3.1. Use for creating cinematic clips from text descriptions.
metadata:
  clawdbot:
    emoji: "🎬"
    requires:
      env: ["GEMINI_API_KEY"]
      bins: ["node", "npm"]
    install: "npm install"
    primaryEnv: "GEMINI_API_KEY"
    category: "Video & Media"
---

# Veo Video Generator

Generates short video clips with native audio using Google's state-of-the-art Veo 3.1 model.

## Instructions

1. **Trigger**: Activate when the user wants to create or render a video.
2. **Setup**: The agent must run `npm install` once before the first execution to fetch dependencies.
3. **Execution**: Invoke the script by passing the user prompt as a **separate argument**, never by interpolating it into a shell command string. Use an argument array / `execFile`-style invocation so the shell never parses the prompt value. Example (Node-style pseudo-code):

   ```javascript
   const { execFile } = require('node:child_process');
   execFile('node', ['generate.js', '--prompt', userPrompt]);
   ```

   Do **not** construct the command as a single concatenated string such as `"node generate.js --prompt " + userPrompt`.

4. **Resolution**: The script outputs 1080p video in a 9:16 aspect ratio by default.
5. **Completion**: Provide the user with the filename of the generated .mp4 in the workspace.

## Security & Privacy

- **Shell Injection Prevention**: The user prompt **must** be passed as a discrete argument (e.g. via `execFile` or an argv array), never interpolated into a shell command string. Concatenating user input into a shell string (e.g. `shell: true` with template literals) enables shell injection and is strictly forbidden.
- **Instruction Scope**: This skill only sends text prompts to the Google GenAI API.
- **Environment**: It uses the `GEMINI_API_KEY` provided by the OpenClaw environment.
- **Data Access**: It does not read local files or .env files. All configuration is handled by the agent.
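
The shell-injection rule above can be illustrated with a minimal sketch. `buildArgs` and `runGenerator` are hypothetical helper names, not part of the skill; only `generate.js` and the `--prompt` flag come from the instructions. The point is that a prompt containing shell metacharacters stays a single argv element, because `execFile` does not start a shell.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical helper: builds the argv array for the generator script.
// The prompt is never joined into a command string.
function buildArgs(userPrompt) {
  if (typeof userPrompt !== 'string' || userPrompt.trim() === '') {
    throw new TypeError('prompt must be a non-empty string');
  }
  return ['generate.js', '--prompt', userPrompt];
}

// Hypothetical helper: starts the generator without a shell.
// The callback receives (error, stdout, stderr), as with any execFile call.
function runGenerator(userPrompt, callback) {
  return execFile('node', buildArgs(userPrompt), callback);
}

// A hostile-looking prompt is still just one argument after '--prompt'.
const hostile = 'a sunset over the sea"; rm -rf ~ #';
console.log(buildArgs(hostile));
```

Rejecting empty prompts is an added assumption for illustration; the skill itself does not state how it validates input.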

Overview

This skill generates high-fidelity 1080p cinematic video clips with synced native audio using Google Veo 3.1. It converts short text descriptions into 9:16 MP4 videos and returns the output filename for immediate use. It is suited to rapid prototyping of cinematic scenes and social-format clips from text prompts.

How this skill works

The agent accepts a user prompt and runs a Node.js-based generator that calls the Veo 3.1 model to synthesize video and audio. Before the first run, the agent installs the required dependencies with `npm install`. Each run produces a 1080p MP4 in a 9:16 aspect ratio, and the skill returns the filename of the generated file in the workspace. Only the text prompt is sent to the external API; no local files or .env files are read.
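
The preconditions in this flow can be sketched as a small preflight check. This is an illustration under assumptions, not the skill's actual code: `needsInstall` and `preflight` are hypothetical names, and treating a missing `node_modules` directory as "dependencies not installed" is a heuristic chosen for the example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical check: a missing node_modules directory is taken to mean
// that `npm install` has not been run yet in the skill directory.
function needsInstall(skillDir) {
  return !fs.existsSync(path.join(skillDir, 'node_modules'));
}

// Hypothetical preflight: lists what still has to happen before
// generate.js can run. The env parameter defaults to process.env.
function preflight(skillDir, env = process.env) {
  const steps = [];
  if (needsInstall(skillDir)) steps.push('npm install');
  if (!env.GEMINI_API_KEY) steps.push('set GEMINI_API_KEY');
  return steps;
}

// An empty array means the generator is ready to run.
console.log(preflight(process.cwd()));
```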

When to use it

Create quick cinematic clips from a short text description.
Prototype social-format (9:16) video concepts for review.
Generate B-roll style footage with synced audio for editing.
Produce short marketing or demo clips when you need fast iterations.
Automate bulk generation of short video variants from different prompts.
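
For the bulk case, one possible approach is to run the generator once per prompt and record the outcome of each run. The sketch below makes two assumptions that the skill does not confirm: that `generate.js` writes something useful (such as the output filename) to stdout, and that sequential runs are acceptable. `runOne` and `generateBatch` are hypothetical names.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Default runner: one generate.js process per prompt, with the prompt
// passed as a discrete argument. Assumes the script reports on stdout.
async function runOne(prompt) {
  const { stdout } = await execFileAsync('node', ['generate.js', '--prompt', prompt]);
  return stdout.trim();
}

// Runs prompts one after another and records success or failure for each,
// so a single failed generation does not abort the rest of the batch.
async function generateBatch(prompts, run = runOne) {
  const results = [];
  for (const prompt of prompts) {
    try {
      results.push({ prompt, ok: true, output: await run(prompt) });
    } catch (err) {
      results.push({ prompt, ok: false, error: err.message });
    }
  }
  return results;
}
```

A call such as `generateBatch(['rainy neon street at night', 'sunrise over dunes'])` resolves to one result object per prompt. The `run` parameter exists so the runner can be swapped out, for example with a stub during testing.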

Best practices

Craft concise, vivid prompts focusing on scene, mood, and key actions.
Include audio direction in the prompt if a specific sound or music style is required.
Run npm install once in the environment before generating videos to ensure dependencies are present.
Test with short prompts first to verify style and pacing, then refine for longer scenes.
Keep file management simple: collect filenames returned after execution and move or rename as needed.

Example use cases

Turn product descriptions into short cinematic promo clips for social ads.
Generate atmospheric scene previews for a game or film concept pitch.
Produce multiple stylistic variations of a short scene to compare lighting and mood.
Create short narrated or music-backed clips for storyboarding or client review.

FAQ

What resolution and aspect ratio are produced?

Outputs are 1080p at a 9:16 aspect ratio by default and saved as MP4 files.

Do I need to provide any local files or environment variables?

No local files are needed. The skill sends only text prompts to the API and reads GEMINI_API_KEY from the runtime environment, so the key must be set there. It does not read local files or .env files.