veo-video-generator skill

/skills/kghamilton89/veo-video-generator

This skill generates high-fidelity 1080p videos with synced audio from text prompts using Veo 3.1, delivering ready-to-use cinematic clips.

npx playbooks add skill openclaw/skills --skill veo-video-generator

SKILL.md
---
name: veo-video-generator
description: Generates high-fidelity 1080p videos with synced audio using Google Veo 3.1. Use for creating cinematic clips from text descriptions.
metadata:
  clawdbot:
    emoji: "🎬"
    requires:
      env: ["GEMINI_API_KEY"]
      bins: ["node", "npm"]
    install: "npm install"
    primaryEnv: "GEMINI_API_KEY"
    category: "Video & Media"
---

# Veo Video Generator

Generates short video clips with native audio using Google's state-of-the-art Veo 3.1 model.

## Instructions

1. **Trigger**: Activate when the user wants to create or render a video.
2. **Setup**: The agent must run `npm install` once before the first execution to fetch dependencies.
3. **Execution**: Invoke the script by passing the user prompt as a **separate argument**, never by interpolating it into a shell command string. Use an argument array / `execFile`-style invocation so the shell never parses the prompt value. Example (Node-style pseudo-code):

   ```javascript
   const { execFile } = require('node:child_process');
   execFile('node', ['generate.js', '--prompt', userPrompt]);
   ```

   Do **not** construct the command as a single concatenated string such as `"node generate.js --prompt " + userPrompt`.

4. **Resolution**: The script outputs 1080p video in a 9:16 aspect ratio by default.
5. **Completion**: Give the user the filename of the generated `.mp4` file in the workspace.

## Security & Privacy

- **Shell Injection Prevention**: The user prompt **must** be passed as a discrete argument (e.g. via `execFile` or an argv array), never interpolated into a shell command string. Concatenating user input into a shell string (e.g. `shell: true` with template literals) enables shell injection and is strictly forbidden.
- **Instruction Scope**: This skill only sends text prompts to the Google GenAI API.
- **Environment**: It uses the `GEMINI_API_KEY` provided by the OpenClaw environment.
- **Data Access**: It does not read local files or `.env` files. The API key comes from the process environment, and all other configuration is handled by the agent.
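
The injection rule above can be illustrated with a minimal sketch. `buildGenerateArgs` and `runGenerator` are hypothetical helper names, not part of this skill; only `generate.js` and `--prompt` come from the instructions. The point is that the prompt stays one array element, so shell metacharacters in it are never interpreted.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical helper: builds the argv array for generate.js.
// The prompt is kept as a single array element, so no shell ever parses it.
function buildGenerateArgs(userPrompt) {
  if (typeof userPrompt !== 'string' || userPrompt.trim() === '') {
    throw new TypeError('prompt must be a non-empty string');
  }
  return ['generate.js', '--prompt', userPrompt];
}

// Hypothetical wrapper: execFile does not spawn a shell by default,
// so the argv array is handed to the node process as-is.
function runGenerator(userPrompt, callback) {
  return execFile('node', buildGenerateArgs(userPrompt), callback);
}

// A prompt full of shell metacharacters remains one literal argument.
const args = buildGenerateArgs('a sunset; rm -rf ~ $(whoami)');
console.log(args.length); // 3
```

By contrast, `exec("node generate.js --prompt " + userPrompt)` would hand the same text to a shell, which would treat `;` and `$(...)` as syntax.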

Overview

This skill generates high-fidelity 1080p cinematic video clips with synced native audio using Google Veo 3.1. It converts short text descriptions into polished 9:16 MP4 videos and returns the output filename for immediate use. It is suited to rapid prototyping of cinematic scenes or social-format clips from text prompts.

How this skill works

The agent accepts a user prompt and runs a Node.js-based generator that calls the Veo 3.1 model to synthesize video and audio. Before the first run, the agent installs the required dependencies with npm install. Execution produces a 1080p MP4 in a 9:16 aspect ratio, and the skill returns the generated filename in the workspace. Only the text prompt is sent to the external API; no local files or .env files are read.
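
The flow described above could be sketched on the agent side roughly as follows. This is an illustration under stated assumptions, not the skill's actual code: `needsInstall`, `newestMp4`, and `generateVideo` are hypothetical names, a missing `node_modules` directory is assumed to mean dependencies are not installed, and the script is assumed to write its `.mp4` into the skill directory.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');
const fs = require('node:fs');
const path = require('node:path');

const execFileAsync = promisify(execFile);

// Assumption: no node_modules directory means `npm install` has not run yet.
function needsInstall(skillDir) {
  return !fs.existsSync(path.join(skillDir, 'node_modules'));
}

// Assumption: the output lands in `dir`; return the most recently modified .mp4.
function newestMp4(dir) {
  const candidates = fs.readdirSync(dir)
    .filter((name) => name.toLowerCase().endsWith('.mp4'))
    .map((name) => ({ name, mtime: fs.statSync(path.join(dir, name)).mtimeMs }))
    .sort((a, b) => b.mtime - a.mtime);
  return candidates.length > 0 ? candidates[0].name : null;
}

// Hypothetical orchestration: install once, run the generator, report the file.
async function generateVideo(skillDir, userPrompt) {
  if (needsInstall(skillDir)) {
    // On Windows the executable is npm.cmd; adjust as needed.
    await execFileAsync('npm', ['install'], { cwd: skillDir });
  }
  // The prompt is passed as its own argv element, never via a shell string.
  await execFileAsync('node', ['generate.js', '--prompt', userPrompt], { cwd: skillDir });
  return newestMp4(skillDir);
}
```

The child process inherits the parent's environment by default, so `GEMINI_API_KEY` reaches the script without being read from a file.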

When to use it

  • Create quick cinematic clips from a short text description.
  • Prototype social-format (9:16) video concepts for review.
  • Generate B-roll style footage with synced audio for editing.
  • Produce short marketing or demo clips when you need fast iterations.
  • Automate bulk generation of short video variants from different prompts.
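
For the bulk-generation case, a batch loop might look like the sketch below. `runOne` and `generateVariants` are hypothetical names, and what `generate.js` prints to standard output is not specified in the source, so the captured output is recorded as-is rather than interpreted.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Hypothetical default runner: one generate.js call per prompt, no shell involved.
async function runOne(prompt) {
  const { stdout } = await execFileAsync('node', ['generate.js', '--prompt', prompt]);
  return stdout.trim();
}

// Runs prompts one after another and records success or failure for each,
// so a single failed render does not abort the whole batch.
async function generateVariants(prompts, run = runOne) {
  const results = [];
  for (const prompt of prompts) {
    try {
      results.push({ prompt, ok: true, output: await run(prompt) });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ prompt, ok: false, error: message });
    }
  }
  return results;
}
```

Running the prompts sequentially is a conservative choice here; whether the API tolerates parallel requests is not stated in the source.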

Best practices

  • Craft concise, vivid prompts focusing on scene, mood, and key actions.
  • Include audio direction in the prompt if a specific sound or music style is required.
  • Run npm install once in the environment before generating videos to ensure dependencies are present.
  • Test with short prompts first to verify style and pacing, then refine for longer scenes.
  • Keep file management simple: collect filenames returned after execution and move or rename as needed.
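
The prompt advice above can be made concrete with a small helper. `composePrompt` is a hypothetical function, and the "Audio:" label is only one possible way to phrase audio direction, not a format required by the model.

```javascript
// Hypothetical helper: joins scene, mood, and key action into one concise prompt,
// then appends optional audio direction as a separate sentence.
function composePrompt({ scene, mood, action, audio } = {}) {
  const parts = [scene, mood, action]
    .map((s) => (s || '').trim())
    .filter(Boolean);
  if (parts.length === 0) {
    throw new Error('at least one of scene, mood, or action is required');
  }
  let prompt = parts.join(', ') + '.';
  const audioText = (audio || '').trim();
  if (audioText) {
    prompt += ` Audio: ${audioText}.`;
  }
  return prompt;
}

console.log(composePrompt({
  scene: 'A lighthouse at dusk',
  mood: 'calm and melancholic',
  action: 'waves breaking on the rocks',
  audio: 'low strings and distant gulls',
}));
// A lighthouse at dusk, calm and melancholic, waves breaking on the rocks. Audio: low strings and distant gulls.
```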

Example use cases

  • Turn product descriptions into short cinematic promo clips for social ads.
  • Generate atmospheric scene previews for a game or film concept pitch.
  • Produce multiple stylistic variations of a short scene to compare lighting and mood.
  • Create short narrated or music-backed clips for storyboarding or client review.

FAQ

What resolution and aspect ratio are produced?

Outputs are 1080p at a 9:16 aspect ratio by default and saved as MP4 files.

Do I need to provide any local files or environment variables?

No local files are needed: the skill only sends text prompts to the API and does not read local files or .env files. It does require the GEMINI_API_KEY environment variable, which the runtime provides.