veo-use skill

This skill helps you generate and edit videos using Google's Veo models by guiding prompts, masks, and durations with explicit defaults.

npx playbooks add skill cnemri/google-genai-skills --skill veo-use

Files (7)
SKILL.md
---
name: veo-use
description: "Create and edit videos using Google's Veo 2 and Veo 3 models. Supports Text-to-Video, Image-to-Video, Reference-to-Video, Inpainting, and Video Extension. Available parameters: prompt, image, mask, mode, duration, aspect-ratio. Always confirm parameters with the user or explicitly state defaults before running."
---

# Veo Use

Use this skill to generate and edit videos using Google's Veo models (`veo-3.1` and `veo-2.0`).

This skill uses portable Python scripts managed by `uv`.

## Prerequisites

Ensure you have one of the following authentication methods configured in your environment:

1.  **API Key**:
    -   `GOOGLE_API_KEY` or `GEMINI_API_KEY`

2.  **Vertex AI**:
    -   `GOOGLE_CLOUD_PROJECT`
    -   `GOOGLE_CLOUD_LOCATION`
    -   `GOOGLE_GENAI_USE_VERTEXAI=1`
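For example, a Vertex AI setup might look like the following (the project id and location shown are placeholders; substitute your own):

```shell
# Placeholder values -- replace with your own project and region.
export GOOGLE_CLOUD_PROJECT="my-gcp-project"
export GOOGLE_CLOUD_LOCATION="us-central1"
export GOOGLE_GENAI_USE_VERTEXAI=1
```

For API-key auth, exporting `GOOGLE_API_KEY` (or `GEMINI_API_KEY`) alone is sufficient.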

## Usage

### 1. Text to Video

Generate a video purely from a text description.

```bash
uv run skills/veo-use/scripts/text_to_video.py "A cinematic drone shot of a futuristic city" --output city.mp4
```

### 2. Image to Video

Generate a video that starts from a static input image.

```bash
uv run skills/veo-use/scripts/image_to_video.py "Zoom out from the flower" --image start.png --output flower.mp4
```

### 3. Reference to Video

Use specific asset images (subjects, products) to guide generation.

```bash
uv run skills/veo-use/scripts/reference_to_video.py "A man walking on the moon" --reference-image man.png --output moon_walk.mp4
```

### 4. Edit Video (Inpainting)

Modify existing videos using masks.

**Modes:**
-   `REMOVE`: Remove a moving (dynamic) object.
-   `REMOVE_STATIC`: Remove a static object (e.g., a watermark).
-   `INSERT`: Insert a new object (requires `--prompt`).

```bash
uv run skills/veo-use/scripts/edit_video.py --video input.mp4 --mask mask.png --mode INSERT --prompt "A flying car" --output edited.mp4
```

### 5. Extend Video

Extend the duration of an existing video clip.

```bash
uv run skills/veo-use/scripts/extend_video.py --video clip.mp4 --prompt "The car flies away into the sunset" --duration 6 --output extended.mp4
```

### Common Options

-   `--model`: Default `veo-3.1-generate-001`.
-   `--resolution`: `1080p` (default), `720p`, `4k`.
-   `--aspect-ratio`: `16:9` (default), `9:16`.
-   `--duration`: `6` (default), `4`, `8`.
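These options can be combined on any of the scripts above. The sketch below spells out the documented defaults as shell variables so they can be confirmed before running; the actual invocation is commented out and purely illustrative:

```shell
# The skill's documented defaults, overridable via environment variables.
MODEL="${MODEL:-veo-3.1-generate-001}"
RESOLUTION="${RESOLUTION:-1080p}"
ASPECT_RATIO="${ASPECT_RATIO:-16:9}"
DURATION="${DURATION:-6}"

echo "model=$MODEL resolution=$RESOLUTION aspect-ratio=$ASPECT_RATIO duration=${DURATION}s"

# Illustrative invocation (uncomment to run):
# uv run skills/veo-use/scripts/text_to_video.py "A cinematic drone shot of a futuristic city" \
#   --model "$MODEL" --resolution "$RESOLUTION" \
#   --aspect-ratio "$ASPECT_RATIO" --duration "$DURATION" --output city.mp4
```

Echoing the resolved values first matches the skill's rule of stating defaults explicitly before execution.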

## References

> **Before running scripts**, review the reference guides for prompting tips and best practices.

-   [Prompting Guide](references/prompting.md) - Camera angles, movements, lens effects, and visual styles

Overview

This skill creates and edits videos using Google's Veo models (veo-3.1 and veo-2.0). It supports Text-to-Video, Image-to-Video, Reference-to-Video, Inpainting, and Video Extension workflows. Always confirm parameters with the user or explicitly state defaults before running.

How this skill works

The skill runs portable Python scripts that call Veo models to generate or modify video assets. You supply a prompt plus optional inputs like an image, mask, reference images, mode, duration, and aspect ratio. The tool validates authentication environment variables and uses sensible defaults if you do not provide explicit parameters.

When to use it

  • Generate a short cinematic clip from a text description (Text-to-Video).
  • Animate a static image into a motion sequence (Image-to-Video).
  • Create a video that preserves the appearance of provided assets (Reference-to-Video).
  • Remove, replace, or insert objects in an existing clip using masks (Inpainting).
  • Extend the length of a clip with new content that flows from the original (Video Extension).

Best practices

  • Always confirm or explicitly state model, duration, resolution, aspect-ratio, and mode before running.
  • Provide clear, camera-aware prompts (camera angle, movement, lens, lighting) for predictable results.
  • Use high-quality reference images and clean masks for accurate edits and inpainting.
  • Start with the safe default values (veo-3.1-generate-001, 1080p, 16:9, duration 6s) and iterate.
  • Ensure environment auth is configured (API key or Vertex AI variables) before generating.
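The last point can be checked up front. A minimal preflight sketch (the `auth_mode` function name is made up for illustration; the variables are the ones this skill documents):

```shell
# Report which documented auth method the current environment satisfies.
auth_mode() {
  if [ -n "${GOOGLE_API_KEY:-}" ] || [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "api-key"
  elif [ -n "${GOOGLE_CLOUD_PROJECT:-}" ] && [ -n "${GOOGLE_CLOUD_LOCATION:-}" ] \
      && [ "${GOOGLE_GENAI_USE_VERTEXAI:-}" = "1" ]; then
    echo "vertex"
  else
    echo "none"
  fi
}
```

A result of `none` means no generation script should be run until one of the two auth methods is configured.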

Example use cases

  • Create a 6-second promo clip from a product description using Reference-to-Video.
  • Turn a portrait photo into a short reveal animation with Image-to-Video.
  • Remove a distracting object from a clip by supplying a mask and using REMOVE mode.
  • Insert a new element (INSERT mode) like a flying drone into an existing scene.
  • Extend a 4-second scene into a 10-second continuation with a descriptive prompt.

FAQ

What authentication is required?

Set GOOGLE_API_KEY or GEMINI_API_KEY for API key auth, or configure Vertex AI variables (GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, GOOGLE_GENAI_USE_VERTEXAI=1).

Do I need to specify all parameters every run?

No. The skill uses defaults (model veo-3.1-generate-001, 1080p, 16:9, duration 6s) but you must confirm or override them before execution.