
openai-image-gen skill


This skill batch-generates diverse prompts, renders them via the OpenAI Images API, and writes an index.html thumbnail gallery of the results.

This is most likely a fork of the openai-image-gen skill from openclaw. To add it to your agents, run:

npx playbooks add skill openclaw/openclaw --skill openai-image-gen

SKILL.md
---
name: openai-image-gen
description: Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.
homepage: https://platform.openai.com/docs/api-reference/images
metadata:
  {
    "openclaw":
      {
        "emoji": "🖼️",
        "requires": { "bins": ["python3"], "env": ["OPENAI_API_KEY"] },
        "primaryEnv": "OPENAI_API_KEY",
        "install":
          [
            {
              "id": "python-brew",
              "kind": "brew",
              "formula": "python",
              "bins": ["python3"],
              "label": "Install Python (brew)",
            },
          ],
      },
  }
---

# OpenAI Image Gen

Generate a handful of “random but structured” prompts and render them via the OpenAI Images API.

## Run

```bash
python3 {baseDir}/scripts/gen.py
open ~/Projects/tmp/openai-image-gen-*/index.html  # if ~/Projects/tmp exists; else ./tmp/...
```

Useful flags:

```bash
# GPT image models with various options
python3 {baseDir}/scripts/gen.py --count 16 --model gpt-image-1
python3 {baseDir}/scripts/gen.py --prompt "ultra-detailed studio photo of a lobster astronaut" --count 4
python3 {baseDir}/scripts/gen.py --size 1536x1024 --quality high --out-dir ./out/images
python3 {baseDir}/scripts/gen.py --model gpt-image-1.5 --background transparent --output-format webp

# DALL-E 3 (note: count is automatically limited to 1)
python3 {baseDir}/scripts/gen.py --model dall-e-3 --quality hd --size 1792x1024 --style vivid
python3 {baseDir}/scripts/gen.py --model dall-e-3 --style natural --prompt "serene mountain landscape"

# DALL-E 2
python3 {baseDir}/scripts/gen.py --model dall-e-2 --size 512x512 --count 4
```

## Model-Specific Parameters

Different models support different parameter values. The script automatically selects appropriate defaults based on the model.

### Size

- **GPT image models** (`gpt-image-1`, `gpt-image-1-mini`, `gpt-image-1.5`): `1024x1024`, `1536x1024` (landscape), `1024x1536` (portrait), or `auto`
  - Default: `1024x1024`
- **dall-e-3**: `1024x1024`, `1792x1024`, or `1024x1792`
  - Default: `1024x1024`
- **dall-e-2**: `256x256`, `512x512`, or `1024x1024`
  - Default: `1024x1024`

### Quality

- **GPT image models**: `auto`, `high`, `medium`, or `low`
  - Default: `high`
- **dall-e-3**: `hd` or `standard`
  - Default: `standard`
- **dall-e-2**: `standard` only
  - Default: `standard`
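
The size and quality rules above can be captured in a small lookup table. The following is a minimal sketch, not the actual contents of `gen.py`: the names `MODEL_OPTIONS`, `model_family`, and `resolve` are hypothetical, and treating every model whose name starts with `gpt-image` as a GPT image model is an assumption.

```python
# Hypothetical lookup table mirroring the Size and Quality lists above.
# The first entry of each tuple is the documented default.
MODEL_OPTIONS = {
    "gpt-image": {
        "size": ("1024x1024", "1536x1024", "1024x1536", "auto"),
        "quality": ("high", "auto", "medium", "low"),
    },
    "dall-e-3": {
        "size": ("1024x1024", "1792x1024", "1024x1792"),
        "quality": ("standard", "hd"),
    },
    "dall-e-2": {
        "size": ("1024x1024", "256x256", "512x512"),
        "quality": ("standard",),
    },
}


def model_family(model):
    """Map a model name to a key of MODEL_OPTIONS (assumed prefix rule)."""
    if model.startswith("gpt-image"):
        return "gpt-image"
    if model in MODEL_OPTIONS:
        return model
    raise ValueError(f"unknown model: {model}")


def resolve(model, option, value=None):
    """Return the default for `option` when value is None, else validate it."""
    allowed = MODEL_OPTIONS[model_family(model)][option]
    if value is None:
        return allowed[0]
    if value not in allowed:
        raise ValueError(
            f"{option}={value!r} is not valid for {model}; choose from {allowed}"
        )
    return value
```

Failing fast on an unsupported combination (for example `--model dall-e-2 --size 1792x1024`) avoids sending a request the API would reject.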

### Other Notable Differences

- **dall-e-3** only supports generating 1 image at a time (`n=1`). The script automatically limits count to 1 when using this model.
- **GPT image models** support additional parameters:
  - `--background`: `transparent`, `opaque`, or `auto` (default)
  - `--output-format`: `png` (default), `jpeg`, or `webp`
  - Note: the API also accepts `stream` and `moderation`, but this script does not implement them yet
- **dall-e-3** has a `--style` parameter: `vivid` (hyper-real, dramatic) or `natural` (more natural looking)
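
One way to apply these differences when assembling a request body is sketched below. The function name `build_params` is hypothetical, and silently dropping options a model does not accept is a choice made for this sketch, not confirmed behavior of `gen.py`. The keys `n`, `background`, `output_format`, and `style` are the Images API parameter names.

```python
def build_params(model, prompt, count, background=None, output_format=None, style=None):
    """Assemble a request body, keeping only options the model accepts."""
    is_gpt = model.startswith("gpt-image")  # assumed naming rule
    params = {"model": model, "prompt": prompt}

    # dall-e-3 accepts only n=1, so clamp the requested count.
    params["n"] = 1 if model == "dall-e-3" else count

    # background and output_format apply to GPT image models only.
    if is_gpt:
        if background is not None:
            params["background"] = background
        if output_format is not None:
            params["output_format"] = output_format

    # style applies to dall-e-3 only.
    if model == "dall-e-3" and style is not None:
        params["style"] = style
    return params
```

Raising an error on an unsupported option would be an equally reasonable design; dropping it keeps one command line usable across models.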

## Output

- `*.png`, `*.jpeg`, or `*.webp` images (output format depends on model + `--output-format`)
- `prompts.json` (prompt → file mapping)
- `index.html` (thumbnail gallery)
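
As an illustration of the two metadata files, the sketch below writes a `prompts.json` and a bare-bones `index.html` for a list of (prompt, filename) pairs. The function name `write_gallery`, the JSON record layout, and the HTML markup are all assumptions; the files produced by `gen.py` may look different.

```python
import html
import json
from pathlib import Path


def write_gallery(out_dir, items):
    """Write prompts.json and index.html for (prompt, filename) pairs."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Assumed schema: a list of {"prompt": ..., "file": ...} records.
    records = [{"prompt": prompt, "file": name} for prompt, name in items]
    (out / "prompts.json").write_text(json.dumps(records, indent=2), encoding="utf-8")

    # One thumbnail per image; escape prompts so they cannot break the markup.
    figures = "\n".join(
        f'<figure><img src="{html.escape(name, quote=True)}" width="256" alt="">'
        f"<figcaption>{html.escape(prompt)}</figcaption></figure>"
        for prompt, name in items
    )
    page = (
        "<!doctype html>\n<meta charset='utf-8'>\n"
        "<title>openai-image-gen</title>\n" + figures + "\n"
    )
    index_path = out / "index.html"
    index_path.write_text(page, encoding="utf-8")
    return index_path
```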

Overview

This skill batch-generates images using the OpenAI Images API and produces a browsable index.html gallery. It includes a random-but-structured prompt sampler, model-appropriate defaults, and options to export images as PNG, JPEG, or WebP. It is designed for fast exploration and iteration across image models and parameters.

How this skill works

The script samples prompts, or takes one from the user via --prompt, then calls the selected OpenAI image model to render images in batches. Images are written to the output folder as PNG, JPEG, or WebP, and prompts.json maps each prompt to its file. An index.html thumbnail gallery is generated so you can preview results locally in a browser.
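
The write-to-disk step can be sketched as follows, assuming the parsed API response carries base64 image data under `data[i].b64_json` (the shape GPT image models return; DALL-E models return a URL unless base64 is requested). The function name `save_images` and the `<stem>-<n>.<ext>` file naming are hypothetical, not taken from `gen.py`.

```python
import base64
from pathlib import Path


def save_images(response, out_dir, stem, ext="png"):
    """Decode each base64 image in a parsed Images API response and write it to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, item in enumerate(response.get("data", []), start=1):
        b64 = item.get("b64_json")
        if b64 is None:
            # URL-only entries would need a separate download step (not shown).
            continue
        path = out / f"{stem}-{i}.{ext}"
        path.write_bytes(base64.b64decode(b64))
        paths.append(path)
    return paths
```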

When to use it

  • Rapidly prototype visual concepts from short prompts
  • Create a diverse set of images for moodboards or iteration
  • Compare output across OpenAI image models and parameter sets
  • Export many variations for testing in ML pipelines or design reviews
  • Generate assets for demos, presentations, or creative exploration

Best practices

  • Pick model-aware sizes and quality settings (defaults are auto-selected per model) to avoid wasted credits
  • Limit count for models that only support single-image generation (dall-e-3 is restricted to n=1)
  • Seed prompts with structure (style, camera, lighting, subject) to get consistent variations
  • Use smaller batches to validate parameters before large runs to save time and cost
  • Store prompts.json alongside outputs to keep reproducibility and attribution clear
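
A structured prompt in the sense of the third point above can be sketched as a fixed template with randomly chosen slots. The word lists and the function name `sample_prompts` are hypothetical; the sampler in `gen.py` may use different vocabularies and a different template.

```python
import random

# Hypothetical vocabularies; the lists in gen.py may differ.
SUBJECTS = ["a lobster astronaut", "a clockwork owl", "a glass lighthouse"]
STYLES = ["ultra-detailed studio photo", "watercolor illustration", "isometric 3D render"]
LIGHTING = ["soft window light", "neon rim light", "golden hour sun"]
CAMERAS = ["35mm lens", "macro lens", "wide-angle lens"]


def sample_prompts(count, seed=None):
    """Return `count` prompts that all fill the same style/subject/lighting/camera template."""
    rng = random.Random(seed)  # a fixed seed makes the batch reproducible
    return [
        f"{rng.choice(STYLES)} of {rng.choice(SUBJECTS)}, "
        f"{rng.choice(LIGHTING)}, {rng.choice(CAMERAS)}"
        for _ in range(count)
    ]
```

Holding the template fixed while varying one slot at a time makes it easier to tell which part of a prompt caused a visual change.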

Example use cases

  • Generate 16 stylistically consistent concept images for a character design session
  • Produce transparent-background assets using GPT image models for compositing
  • Create a 4-up comparison of gpt-image-1 vs gpt-image-1.5 at different sizes
  • Export a set of HD scenic images with dall-e-3 (single image per prompt) for a presentation slide
  • Build a quick thumbnail gallery of exploration results to share with stakeholders

FAQ

Which models can I use and how do defaults change?

The script supports GPT image models, dall-e-3, and dall-e-2. It auto-selects sensible defaults for size and quality based on the chosen model; you can override via flags.

How do I control output format and background transparency?

GPT image models support --output-format (png/jpeg/webp) and --background (transparent/opaque/auto). Both flags are documented for GPT image models only; no equivalent options are listed for dall-e-3 or dall-e-2.