home / skills / jst-well-dan / skill-box / baoyu-image-gen
This skill generates images using official OpenAI and Google APIs, supporting prompts, references, aspect ratios, and quality presets.
npx playbooks add skill jst-well-dan/skill-box --skill baoyu-image-genReview the files below or copy the command above to add this skill to your agents.
---
name: baoyu-image-gen
description: Generates images using official OpenAI and Google APIs via AI SDK. Supports text-to-image, reference images, aspect ratios, and quality presets. Use when user asks to "generate image with API", "use official API for images", "create image with OpenAI/Google", or needs API-based generation instead of browser-based.
---
# Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI and Google providers.
## Script Directory
**Agent Execution**:
1. `SKILL_DIR` = this SKILL.md file's directory
2. Script path = `${SKILL_DIR}/scripts/main.ts`
## Preferences (EXTEND.md)
Use Bash to check EXTEND.md existence (priority order):
```bash
# Check project-level first
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
```
┌──────────────────────────────────────────────────┬───────────────────┐
│ Path │ Location │
├──────────────────────────────────────────────────┼───────────────────┤
│ .baoyu-skills/baoyu-image-gen/EXTEND.md │ Project directory │
├──────────────────────────────────────────────────┼───────────────────┤
│ $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md │ User home │
└──────────────────────────────────────────────────┴───────────────────┘
┌───────────┬───────────────────────────────────────────────────────────────────────────┐
│ Result │ Action │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Found │ Read, parse, apply settings │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Not found │ Use defaults │
└───────────┴───────────────────────────────────────────────────────────────────────────┘
**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio
## Usage
```bash
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal only)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
```
## Options
| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required) |
| `--provider google\|openai` | Force provider (default: google) |
| `--model <id>`, `-m` | Model ID |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google (default: from quality) |
| `--ref <files...>` | Reference images (Google multimodal only) |
| `--n <count>` | Number of images |
| `--json` | JSON output |
## Environment Variables
| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `GOOGLE_API_KEY` | Google API key |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
**Load Priority**: CLI args > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
## Provider Selection
1. `--provider` specified → use it
2. Only one API key available → use that provider
3. Both available → default to Google
## Quality Presets
| Preset | Google imageSize | OpenAI Size | Use Case |
|--------|------------------|-------------|----------|
| `normal` | 1K | 1024px | Quick previews |
| `2k` (default) | 2K | 2048px | Covers, illustrations, infographics |
**Google imageSize**: Can be overridden with `--imageSize 1K|2K|4K`
## Aspect Ratios
Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`
- Google multimodal: uses `imageConfig.aspectRatio`
- Google Imagen: uses `aspectRatio` parameter
- OpenAI: maps to closest supported size
## Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry once
- Invalid aspect ratio → warning, proceed with default
- Reference images with non-multimodal model → warning, ignore refs
## Extension Support
Custom configurations via EXTEND.md. See **Preferences** section for paths and supported options.
This skill generates images using official OpenAI and Google APIs via an AI SDK. It supports text-to-image generation, reference images (Google multimodal), selectable aspect ratios, and quality presets for quick previews or high-resolution outputs. Use it when you need programmatic, API-driven image creation rather than browser-based tools.
The script invokes Google or OpenAI image endpoints based on CLI flags, available API keys, or default settings. It accepts prompts (inline or from files), optional reference images for Google multimodal, aspect ratio or explicit size, quality presets, and saves one or more output files. The tool validates inputs, retries once on failures, and falls back to defaults when configurations are missing.
How does the skill choose between OpenAI and Google?
It uses the --provider flag if provided. If not, it picks the single available API key, or defaults to Google when both keys are present.
What happens if an API key is missing or a request fails?
Missing keys trigger a clear error with setup guidance. Generation failures auto-retry once before reporting an error.