home / skills / artwist-polyakov / polyakov-claude-skills / fal-ai-image

fal-ai-image skill

safe

/plugins/fal-ai-image/skills/fal-ai-image

This skill generates or edits images with fal.ai nano-banana-pro, delivering native text rendering and complex compositions for infographics.

npx playbooks add skill artwist-polyakov/polyakov-claude-skills --skill fal-ai-image

Review the files below or copy the command above to add this skill to your agents.

Files (7)

SKILL.md

5.2 KB

---
name: fal-ai-image
description: "Generate/edit images via fal.ai nano-banana-pro. Supports reference images (Edit mode) and text rendering in any language including Cyrillic. ALWAYS read SKILL.md before first use."
---

# fal-ai-image

Generate images via fal.ai nano-banana-pro (Gemini 3 Pro Image).
Best for: infographics, text rendering, complex compositions.

## STOP — Read Before Acting

- **DO NOT** use Pillow, ImageMagick, or any post-processing for text/logo overlay — this model renders text natively, including Cyrillic and CJK
- **DO NOT** use Generate mode when user provides reference images — use **Edit mode**
- **DO NOT** launch general-purpose subagents — use `Task` with `subagent_type: "Bash"` and `model: "haiku"`
- **DO NOT** skip uploading local files — run `upload.sh` first to get URLs for `edit.sh`
- **DO NOT** guess script parameters — check the tables below

## Quick Start Decision

```
User gave reference images?  → Edit mode  (upload.sh → edit.sh)
User wants text-only gen?    → Generate mode (generate.sh)
Multiple images needed?      → Parallel Bash/haiku subagents
```

## Model Capabilities

- Excellent text rendering (Latin, Cyrillic, CJK) — no post-processing needed
- Composes logos, product photos, and text into banners in a single pass
- Understands layout instructions ("left side text, right side product photo")
- Handles complex infographics, charts, and diagrams
- Edit mode blends reference images naturally with prompt guidance

## Compatibility

Scripts are POSIX sh compatible — work in cloud sandboxes (`/bin/sh`) and locally (`bash`).
No bashisms: `[[ ]]`, `${BASH_SOURCE}`, `source` etc. are NOT used.

## Config

Requires `FAL_KEY` in `config/.env` or environment.
Get key: https://fal.ai/dashboard/keys

## Two Modes

### 1. Generate (text-to-image)
Create images from text prompt only.
Script: `scripts/generate.sh`

### 2. Edit (image-to-image)
Create images using reference images (up to 14).
Script: `scripts/edit.sh`

## Workflow

**IMPORTANT**: Run generation via Task tool with Haiku subagent to avoid blocking main context.

### For Generate mode:

1. **Clarify params** (if not specified):
   - Aspect ratio: `1:1`, `16:9`, `9:16`, `4:3`, etc.
   - Resolution: `1K` (default), `2K`, `4K`

2. **Propose save path** based on project structure:
   - Check for `./images/`, `./assets/`, `./static/`
   - Suggest: "Save to `./images/infographic_coffee.png`?"

3. **Show price & confirm**:
   ```
   Cost: $0.15/image (4K: $0.30)
   Confirm? (yes/no)
   ```

4. **Launch subagent**:
   ```
   Task tool:
   - subagent_type: "Bash"
   - model: "haiku"
   - run_in_background: true
   - prompt: Run generate.sh with params
   ```

5. **Report result**: Parse JSON output, show image URL, read saved file.

### For Edit mode:

1. **Get reference images**:
   - If URLs provided → use directly
   - If local files → run `upload.sh` first to get URLs

2. **Clarify prompt**: What to do with references?

3. **Show price & confirm**: Same as generate

4. **Launch subagent** with `edit.sh`

5. **Report result**

## Scripts

### generate.sh
```bash
sh scripts/generate.sh \
  --prompt "infographic about coffee brewing" \
  --aspect-ratio "9:16" \
  --resolution "1K" \
  --output-dir "./images" \
  --filename "coffee_infographic"
```

| Param | Required | Default | Values |
|-------|----------|---------|--------|
| `--prompt` | yes | - | text |
| `--aspect-ratio` | no | 1:1 | 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 |
| `--resolution` | no | 1K | 1K, 2K, 4K |
| `--num-images` | no | 1 | 1-4 |
| `--output-dir` | no | - | path |
| `--filename` | no | generated | base name |
| `--web-search` | no | false | flag |

### edit.sh
```bash
sh scripts/edit.sh \
  --prompt "combine these into a collage" \
  --image-urls "https://example.com/img1.png,https://example.com/img2.png" \
  --aspect-ratio "16:9" \
  --output-dir "./images" \
  --filename "collage"
```

| Param | Required | Default | Values |
|-------|----------|---------|--------|
| `--prompt` | yes | - | text |
| `--image-urls` | yes | - | comma-separated URLs (max 14) |
| `--aspect-ratio` | no | auto | auto, 1:1, 16:9, etc. |
| `--resolution` | no | 1K | 1K, 2K, 4K |
| `--num-images` | no | 1 | 1-4 variations |
| `--output-dir` | no | - | path |
| `--filename` | no | edited | base name |

### upload.sh (for local files)
```bash
# Get URL for local file
URL=$(sh scripts/upload.sh --file /path/to/image.png)

# Or get base64 data URI (for small files)
URI=$(sh scripts/upload.sh --file /path/to/image.png --base64)
```

## Parallel Generation

For multiple images — launch several subagents in parallel:

```
Task 1: generate "cat in space" → cat_space.png
Task 2: generate "dog on moon" → dog_moon.png
Task 3: generate "bird in ocean" → bird_ocean.png
```

Each runs independently via Haiku, results collected when done.

## Pricing

- Generate: **$0.15**/image
- Edit: **$0.15**/edit
- 4K resolution: **$0.30** (2x)
- Web search: **+$0.015**

Formula: `price = num_images * (resolution == "4K" ? 0.30 : 0.15) + (web_search ? 0.015 : 0)`

## Notes

- URLs expire in ~1 hour — save locally if needed
- Uploaded files stored 7 days on fal.ai, then auto-deleted
- Model excels at text rendering and infographics

Overview

This skill generates and edits images using fal.ai nano-banana-pro (Gemini 3 Pro Image). It excels at infographics, precise text rendering (Latin, Cyrillic, CJK), and complex compositions that combine photos, logos, and layout instructions. The workflow uses POSIX shell scripts for Generate and Edit modes and launches tasks via a Bash subagent.

How this skill works

Use Generate mode for text-only image creation and Edit mode when the user provides reference images. Local files must be uploaded (upload.sh) to obtain URLs used by edit.sh. Tasks are executed through the Task tool with a Bash subagent (model: "haiku") to run the scripts and return JSON results containing image URLs and save paths.

When to use it

Create text-heavy banners, posters, or multilingual infographics (no overlay needed).
Edit or collage multiple reference images into a coherent composition (use Edit mode).
Produce product shots or marketing assets that combine photos and typographic elements.
Generate multiple images in parallel for A/B testing or batch production.
Save outputs locally when you need permanent assets—fal.ai URLs expire quickly.

Best practices

Always choose Edit mode when any reference images are provided; do not pass references to Generate.
Upload local files with scripts/upload.sh before calling edit.sh to get stable URLs.
Clarify aspect ratio and resolution up front (common options: 1:1, 16:9, 9:16; resolutions: 1K, 2K, 4K).
Run scripts via Task with subagent_type: "Bash" and model: "haiku" to avoid blocking the main context.
Do not perform external post-processing for text or logos—the model renders text natively, including Cyrillic and CJK.

Example use cases

Generate a 9:16 multilingual infographic about coffee brewing, saved to ./images/coffee_infographic.png.
Edit a set of product photos into a 16:9 marketing banner by uploading local images and running edit.sh with a collage prompt.
Batch-generate four variant social media thumbnails in parallel using multiple Bash/haiku subagents.
Produce a 4K poster with embedded Cyrillic copy and validate that no external text overlay is required.

FAQ

Can the model render non-Latin scripts like Cyrillic or Chinese?

Yes. The model renders Latin, Cyrillic, and CJK text natively—no external text overlay tools are needed.

When should I use upload.sh?

Run upload.sh whenever you have local image files. It returns URLs or base64 URIs required by edit.sh; passing local paths directly will not work.

How do I run tasks without blocking?

Launch the scripts via the Task tool configured with subagent_type: "Bash", model: "haiku", and run_in_background:true. That runs generation/edit processes asynchronously and returns JSON results.

How is pricing calculated?

Base cost is $0.15 per image/edit. 4K outputs cost $0.30 per image. Enabling web search adds $0.015. Total = num_images * (4K? $0.30 : $0.15) + (web_search? $0.015 : $0).