
This skill enables rapid Python scripting and Gemini image generation with uv, guiding one-off image tasks, iterative workflows, and self-contained scripts.

npx playbooks add skill nikiforovall/claude-code-rules --skill nano-banana


---
name: nano-banana
description: This skill should be used for Python scripting and Gemini image generation. Use when users ask to generate images, create AI art, edit images with AI, or run Python scripts with uv. Trigger phrases include "generate an image", "create a picture", "draw", "make an image of", "nano banana", or any image generation request.
---

# Nano Banana Skill

Python scripting with Gemini image generation using uv. Write small, focused scripts using heredocs for quick tasks—no files needed for one-off operations.

## Choosing Your Approach

**Quick image generation**: Use heredoc with inline Python for one-off image requests.

**Complex workflows**: When multiple steps are needed (generate -> refine -> save), break into separate scripts and iterate.

**Scripting tasks**: For non-image Python tasks, use the same heredoc pattern with `uv run`.

## Writing Scripts

Execute Python inline using heredocs with inline script metadata for dependencies:

```bash
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
from pathlib import Path

from google import genai
from google.genai import types

# The client reads the API key from the environment (GOOGLE_API_KEY)
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=["A cute banana character with sunglasses"],
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE']
    )
)

# Create tmp/ first; saving fails if the directory is missing
Path("tmp").mkdir(exist_ok=True)

for part in response.parts:
    if part.inline_data is not None:
        image = part.as_image()
        image.save("tmp/generated.png")
        print("Image saved to tmp/generated.png")
EOF
```

The `# /// script` block declares dependencies inline using TOML syntax. This makes scripts self-contained and reproducible.

**Why these dependencies:**
- `google-genai` - Gemini API client
- `pillow` - Required for `.as_image()` method (converts base64 to PIL Image) and saving images

**Only write to files when:**
- The script needs to be reused multiple times
- The script is complex and requires iteration
- The user explicitly asks for a saved script

### Basic Template

```bash
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
from pathlib import Path

from google import genai
from google.genai import types

# The client reads the API key from the environment (GOOGLE_API_KEY)
client = genai.Client()

# Generate image
response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=["YOUR PROMPT HERE"],
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE']
    )
)

# Create tmp/ first; saving fails if the directory is missing
Path("tmp").mkdir(exist_ok=True)

# Save result: print any text parts, save any image parts
for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = part.as_image()
        image.save("tmp/output.png")
        print("Saved: tmp/output.png")
EOF
```

## Key Principles

1. **Small scripts**: Each script should do ONE thing (generate, refine, save)
2. **Evaluate output**: Always save images and print status to decide next steps
3. **Use tmp/**: Save generated images to tmp/ directory by default
4. **Stateless execution**: Each script runs independently, so no cleanup is needed
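
The "Use tmp/" principle can be backed by a small path helper. This is a minimal sketch using only the standard library; `output_path` and its parameters are hypothetical names chosen for illustration, and the timestamp suffix is an assumption added so that repeated runs do not overwrite earlier results.

```python
from datetime import datetime
from pathlib import Path


def output_path(stem: str, out_dir: str = "tmp", suffix: str = ".png") -> Path:
    """Return a timestamped path inside out_dir, creating the directory if needed."""
    directory = Path(out_dir)
    # Saving an image fails if the target directory is missing, so create it up front.
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return directory / f"{stem}-{stamp}{suffix}"


if __name__ == "__main__":
    # Prints something like tmp/banana-<timestamp>.png
    print(output_path("banana"))
```

In a generation script, the returned path could be passed to `image.save(...)` in place of a hard-coded filename.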

## Workflow Loop

Follow this pattern for complex tasks:

1. **Write a script** to generate/process one image
2. **Run it** and observe the output
3. **Evaluate** - did it work? Check the saved image
4. **Decide** - refine prompt or task complete?
5. **Repeat** until satisfied

## Image Configuration

Configure aspect ratio and resolution:

```python
config=types.GenerateContentConfig(
    response_modalities=['IMAGE'],
    image_config=types.ImageConfig(
        aspect_ratio="16:9",  # "1:1", "16:9", "9:16", "4:3", "3:4"
        image_size="2K"       # "1K", "2K", "4K" (uppercase required)
    )
)
```
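
Because `image_size` must be uppercase, it can help to validate the values before building the config. The sketch below is only illustrative: `image_options` is a hypothetical helper, and the allowed sets are limited to the values listed in the comments above; the API may accept others.

```python
# Values taken from the comments in the config example above (assumed, not exhaustive).
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}
IMAGE_SIZES = {"1K", "2K", "4K"}


def image_options(aspect_ratio: str = "1:1", image_size: str = "1K") -> dict:
    """Validate the two settings and return them as keyword arguments."""
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect_ratio: {aspect_ratio!r}")
    size = image_size.upper()  # normalize "2k" to "2K"
    if size not in IMAGE_SIZES:
        raise ValueError(f"Unsupported image_size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": size}


if __name__ == "__main__":
    print(image_options("16:9", "2k"))
```

The resulting dictionary could then be unpacked into the config, for example `types.ImageConfig(**image_options("16:9", "2k"))`.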

## Models

- `gemini-2.5-flash-image` - Fast, general purpose image generation
- `gemini-3-pro-image-preview` - Advanced, professional asset production (Nano Banana Pro)

**Default to `gemini-3-pro-image-preview` (Nano Banana Pro)** for all image generation unless:
- The user explicitly requests a different model
- The user wants to reduce cost
- The user asks for a simple or quick generation task

Nano Banana Pro provides higher quality results and should be the recommended choice.
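
The selection rule above can be written as a small function. This is a sketch under the stated rule only; `choose_model` and its parameter names are hypothetical and not part of the SDK.

```python
from typing import Optional

FAST_MODEL = "gemini-2.5-flash-image"
PRO_MODEL = "gemini-3-pro-image-preview"  # Nano Banana Pro


def choose_model(
    requested: Optional[str] = None,
    save_costs: bool = False,
    quick_task: bool = False,
) -> str:
    """Pick a model name following the defaults described above."""
    if requested:
        # An explicit user request always wins.
        return requested
    if save_costs or quick_task:
        return FAST_MODEL
    return PRO_MODEL


if __name__ == "__main__":
    print(choose_model())
    print(choose_model(quick_task=True))
```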

## Text + Image Output

To receive both text explanation and image:

```python
config=types.GenerateContentConfig(
    response_modalities=['TEXT', 'IMAGE']
)
```

## Image Editing

Edit existing images by including them in the request:

```bash
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
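from pathlib import Path

from google import genai
from google.genai import types
from PIL import Image

# The client reads the API key from the environment (GOOGLE_API_KEY)
client = genai.Client()

# Load existing image
img = Image.open("input.png")

# Pass the instruction and the image together in one request
response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=[
        "Add a party hat to this character",
        img
    ],
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE']
    )
)

# Create tmp/ first; saving fails if the directory is missing
Path("tmp").mkdir(exist_ok=True)

for part in response.parts:
    if part.inline_data is not None:
        part.as_image().save("tmp/edited.png")
        print("Saved: tmp/edited.png")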
EOF
```

## Debugging Tips

1. **Print response.parts** to see what was returned
2. **Check for text parts** - model may include explanations
3. **Save images immediately** to verify output visually
4. **Use Read tool** to view saved images after generation
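
The first two tips can be combined into a helper that summarizes each part instead of dumping raw bytes. This is a sketch: `describe_parts` is a hypothetical name, and it relies only on the `text` and `inline_data` attributes used elsewhere in this skill, plus the `mime_type` and `data` fields of the inline data. The demo uses stand-in objects so it runs without an API call.

```python
from types import SimpleNamespace


def describe_parts(parts):
    """Return one summary line per response part (text length, or image type and size)."""
    lines = []
    for index, part in enumerate(parts):
        text = getattr(part, "text", None)
        blob = getattr(part, "inline_data", None)
        if text is not None:
            lines.append(f"part {index}: text, {len(text)} chars")
        elif blob is not None:
            lines.append(f"part {index}: {blob.mime_type}, {len(blob.data)} bytes")
        else:
            lines.append(f"part {index}: other")
    return lines


if __name__ == "__main__":
    # Stand-ins for real response parts (assumption: same attribute names).
    fake_parts = [
        SimpleNamespace(text="Here is your image", inline_data=None),
        SimpleNamespace(
            text=None,
            inline_data=SimpleNamespace(mime_type="image/png", data=b"\x89PNG"),
        ),
    ]
    print("\n".join(describe_parts(fake_parts)))
```

In a real script, the call would be `describe_parts(response.parts)`.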

## Error Recovery

If a script fails:
1. Check error message for API issues
2. Verify GOOGLE_API_KEY is set
3. Try simpler prompt to isolate the issue
4. Check image format compatibility for edits
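
Steps 2 and 4 can be checked before any API call is made. The following is a minimal sketch: `preflight` is a hypothetical helper, and the suffix list is an assumption about commonly accepted input formats, so consult the API documentation for the current list.

```python
import os
from pathlib import Path

# Assumption: a conservative set of input formats; not taken from the skill itself.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def preflight(env=None, input_image=None):
    """Return a list of problems found before calling the API (empty means OK)."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set")
    if input_image is not None:
        path = Path(input_image)
        if not path.is_file():
            problems.append(f"Input image not found: {path}")
        elif path.suffix.lower() not in SUPPORTED_SUFFIXES:
            problems.append(f"Unexpected image format: {path.suffix}")
    return problems


if __name__ == "__main__":
    for problem in preflight(input_image="input.png"):
        print("Problem:", problem)
```

Returning a list instead of raising lets a script report every problem in one run.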

## Advanced Scenarios

For complex workflows including thinking process, Google Search grounding, multi-turn conversations, and professional asset production, load `references/guide.md`.

Overview

This skill provides a compact Python scripting workflow for generating and editing images with Gemini models using uv. It emphasizes single-purpose heredoc scripts that run inline, declare their dependencies, and save results to tmp/ for quick iteration. The default recommendation is gemini-3-pro-image-preview (Nano Banana Pro), unless the user asks for a different model, lower cost, or a quick generation.

How this skill works

Write small Python snippets inside a heredoc and execute them with uv run, declaring dependencies in an inline metadata block. The scripts call the Google GenAI client to request image generation, configure image parameters (aspect ratio, size), and convert inline_data to PIL Images for saving. Use the same pattern to edit existing images by including the image object in the request.

When to use it

  • Generate a new image, illustration, or AI art from a text prompt
  • Edit or augment an existing image (add objects, change style, retouch)
  • Run one-off Python image tasks without creating permanent files
  • Prototype image generation workflows before making a reusable script
  • Request both image and explanatory text in a single call

Best practices

  • Keep each script focused on one task: generate, refine, or save
  • Declare dependencies (google-genai, pillow) in the heredoc metadata
  • Save outputs to tmp/ and print status for easy visual verification
  • Default to gemini-3-pro-image-preview for quality, switch for cost or speed
  • Iterate: run, inspect saved image, refine prompt, repeat

Example use cases

  • Quickly generate a character illustration using an inline prompt and save to tmp/generated.png
  • Add a hat or other element to an existing photo by loading the image and including it in the request
  • Produce both an image and a short caption by requesting TEXT and IMAGE modalities
  • Prototype a multi-step pipeline: generate draft, refine style, then export final asset
  • Run a small image-processing utility (resize, format convert) via a one-off uv heredoc script

FAQ

Which model should I use by default?

Use gemini-3-pro-image-preview (Nano Banana Pro) for the best quality. Switch to gemini-2.5-flash-image for faster, lower-cost runs.

Do I need to write files to work with images?

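No script files are needed. For quick tasks, run the Python inline through a heredoc and save only the generated image to tmp/ for inspection. Write the script to a file only when it needs to be reused, requires iteration, or the user asks for a saved script.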