
nano-banana-pro skill


This skill helps you generate professional images with the Gemini 3 Pro Image API, enabling high-quality visuals and quick iterative design.

npx playbooks add skill hoodini/ai-agents-skills --skill nano-banana-pro

---
name: nano-banana-pro
description: Generate images with Google's Nano Banana Pro (Gemini 3 Pro Image). Use when generating AI images via Gemini API, creating professional visuals, or building image generation features. Triggers on Nano Banana Pro, Gemini 3 Pro Image, gemini-3-pro-image-preview, Google image generation.
---

# Nano Banana Pro (Gemini 3 Pro Image)

Generate high-quality images with Google's Gemini 3 Pro Image API.

## Overview

**Nano Banana Pro** is the marketing name for **Gemini 3 Pro Image** (`gemini-3-pro-image-preview`), Google's state-of-the-art image generation and editing model built on Gemini 3 Pro.

## Quick Start

### Get API Key
1. Go to [Google AI Studio](https://aistudio.google.com)
2. Click "Get API Key"
3. Store it securely as an environment variable (e.g. `GEMINI_API_KEY`)
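A minimal sketch of step 3 in Python: read the key from the environment instead of hard-coding it, and warn when it is missing.

```python
import os

# Prefer reading the key from the environment over hard-coding it in source.
api_key = os.environ.get("GEMINI_API_KEY", "")
if not api_key:
    print("GEMINI_API_KEY is not set; export it before calling the API")
```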

### Basic Image Generation (Python)
```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up the GEMINI_API_KEY environment variable

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="A serene Japanese garden with cherry blossoms and a koi pond",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

# Process response parts (text and image)
for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(f"Description: {part.text}")
    elif part.inline_data is not None:
        # In the Python SDK, inline_data.data is raw bytes (already decoded),
        # and inline_data.mime_type identifies the format (e.g. image/png)
        with open("output.png", "wb") as f:
            f.write(part.inline_data.data)
```

### REST API (cURL)
```bash
curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "role": "user",
      "parts": [{"text": "Create a vibrant infographic about photosynthesis"}]
    }],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
```

### TypeScript/JavaScript
```typescript
const GEMINI_API_KEY = process.env.GEMINI_API_KEY;

async function generateImage(prompt: string) {
  const response = await fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
    {
      method: 'POST',
      headers: {
        'x-goog-api-key': GEMINI_API_KEY!,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        contents: [{ 
          role: 'user', 
          parts: [{ text: prompt }] 
        }],
        generationConfig: {
          responseModalities: ['TEXT', 'IMAGE'],
        },
      }),
    }
  );

  const data = await response.json();
  return data;
}
```

## Configuration Options

### Image Configuration
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Professional product photo of a coffee mug",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # e.g. 1:1, 3:2, 16:9, 9:16, 21:9
            image_size="2K"       # Options: 1K, 2K, 4K
        )
    )
)
```

### With Google Search Grounding
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Create an infographic showing today's stock market trends",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]  # Enable search grounding
    )
)
```

## Multi-Turn Conversations (Iterative Editing)

```python
# Create a chat session
chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

# Initial generation
response1 = chat.send_message(
    "Create a vibrant infographic explaining photosynthesis"
)

# Edit the image
response2 = chat.send_message(
    "Update this infographic to be in Spanish. Keep all other elements the same."
)
```

## Key Capabilities

### 1. Superior Text Rendering
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="""Create a professional poster with:
    - Title: "Annual Tech Summit 2025"
    - Date: March 15-17, 2025
    - Location: San Francisco Convention Center
    """,
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
```

### 2. Character Consistency (Up to 5 Subjects)
```python
from google.genai import types

# Load the reference image as raw bytes
with open("character.png", "rb") as f:
    character_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        types.Part.from_bytes(data=character_bytes, mime_type="image/png"),
        "Generate an image of this person at a tech conference",
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
```

## Next.js API Route

```typescript
// app/api/generate-image/route.ts
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  const { prompt, aspectRatio = '1:1', imageSize = '2K' } = await request.json();

  try {
    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': process.env.GEMINI_API_KEY!,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          contents: [{ role: 'user', parts: [{ text: prompt }] }],
          generationConfig: {
            responseModalities: ['TEXT', 'IMAGE'],
            imageConfig: { aspectRatio, imageSize },
          },
        }),
      }
    );

    const data = await response.json();
    const parts = data.candidates?.[0]?.content?.parts || [];
    // REST responses use camelCase field names (inlineData, mimeType)
    const imagePart = parts.find((p: any) => p.inlineData);

    return NextResponse.json({
      image: imagePart ? {
        data: imagePart.inlineData.data,
        mimeType: imagePart.inlineData.mimeType,
        url: `data:${imagePart.inlineData.mimeType};base64,${imagePart.inlineData.data}`,
      } : null,
    });
  } catch (error) {
    return NextResponse.json({ error: 'Generation failed' }, { status: 500 });
  }
}
```

## Model Comparison

| Feature | Nano Banana (2.5 Flash) | Nano Banana Pro (3 Pro Image) |
|---------|-------------------------|-------------------------------|
| Model ID | gemini-2.5-flash-image | gemini-3-pro-image-preview |
| Quality | Good | Best |
| Speed | Faster | Slower |
| Cost | Lower | Higher |
| Best For | Previews, high-volume | Production, professional |
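A hypothetical helper reflecting the table above: route drafts to the cheaper, faster Flash model and final renders to the Pro model. The function name and flag are illustrative, not part of any SDK.

```python
def pick_model(production: bool) -> str:
    # Hypothetical selector: Flash for previews/high volume, Pro for final quality.
    return "gemini-3-pro-image-preview" if production else "gemini-2.5-flash-image"

draft_model = pick_model(production=False)
final_model = pick_model(production=True)
```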

## Resources

- **Documentation**: https://ai.google.dev/gemini-api/docs/image-generation
- **Google AI Studio**: https://aistudio.google.com
- **Prompt Guide**: https://ai.google.dev/gemini-api/docs/prompting-intro

Overview

This skill generates high-quality images using Google's Nano Banana Pro (Gemini 3 Pro Image) model. It provides ready-to-use patterns and code examples for image creation, iterative editing, and production-grade visuals via the Gemini API. Use it to add professional image generation to apps, marketing workflows, or automated design pipelines.

How this skill works

The skill calls the Gemini 3 Pro Image endpoint to produce images and optional text outputs. It supports direct prompts, inline image references for character consistency, configurable aspect ratios and resolutions, and multi-turn chat sessions for iterative edits. Responses include base64 image data and metadata that you can save, preview, or return from server routes.

When to use it

  • Generating production-quality marketing visuals, posters, and infographics.
  • Creating consistent character or product shots using reference images.
  • Building image generation features in web or mobile apps (APIs or server routes).
  • Iteratively editing images via multi-turn conversations (e.g., refine language or composition).
  • Embedding text-rendered graphics like event posters or UI mockups.

Best practices

  • Store the Gemini API key securely as an environment variable and never hard-code it.
  • Request both TEXT and IMAGE modalities to receive captions and metadata alongside images.
  • Use image_config (aspect ratio, image_size) to match downstream display or print requirements.
  • Provide inline reference images to preserve character consistency across multiple outputs.
  • Use chat sessions for iterative edits instead of regenerating from scratch to retain layout context.

Example use cases

  • Next.js API route that returns a base64 data URL for client previews.
  • Automated generation of product photos with consistent lighting and framing for an ecommerce catalog.
  • Creating multilingual infographics by generating a base image and then editing text layers in follow-up messages.
  • Design prototyping: produce multiple aspect ratios (1:1, 16:9, 9:16) from one prompt for cross-platform assets.
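The cross-platform use case above can be sketched as building one request body per target aspect ratio, assuming the REST generateContent schema shown earlier; `build_request` is an illustrative helper, not an SDK function.

```python
ASPECT_RATIOS = ["1:1", "16:9", "9:16"]

def build_request(prompt: str, aspect_ratio: str, image_size: str = "2K") -> dict:
    # One REST request body per target aspect ratio
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            "imageConfig": {"aspectRatio": aspect_ratio, "imageSize": image_size},
        },
    }

requests = [build_request("Product launch banner", r) for r in ASPECT_RATIOS]
```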

FAQ

What response formats does the model return?

Responses can include TEXT and IMAGE parts. Over REST, images arrive as base64-encoded inlineData with a mimeType (e.g. image/png); the Python SDK exposes the same content as inline_data with the bytes already decoded.

How do I maintain character consistency across images?

Upload a reference image as inline_data and include it in the same generation call; the model can keep character consistency for up to five subjects.