
nano-banana-pro skill


This skill helps you generate professional images with the Gemini 3 Pro Image API, enabling high-quality visuals and quick iterative design.

npx playbooks add skill hoodini/ai-agents-skills --skill nano-banana-pro

---
name: nano-banana-pro
description: Generate images with Google's Nano Banana Pro (Gemini 3 Pro Image). Use when generating AI images via Gemini API, creating professional visuals, or building image generation features. Triggers on Nano Banana Pro, Gemini 3 Pro Image, gemini-3-pro-image-preview, Google image generation.
---

# Nano Banana Pro (Gemini 3 Pro Image)

Generate high-quality images with Google's Gemini 3 Pro Image API.

## Overview

**Nano Banana Pro** is the marketing name for **Gemini 3 Pro Image** (`gemini-3-pro-image-preview`), Google's state-of-the-art image generation and editing model built on Gemini 3 Pro.

## Quick Start

### Get API Key
1. Go to [Google AI Studio](https://aistudio.google.com)
2. Click "Get API Key"
3. Store it securely as an environment variable (e.g. `GEMINI_API_KEY`)
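A minimal sketch of step 3 in Python: read the key from the environment instead of hard-coding it, and warn when it is missing.

```python
import os

# Prefer reading the key from the environment over hard-coding it in source.
api_key = os.environ.get("GEMINI_API_KEY", "")
if not api_key:
    print("GEMINI_API_KEY is not set; export it before calling the API")
```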

### Basic Image Generation (Python)
```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up the GEMINI_API_KEY environment variable

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="A serene Japanese garden with cherry blossoms and a koi pond",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

# Process response parts (text and image)
for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(f"Description: {part.text}")
    elif part.inline_data is not None:
        # In the Python SDK, inline_data.data is raw bytes (already decoded),
        # and inline_data.mime_type identifies the format (e.g. image/png)
        with open("output.png", "wb") as f:
            f.write(part.inline_data.data)
```

### REST API (cURL)
```bash
curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "role": "user",
      "parts": [{"text": "Create a vibrant infographic about photosynthesis"}]
    }],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
```

### TypeScript/JavaScript
```typescript
const GEMINI_API_KEY = process.env.GEMINI_API_KEY;

async function generateImage(prompt: string) {
  const response = await fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
    {
      method: 'POST',
      headers: {
        'x-goog-api-key': GEMINI_API_KEY!,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        contents: [{ 
          role: 'user', 
          parts: [{ text: prompt }] 
        }],
        generationConfig: {
          responseModalities: ['TEXT', 'IMAGE'],
        },
      }),
    }
  );

  const data = await response.json();
  return data;
}
```

## Configuration Options

### Image Configuration
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Professional product photo of a coffee mug",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # e.g. 1:1, 3:2, 16:9, 9:16, 21:9
            image_size="2K"       # Options: 1K, 2K, 4K
        )
    )
)
```

### With Google Search Grounding
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Create an infographic showing today's stock market trends",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]  # Enable search grounding
    )
)
```

## Multi-Turn Conversations (Iterative Editing)

```python
# Create a chat session
chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

# Initial generation
response1 = chat.send_message(
    "Create a vibrant infographic explaining photosynthesis"
)

# Edit the image
response2 = chat.send_message(
    "Update this infographic to be in Spanish. Keep all other elements the same."
)
```

## Key Capabilities

### 1. Superior Text Rendering
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="""Create a professional poster with:
    - Title: "Annual Tech Summit 2025"
    - Date: March 15-17, 2025
    - Location: San Francisco Convention Center
    """,
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
```

### 2. Character Consistency (Up to 5 Subjects)
```python
from google.genai import types

# Load the reference image as raw bytes
with open("character.png", "rb") as f:
    character_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        types.Part.from_bytes(data=character_bytes, mime_type="image/png"),
        "Generate an image of this person at a tech conference",
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
```

## Next.js API Route

```typescript
// app/api/generate-image/route.ts
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  const { prompt, aspectRatio = '1:1', imageSize = '2K' } = await request.json();

  try {
    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': process.env.GEMINI_API_KEY!,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          contents: [{ role: 'user', parts: [{ text: prompt }] }],
          generationConfig: {
            responseModalities: ['TEXT', 'IMAGE'],
            imageConfig: { aspectRatio, imageSize },
          },
        }),
      }
    );

    const data = await response.json();
    const parts = data.candidates?.[0]?.content?.parts || [];
    // REST responses use camelCase field names (inlineData, mimeType)
    const imagePart = parts.find((p: any) => p.inlineData);

    return NextResponse.json({
      image: imagePart ? {
        data: imagePart.inlineData.data,
        mimeType: imagePart.inlineData.mimeType,
        url: `data:${imagePart.inlineData.mimeType};base64,${imagePart.inlineData.data}`,
      } : null,
    });
  } catch (error) {
    return NextResponse.json({ error: 'Generation failed' }, { status: 500 });
  }
}
```

## Model Comparison

| Feature | Nano Banana (2.5 Flash) | Nano Banana Pro (3 Pro Image) |
|---------|-------------------------|-------------------------------|
| Model ID | gemini-2.5-flash-image | gemini-3-pro-image-preview |
| Quality | Good | Best |
| Speed | Faster | Slower |
| Cost | Lower | Higher |
| Best For | Previews, high-volume | Production, professional |
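A hypothetical helper reflecting the table above: route drafts to the cheaper, faster Flash model and final renders to the Pro model. The function name and flag are illustrative, not part of any SDK.

```python
def pick_model(production: bool) -> str:
    # Hypothetical selector: Flash for previews/high volume, Pro for final quality.
    return "gemini-3-pro-image-preview" if production else "gemini-2.5-flash-image"

draft_model = pick_model(production=False)
final_model = pick_model(production=True)
```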

## Resources

- **Documentation**: https://ai.google.dev/gemini-api/docs/image-generation
- **Google AI Studio**: https://aistudio.google.com
- **Prompt Guide**: https://ai.google.dev/gemini-api/docs/prompting-intro

Overview

This skill generates high-quality images using Google's Nano Banana Pro (Gemini 3 Pro Image) model. It provides ready-to-use patterns and code examples for image creation, iterative editing, and production-grade visuals via the Gemini API. Use it to add professional image generation to apps, marketing workflows, or automated design pipelines.

How this skill works

The skill calls the Gemini 3 Pro Image endpoint to produce images and optional text outputs. It supports direct prompts, inline image references for character consistency, configurable aspect ratios and resolutions, and multi-turn chat sessions for iterative edits. Responses include base64 image data and metadata that you can save, preview, or return from server routes.

When to use it

  • Generating production-quality marketing visuals, posters, and infographics.
  • Creating consistent character or product shots using reference images.
  • Building image generation features in web or mobile apps (APIs or server routes).
  • Iteratively editing images via multi-turn conversations (e.g., refine language or composition).
  • Embedding text-rendered graphics like event posters or UI mockups.

Best practices

  • Store the Gemini API key securely as an environment variable and never hard-code it.
  • Request both TEXT and IMAGE modalities to receive captions and metadata alongside images.
  • Use image_config (aspect ratio, image_size) to match downstream display or print requirements.
  • Provide inline reference images to preserve character consistency across multiple outputs.
  • Use chat sessions for iterative edits instead of regenerating from scratch to retain layout context.

Example use cases

  • Next.js API route that returns a base64 data URL for client previews.
  • Automated generation of product photos with consistent lighting and framing for an ecommerce catalog.
  • Creating multilingual infographics by generating a base image and then editing text layers in follow-up messages.
  • Design prototyping: produce multiple aspect ratios (1:1, 16:9, 9:16) from one prompt for cross-platform assets.
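The cross-platform use case above can be sketched as building one request body per target aspect ratio, assuming the REST generateContent schema shown earlier; `build_request` is an illustrative helper, not an SDK function.

```python
ASPECT_RATIOS = ["1:1", "16:9", "9:16"]

def build_request(prompt: str, aspect_ratio: str, image_size: str = "2K") -> dict:
    # One REST request body per target aspect ratio
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            "imageConfig": {"aspectRatio": aspect_ratio, "imageSize": image_size},
        },
    }

requests = [build_request("Product launch banner", r) for r in ASPECT_RATIOS]
```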

FAQ

What response formats does the model return?

Responses can include TEXT and IMAGE parts. Over REST, images arrive as base64-encoded inlineData with a mimeType (e.g. image/png); the Python SDK exposes the same content as inline_data with the bytes already decoded.

How do I maintain character consistency across images?

Upload a reference image as inline_data and include it in the same generation call; the model can keep character consistency for up to five subjects.