imagen skill

This skill generates images using Google Gemini to create UI mockups, icons, diagrams, and illustrations for any project.

npx playbooks add skill sanjay3290/ai-skills --skill imagen

SKILL.md

---
name: imagen
description: |
  Generate images using Google Gemini's image generation capabilities.
  Use this skill when the user needs to create, generate, or produce images
  for any purpose including UI mockups, icons, illustrations, diagrams,
  concept art, placeholder images, or visual representations.
license: Apache-2.0
metadata:
  author: sanjay3290
  version: "1.0"
---

# Imagen - AI Image Generation Skill

## Overview

This skill generates images using Google Gemini's image generation model (`gemini-3-pro-image-preview`). It creates images on demand during any Claude Code session, whether you're building frontend UIs, writing documentation, or visualizing concepts.

**Cross-Platform**: Works on Windows, macOS, and Linux.

## When to Use This Skill

Automatically activate this skill when:
- The user requests image generation (e.g., "generate an image of...", "create a picture...")
- Frontend development requires placeholder or actual images
- Documentation needs illustrations or diagrams
- Concepts, architectures, or ideas need to be visualized
- Icons, logos, or UI assets need to be created
- An AI-generated image would otherwise help with the task

## How It Works

1. Takes a text prompt describing the desired image
2. Calls Google Gemini API with image generation configuration
3. Saves the generated image to a specified location (defaults to current directory)
4. Returns the file path for use in your project
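
The four steps above can be sketched in Python with only the standard library. This is an illustration, not the contents of `scripts/generate_image.py`: the endpoint URL, the `x-goog-api-key` header, and the `responseModalities` and `inlineData` field names are assumptions based on the public Gemini REST API and should be checked against its current documentation. The function names and the default file name are hypothetical.

```python
import base64
import json
import os
import urllib.request
from pathlib import Path

MODEL = "gemini-3-pro-image-preview"
# Assumed endpoint shape for the Gemini REST API; verify before relying on it.
API_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_request_body(prompt):
    """Steps 1-2: wrap the text prompt in an image request (assumed schema)."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }


def extract_image_bytes(response):
    """Return the first inline image in the response, decoded from base64."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                return base64.b64decode(inline["data"])
    raise ValueError("response contained no image data")


def save_image(image_bytes, output_path="generated_image.png"):
    """Steps 3-4: write the image, creating parent folders, and return its path."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(image_bytes)
    return str(path)


def generate_image(prompt, output_path="generated_image.png"):
    """Run all four steps; needs network access and GEMINI_API_KEY."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request_body(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": os.environ["GEMINI_API_KEY"],
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        response = json.load(reply)
    return save_image(extract_image_bytes(response), output_path)
```

Keeping the request building and response parsing in separate functions means both can be exercised without a network call or an API key.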

## Usage

### Python (Cross-Platform - Recommended)

```bash
# Basic usage
python scripts/generate_image.py "A futuristic city skyline at sunset"

# With custom output path
python scripts/generate_image.py "A minimalist app icon for a music player" "./assets/icons/music-icon.png"

# With custom size
python scripts/generate_image.py --size 2K "High resolution landscape" "./wallpaper.png"
```
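
A command line of this shape (a required prompt, an optional output path, and an optional `--size` flag) could be parsed with `argparse` roughly as follows. This is a sketch of one plausible approach, not the actual argument handling in `scripts/generate_image.py`; the attribute names `prompt`, `output`, and `size` are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse a prompt, an optional output path, and an optional --size flag."""
    parser = argparse.ArgumentParser(
        description="Generate an image from a text prompt."
    )
    parser.add_argument("prompt", help="text description of the desired image")
    parser.add_argument(
        "output", nargs="?", default=None,
        help="where to save the image (omit to use the script's default)",
    )
    parser.add_argument("--size", default=None, help="size label such as 2K")
    return parser.parse_args(argv)


args = parse_args(["--size", "2K", "High resolution landscape", "./wallpaper.png"])
print(args.prompt, args.output, args.size)
# prints: High resolution landscape ./wallpaper.png 2K
```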

## Requirements

- `GEMINI_API_KEY` environment variable must be set
- Python 3.6+ (uses standard library only, no pip install needed)
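
Both requirements can be checked up front before any API call is attempted. The helper below is a hypothetical pre-flight check, not part of the skill; it takes the environment and version as parameters so it can be tested without changing the real environment.

```python
import os
import sys


def check_requirements(env=None, version=None):
    """Return a list of problems; an empty list means the skill can run."""
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY environment variable is not set")
    if tuple(version[:2]) < (3, 6):
        problems.append("Python 3.6 or newer is required")
    return problems


for problem in check_requirements():
    print("Requirement not met:", problem)
```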

## Output

Generated images are saved as PNG files. The script returns:
- Success: Path to the generated image
- Failure: Error message with details
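
A caller that receives a path may want to confirm the file really is a PNG before referencing it in a project. Every PNG file begins with the same eight-byte signature, so a check can be as small as the sketch below; `is_png` is a hypothetical helper, not something the skill provides.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def is_png(path):
    """Return True if the file at `path` starts with the PNG signature."""
    with open(path, "rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE
```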

## Examples

### Frontend Development
```
User: "I need a hero image for my landing page - something abstract and tech-focused"
-> Generates and saves image, provides path for use in HTML/CSS
```

### Documentation
```
User: "Create a diagram showing microservices architecture"
-> Generates visual representation, ready for README or docs
```

### UI Assets
```
User: "Generate a placeholder avatar image for the user profile component"
-> Creates image in appropriate size for component use
```

Overview

This skill generates images using Google Gemini's image generation model to produce PNG assets for UI mockups, icons, illustrations, diagrams, concept art, and placeholders. It integrates into coding sessions to create visual assets on demand and returns file paths for immediate use in projects. It runs on Windows, macOS, and Linux, and needs only a Python script and the GEMINI_API_KEY environment variable.

How this skill works

You provide a text prompt describing the desired image and optional parameters (output path, size). The skill calls the Google Gemini image generation API (gemini-3-pro-image-preview), saves the returned PNG to the specified location, and returns the file path or an error message. The API key is read from the GEMINI_API_KEY environment variable, and a standard-library Python script handles local execution.

When to use it

- Generate hero images, backgrounds, or concept art for web and app design
- Create icons, avatars, or UI assets when building frontend components
- Produce diagrams or visual representations for documentation and READMEs
- Generate placeholder images for layouts, prototypes, and demos
- Create visual ideas, mood boards, or quick concept explorations

Best practices

- Craft clear, specific prompts including style, color palette, and composition for predictable results
- Specify output size or resolution for production vs placeholder needs
- Save generated images into project asset folders and use version control for selected assets only
- Use iterative prompts: adjust and re-run to refine details or try multiple variations
- Keep API keys secure by storing GEMINI_API_KEY in environment variables and not in source files
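
One way to keep prompts specific and repeatable is to assemble them from named parts. The helper below is a hypothetical sketch, not part of the skill; the phrasing it produces is only one option, and the result is passed to the script as the prompt argument.

```python
def build_prompt(subject, style=None, palette=None, composition=None):
    """Join the subject with optional style, palette, and composition details."""
    details = [subject.strip()]
    if style:
        details.append(f"in a {style} style")
    if palette:
        details.append(f"using a {palette} color palette")
    if composition:
        details.append(f"{composition} composition")
    return ", ".join(details)


print(build_prompt(
    "A minimalist app icon for a music player",
    style="flat vector",
    palette="teal and white",
    composition="centered",
))
```

Changing one argument at a time and re-running makes it easier to see which detail caused a difference between variations.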

Example use cases

- Frontend development: generate a hero image and return a path for HTML/CSS inclusion
- Documentation: create a microservices architecture diagram as a PNG for the docs folder
- UI assets: produce several icon variations for a toolbar or mobile app
- Prototyping: create placeholder avatars and landscape images for mockups
- Concept art: generate multiple mood variations to choose a direction for visual design

FAQ

What output formats are produced?

Generated images are saved as PNG files.

How do I control image resolution?

Pass the --size flag to the script (e.g., --size 2K), or include resolution details in the prompt.