
gemini-image skill



This skill helps you generate AI images from text or existing visuals by constructing prompts and returning image URLs for creative tasks.

npx playbooks add skill bahayonghang/my-claude-code-settings --skill gemini-image

Review the files below or copy the command above to add this skill to your agents.

Files (4)

SKILL.md


---
name: gemini-image
description: Generate images using AI when the user wants to create pictures, draw, paint, or generate artwork. Supports text-to-image and image-to-image generation.
category: content-creation
tags: [image-generation, ai-art, text-to-image, gemini]
---

# Gemini Image Generation

Use this skill when the user expresses intent to generate images (e.g., "draw a...", "generate an image...", "create a picture...").

## Steps

### 1. Read Configuration
- Read `config/secrets.md` to get the API key

### 2. Construct Prompt

| Mode | Prompt Format | Example |
|------|---------------|---------|
| Text-to-Image | `description text` | `a cute orange cat` |
| Image-to-Image | `image_URL description` | `https://xxx.jpg draw in a similar style` |
| Multi-Image Reference | `URL1 URL2 description` | `https://a.jpg https://b.jpg merge these two` |

For image-to-image, upload local images first. See `tips/image-upload.md`.
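
The three prompt formats in the table reduce to one rule: reference URLs first, then the description, separated by spaces. A minimal Python sketch of that rule follows. The function name `build_prompt` and the `http(s)` check are illustrative assumptions, not part of the skill.

```python
from typing import Sequence


def build_prompt(description: str, image_urls: Sequence[str] = ()) -> str:
    """Join reference image URLs and the description with single spaces.

    Hypothetical helper: the skill only specifies the resulting string format.
    """
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    for url in image_urls:
        # Assumption: references must be http(s) URLs, as in the table's
        # examples. Local files have to be uploaded first.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an accessible URL: {url!r}")
    return " ".join([*image_urls, description])


print(build_prompt("a cute orange cat"))
print(build_prompt("merge these two", ["https://a.jpg", "https://b.jpg"]))
```

With no URLs the result is a text-to-image prompt; one URL gives image-to-image; two or more give a multi-image reference.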

### 3. Call API

```bash
# Replace API_KEY, model_name, prompt_text, and aspect_ratio with real values before running.
curl -s -X POST "https://api.apicore.ai/v1/images/generations" \
  -H "Authorization: Bearer API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model_name",
    "prompt": "prompt_text",
    "size": "aspect_ratio",
    "n": 1
  }'
```
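
For callers that prefer not to assemble JSON by hand, the same request can be sketched in Python with the standard library. Serializing the body with `json.dumps` keeps prompts that contain quotes or non-ASCII text valid. The helper name `build_request` is hypothetical, and the argument values are the same placeholders used in the curl call. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the curl call above.
API_URL = "https://api.apicore.ai/v1/images/generations"


def build_request(api_key: str, model: str, prompt: str,
                  size: str, n: int = 1) -> urllib.request.Request:
    """Build (but do not send) the POST request for one generation call."""
    payload = {"model": model, "prompt": prompt, "size": size, "n": n}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, as in the curl example; nothing is sent here.
req = build_request("API_KEY", "model_name", 'a sign that says "open"', "aspect_ratio")
print(req.get_method(), req.full_url)
```

To send it, pass `req` to `urllib.request.urlopen` and decode the response body as JSON.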

### 4. Return Result

Extract `data[0].url` from the response and return it to the user.
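
A minimal Python sketch of the extraction, assuming the response body is a JSON object whose `data` array holds objects with a `url` field. The function name `extract_first_url` and the sample response are made up for illustration; the real payload may carry additional fields or report errors in a different shape.

```python
import json


def extract_first_url(response_text: str) -> str:
    """Return data[0].url from a JSON response, or raise ValueError."""
    body = json.loads(response_text)
    data = body.get("data") if isinstance(body, dict) else None
    if not isinstance(data, list) or not data:
        raise ValueError("response has no 'data' entries")
    first = data[0]
    url = first.get("url") if isinstance(first, dict) else None
    if not isinstance(url, str) or not url:
        raise ValueError("first entry has no 'url'")
    return url


# Made-up sample response, shaped only by the data[0].url path named above.
sample = '{"data": [{"url": "https://example.com/generated.png"}]}'
print(extract_first_url(sample))
```

Raising an error on a missing field stops the skill from handing the user an empty link when the call fails.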

## Reference Docs

- `tips/image-upload.md` - Image upload methods
- `tips/chinese-text.md` - Chinese text handling tips

Overview

This skill generates images with an AI image model for text-to-image and image-to-image workflows. It accepts a plain descriptive prompt, a source image plus instructions, or several reference images plus instructions, then returns the URL of the generated image. It is designed for quick creative iteration and direct API integration.

How this skill works

The skill builds a prompt that depends on the mode: a plain description for text-to-image, or one or more image URLs followed by instructions for image-to-image and multi-image references. It then sends a generation request to the image API with the model, prompt, size, and count parameters, authenticated with the configured API key. Finally, it parses the response and returns the first result URL to the user for download or preview.

When to use it

- You want an original illustration, painting, or concept art from text.
- You have a reference image and want a variation or style transfer.
- You want to merge or combine multiple images into a new composition.
- You need quick mockups, thumbnails, or creative assets for a project.

Best practices

- Use concise, descriptive prompts that include style, mood, and composition (e.g., "cinematic, 35mm, warm lighting").
- For image-to-image, upload source images first and include their accessible URLs in the prompt.
- Provide the aspect ratio or size explicitly to control framing and resolution.
- Request multiple candidates (n>1) when experimenting, so you can compare variations.
- Store your API key securely and avoid embedding it in client-side code.
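
When requesting several candidates, every URL is worth collecting, not just the first. The Python sketch below assumes that a request with `n > 1` returns one `data` entry per candidate, each with a `url` field. That shape is extrapolated from the `data[0].url` path and is not confirmed by the skill. The function name `candidate_urls` and the sample response are hypothetical.

```python
import json
from typing import List


def candidate_urls(response_text: str) -> List[str]:
    """Collect every url found in the response's data array, in order."""
    body = json.loads(response_text)
    entries = body.get("data", []) if isinstance(body, dict) else []
    if not isinstance(entries, list):
        return []
    # Assumption: each candidate is an object carrying a non-empty "url".
    return [e["url"] for e in entries if isinstance(e, dict) and e.get("url")]


# Made-up response for a request with n = 2.
sample = '{"data": [{"url": "https://example.com/1.png"}, {"url": "https://example.com/2.png"}]}'
for i, url in enumerate(candidate_urls(sample), start=1):
    print(i, url)
```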

Example use cases

- Generate a character portrait from text: "young explorer, watercolor, soft light".
- Create a variant of a product photo by supplying the original image URL and: "change background to white, modern studio lighting".
- Merge two concept sketches into a single composition using their URLs and: "combine styles, unify color palette".
- Produce social media artwork sized for a specific aspect ratio for quick campaign assets.

FAQ

What prompt formats are supported?

Plain descriptive text for text-to-image, one image URL plus description for image-to-image, or multiple URLs followed by instructions for multi-image references.
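
Because the three formats differ only in how many URLs lead the prompt, the mode can be read back from a prompt string. The Python sketch below illustrates this. The function name `split_prompt` is hypothetical, and treating only leading `http(s)` tokens as references is an assumption about how the prompt is interpreted.

```python
from typing import List, Tuple


def split_prompt(prompt: str) -> Tuple[str, List[str], str]:
    """Return (mode, leading image URLs, description) for a prompt string."""
    tokens = prompt.split()
    urls: List[str] = []
    # Only leading tokens count as references; a URL that appears later
    # in the text stays part of the description.
    while tokens and tokens[0].startswith(("http://", "https://")):
        urls.append(tokens.pop(0))
    description = " ".join(tokens)
    if not urls:
        mode = "text-to-image"
    elif len(urls) == 1:
        mode = "image-to-image"
    else:
        mode = "multi-image reference"
    return mode, urls, description


print(split_prompt("a cute orange cat"))
print(split_prompt("https://a.jpg https://b.jpg merge these two"))
```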

How do I use local images as references?

Upload local images first using one of the methods described in `tips/image-upload.md`, then include the resulting URL(s) in the prompt.

What part of the API response do I return to the user?

Extract the first generated image URL from the response payload (commonly data[0].url) and return it for preview or download.