This skill helps you generate AI images from text or existing visuals by constructing prompts and returning image URLs for creative tasks.
Install: `npx playbooks add skill bahayonghang/my-claude-code-settings --skill gemini-image`
---
name: gemini-image
description: Generate images using AI when user wants to create pictures, draw, paint, or generate artwork. Supports text-to-image and image-to-image generation.
category: content-creation
tags: [image-generation, ai-art, text-to-image, gemini]
---
# Gemini Image Generation
Use this skill when the user expresses intent to generate images (e.g., "draw a...", "generate an image...", "create a picture...").
## Steps
### 1. Read Configuration
- Read `config/secrets.md` to get the API key
### 2. Construct Prompt
| Mode | Prompt Format | Example |
|------|---------------|---------|
| Text-to-Image | `description text` | `a cute orange cat` |
| Image-to-Image | `image_URL description` | `https://xxx.jpg draw similar style` |
| Multi-Image Reference | `URL1 URL2 description` | `https://a.jpg https://b.jpg merge these two` |
For image-to-image, upload local images first. See `tips/image-upload.md`.
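The three formats in the table differ only in how many reference URLs precede the description, so prompt assembly can be sketched as a single shell function. `build_prompt` is a hypothetical helper name, not something the skill ships, and it assumes URLs and description are joined with single spaces, as in the examples in the table.

```bash
# build_prompt: hypothetical helper (not part of the skill).
# Usage: build_prompt DESCRIPTION [IMAGE_URL ...]
# Prints any reference URLs first, then the description, space-separated.
# The body runs in a subshell "( ... )" so its variables stay local.
build_prompt() (
  if [ "$#" -lt 1 ]; then
    echo "usage: build_prompt DESCRIPTION [IMAGE_URL ...]" >&2
    exit 2
  fi
  description=$1
  shift
  if [ "$#" -eq 0 ]; then
    # Text-to-image: the prompt is just the description.
    printf '%s\n' "$description"
  else
    # Image-to-image or multi-image: URLs come before the description.
    printf '%s %s\n' "$*" "$description"
  fi
)
```

For example, `build_prompt "merge these two" https://a.jpg https://b.jpg` prints `https://a.jpg https://b.jpg merge these two`.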
### 3. Call API
```bash
# Replace API_KEY, model_name, prompt_text, and aspect_ratio with real values.
curl -s -X POST "https://api.apicore.ai/v1/images/generations" \
  -H "Authorization: Bearer API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model_name",
    "prompt": "prompt_text",
    "size": "aspect_ratio",
    "n": 1
  }'
```
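Pasting a prompt straight into the `-d` body breaks the JSON as soon as the text contains a double quote or a backslash. The following is a minimal sketch of building the body more safely. `json_escape` and `build_payload` are hypothetical helper names, and the sketch assumes single-line values: it does not escape newlines or other control characters.

```bash
# json_escape: hypothetical helper. Escapes backslashes and double quotes
# so a single-line string can sit inside a JSON string literal.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload: hypothetical helper.
# Usage: build_payload MODEL PROMPT SIZE [N]
# Prints a request body with the same four fields as the curl call above.
build_payload() (
  n=${4:-1}
  printf '{"model":"%s","prompt":"%s","size":"%s","n":%d}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")" "$n"
)

# Possible use with the request above (placeholders kept as in the original):
#   curl -s -X POST "https://api.apicore.ai/v1/images/generations" \
#     -H "Authorization: Bearer API_KEY" \
#     -H "Content-Type: application/json" \
#     -d "$(build_payload "model_name" "$prompt" "aspect_ratio")"
```

Backslashes are escaped before quotes so that the backslashes introduced for the quotes are not doubled again.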
### 4. Return Result
Extract `data[0].url` from the response and return it to the user.
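Where `jq` is installed, `jq -r '.data[0].url'` is the more robust way to do this. As a dependency-free sketch, the first `"url"` field can also be pulled out with `grep` and `sed`. `extract_first_url` is a hypothetical helper name, and the approach assumes that the first `"url"` key in the response belongs to `data[0]` and that the URL contains no escaped quotes.

```bash
# extract_first_url: hypothetical helper (not part of the skill).
# Usage: extract_first_url RESPONSE_JSON
# Prints the value of the first "url" field, or fails if there is none.
# Note: grep -o is not POSIX, but GNU, BSD, and BusyBox grep support it.
extract_first_url() (
  url=$(printf '%s' "$1" \
    | grep -o '"url"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -n '1{s/^"url"[[:space:]]*:[[:space:]]*"//;s/"$//;p;}')
  if [ -z "$url" ]; then
    echo "no image URL found in response" >&2
    exit 1
  fi
  printf '%s\n' "$url"
)

# Possible use, with the curl output captured in a variable:
#   response=$(curl -s -X POST ... )
#   extract_first_url "$response"
```

The non-zero exit status on a missing URL lets a caller distinguish an error response from a successful generation before anything is returned to the user.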
## Reference Docs
- `tips/image-upload.md` - Image upload methods
- `tips/chinese-text.md` - Chinese text handling tips
This skill generates images with an AI image model for text-to-image and image-to-image workflows. It accepts simple descriptive prompts, multi-image references, or a source image plus instructions, then returns generated image URLs. It is designed for quick creative iterations and direct API integration.
The skill builds a prompt depending on the mode: plain description for text-to-image, one or more image URLs plus instructions for image-to-image or multi-image references. It sends a generation request to the image API with model, prompt, size, and count parameters using the configured API key. The response is parsed and the first result URL is returned to the user for download or preview.
What prompt formats are supported?
Plain descriptive text for text-to-image, one image URL plus description for image-to-image, or multiple URLs followed by instructions for multi-image references.
How do I use local images as references?
Upload each local image first, using one of the methods described in `tips/image-upload.md`, then include the resulting URL(s) in the prompt.
What part of the API response do I return to the user?
Extract the first generated image URL from the response payload (commonly `data[0].url`) and return it for preview or download.