image-generate skill

This skill generates high-quality hand-drawn or standard images from user prompts, with smart prompt optimization and auto-save features.

npx playbooks add skill qwenlm/qwen-code-examples --skill image-generate

SKILL.md
---
name: image-generation
description: Image generation skill based on Alibaba Cloud DashScope, supporting the creation of high-quality hand-drawn or standard images from user descriptions.
version: 1.0.0
license: MIT
---

# Image Generation Skill

This skill allows agents to automatically generate high-quality images (defaulting to hand-drawn style) based on user intent.

## Core Features

- **Smart Prompt Optimization**: Transforms simple user intent into detailed hand-drawn style prompts.
- **Fast Generation**: Uses the non-streaming interface, which returns the result in a single response, to shorten generation time.
- **Auto-Save**: Downloads each generated image to the local disk and saves its metadata and the raw API response alongside it.

## Prerequisites

Before using this skill, ensure that the `DASHSCOPE_API_KEY` environment variable is set:

```bash
export DASHSCOPE_API_KEY="Your API Key"
```
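
A script can fail early with a clear message when the key is missing, instead of surfacing an authentication error from the API later. The following is a minimal sketch; `requireApiKey` is a hypothetical helper, not part of the skill's script.

```javascript
// Hypothetical helper: return the API key, or throw before any request is made.
// `env` defaults to process.env and can be replaced in tests.
function requireApiKey(env = process.env) {
  const key = env.DASHSCOPE_API_KEY;
  if (!key || key.trim() === "") {
    throw new Error(
      "DASHSCOPE_API_KEY is not set; export it before running the script."
    );
  }
  return key.trim();
}

// Usage (at the top of a script):
//   const apiKey = requireApiKey();
```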

## User Guide

### 1. Refine the Prompt

Refine the user's original intent into a prompt suitable for image generation. For hand-drawn versions of architecture diagrams or flowcharts, include keywords such as "hand-drawn", "sketch", or "architectural drawing".
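
In practice the agent rewrites the prompt itself, but the keyword step can be illustrated mechanically. The sketch below appends any style keywords the intent does not already contain. `refinePrompt` and `HAND_DRAWN_KEYWORDS` are hypothetical names, and the keyword list is only the three keywords mentioned above.

```javascript
// Assumed keyword list: only the three style keywords this guide names.
const HAND_DRAWN_KEYWORDS = ["hand-drawn", "sketch", "architectural drawing"];

// Hypothetical helper: append the style keywords that the intent is missing.
function refinePrompt(intent, keywords = HAND_DRAWN_KEYWORDS) {
  // Drop trailing periods and whitespace so keywords can be appended cleanly.
  const base = intent.trim().replace(/[.\s]+$/, "");
  const lower = base.toLowerCase();
  const missing = keywords.filter((k) => !lower.includes(k.toLowerCase()));
  return missing.length === 0 ? `${base}.` : `${base}, ${missing.join(", ")}.`;
}

// Usage:
//   refinePrompt("An architecture diagram of an AI coding assistant.");
```

A real refinement would also describe composition, layout, and background; this only covers the style keywords.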

### 2. Run the Script

Use the following command to call the generation script:

```bash
node skills/image-generate/scripts/generate_image.js "Your detailed prompt"
```

### 3. View Results

When the script completes, it has written the following files to the current directory:

- `image_YYYY-MM-DDTHH-mm-ss.png`: The generated image file.
- `metadata_YYYY-MM-DDTHH-mm-ss.json`: Metadata including prompt, file size, and duration.
- `response_YYYY-MM-DDTHH-mm-ss.json`: Raw API response data (for debugging).
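
The three files share one timestamp, so a single run can be matched up later. The sketch below shows one way such files could be written. `saveRun` and the metadata field names (`fileSizeBytes`, `durationMs`) are hypothetical, and the timestamp is assumed to be UTC derived from `toISOString()`; the actual script may use local time or different field names.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: persist one finished run as three timestamped files.
// imageBytes is the downloaded image (a Buffer); response is the parsed API reply.
function saveRun({ prompt, imageBytes, response, startedAt,
                   finishedAt = new Date(), dir = process.cwd() }) {
  // "2024-05-01T12:34:56.789Z" -> "2024-05-01T12-34-56" (assumed UTC).
  const stamp = finishedAt.toISOString().slice(0, 19).replace(/:/g, "-");
  const files = {
    image: path.join(dir, `image_${stamp}.png`),
    metadata: path.join(dir, `metadata_${stamp}.json`),
    response: path.join(dir, `response_${stamp}.json`),
  };
  const metadata = {
    prompt,
    fileSizeBytes: imageBytes.length,
    durationMs: finishedAt.getTime() - startedAt.getTime(),
  };
  fs.writeFileSync(files.image, imageBytes);
  fs.writeFileSync(files.metadata, JSON.stringify(metadata, null, 2));
  fs.writeFileSync(files.response, JSON.stringify(response, null, 2));
  return { files, metadata };
}
```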

## Example

**User Intent**: "Help me draw an architecture diagram of an AI coding assistant."

**Recommended Prompt**: "A detailed hand-drawn architectural diagram of an AI coding assistant, showing the interaction between the user, the IDE, and the LLM, technical sketch style, clean lines, white background."

**Execution Command**:

```bash
node skills/image-generate/scripts/generate_image.js "A detailed hand-drawn architectural diagram of an AI coding assistant, showing the interaction between the user, the IDE, and the LLM, technical sketch style, clean lines, white background."
```

Overview

This skill generates high-quality images from user descriptions using Alibaba Cloud DashScope, with a default hand-drawn aesthetic. It converts simple intents into detailed prompts, produces images quickly, and auto-saves outputs and metadata for easy retrieval and debugging.

How this skill works

The skill refines a short user intent into a detailed, style-aware prompt (e.g., hand-drawn, sketch, architectural). It calls DashScope's non-streaming generation API for faster results, downloads the image locally, and writes metadata and raw API responses to timestamped files for traceability. Scripts are provided to run generation from the command line.
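
The order of these steps can be sketched as a small pipeline. Every step here (`refine`, `generate`, `download`, `save`) is a hypothetical injected function standing in for the real logic; nothing below reflects the actual DashScope API or the internals of the skill's script.

```javascript
// Hypothetical orchestration of the four steps described above.
// Each step is passed in, so the sketch makes no claim about the real API.
async function runPipeline(intent, { refine, generate, download, save }) {
  const startedAt = Date.now();
  const prompt = refine(intent);               // 1. intent -> detailed prompt
  const response = await generate(prompt);     // 2. one non-streaming API call
  const imageBytes = await download(response); // 3. fetch the generated image
  const durationMs = Date.now() - startedAt;
  // 4. persist the image, metadata, and raw response
  return save({ prompt, response, imageBytes, durationMs });
}
```

Injecting the steps keeps the flow testable with stand-ins, for example a `generate` that returns a canned response.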

When to use it

  • Create hand-drawn style diagrams, sketches, or architectural flowcharts from a short description.
  • Generate quick visual mockups or conceptual illustrations for documentation or presentations.
  • Produce clean, white-background technical illustrations for reports or tutorials.
  • Automate bulk generation where metadata and raw responses must be logged for auditing.
  • Prototype UI/UX concepts with consistent hand-drawn or standard rendering styles.

Best practices

  • Include explicit style keywords in the prompt (e.g., hand-drawn, sketch, architectural drawing) to get the intended look.
  • Be specific about composition and elements: mention subjects, layout, perspective, and background.
  • Use the provided timestamped auto-save to keep image files, metadata, and API responses for debugging.
  • Test prompts iteratively and keep a small library of effective templates for recurring requests.
  • Set DASHSCOPE_API_KEY as an environment variable before running scripts to avoid runtime failures.

Example use cases

  • Generate a hand-drawn architecture diagram showing interactions between a user, IDE, and LLM for a design doc.
  • Create conceptual illustrations for onboarding slides with a consistent sketch style.
  • Produce simple technical flowcharts and export accompanying metadata for change tracking.
  • Batch-generate visual assets while logging API responses for QA and reproducibility.

FAQ

What environment setup is required?

Set the DASHSCOPE_API_KEY environment variable before running scripts so the tool can authenticate with the DashScope API.

How are generated files organized?

Each run saves three timestamped files in the current directory: the PNG image, a metadata JSON, and the raw API response JSON for debugging.
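
Because the three files of a run share a timestamp, they can be regrouped from a directory listing. The sketch below assumes the file names follow the `image_`, `metadata_`, and `response_` patterns listed earlier exactly; `groupRuns` is a hypothetical helper.

```javascript
// Hypothetical helper: group file names into runs keyed by their shared timestamp.
function groupRuns(fileNames) {
  const pattern =
    /^(image|metadata|response)_(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})\.(png|json)$/;
  const runs = new Map();
  for (const name of fileNames) {
    const match = pattern.exec(name);
    if (!match) continue; // ignore files that are not skill output
    const [, kind, stamp] = match;
    if (!runs.has(stamp)) runs.set(stamp, {});
    runs.get(stamp)[kind] = name;
  }
  return runs;
}

// Usage:
//   const runs = groupRuns(require("node:fs").readdirSync("."));
```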