html-to-image skill

/examples/skill_agent/skills/html-to-image

This skill renders HTML content into images by driving agent-browser and capturing screenshots for ready-to-publish visuals.

npx playbooks add skill inclusionai/aworld --skill html-to-image

Review the files below or copy the command above to add this skill to your agents.

SKILL.md

---
name: html-to-image
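description: HTML-to-image skill - renders an HTML file or HTML content through agent-browser and captures it as an image. Suited to generating infographics, social media images, data visualization screenshots, and similar.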
---

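# HTML to Image (html-to-image)

## Overview

Renders an HTML file or HTML content into an image through agent-browser. Typical usage: Claude generates polished HTML → this skill takes a screenshot → you get an image that can be published directly.

## Tool paths

- Script: `.claude/skills/html-to-image/html_to_image.sh`
- Dependencies: `agent-browser` (with CDP already connected), `python3`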

## Usage

```bash
./html_to_image.sh -o <output> (-f <html_file> | -c <html_content>) [-w <width>] [-e <height>] [-p <cdp_port>] [--full | --no-full]
```

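### Options

| Option | Description | Required | Default |
|--------|-------------|----------|---------|
| `-o` | Output image path (.png) | Yes | - |
| `-f` | HTML file path (mutually exclusive with `-c`) | One of `-f`/`-c` | - |
| `-c` | HTML content string (mutually exclusive with `-f`) | One of `-f`/`-c` | - |
| `-w` | Viewport width | No | 1080 |
| `-e` | Viewport height (full-page screenshot if omitted) | No | - |
| `-p` | CDP port | No | 9222 |
| `--full` | Full-page screenshot (ignores the viewport height limit) | No | Enabled by default |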

### Examples

```bash
# Screenshot from an HTML file
./html_to_image.sh \
  -f card.html -o card.png

# Pass HTML content directly
./html_to_image.sh \
  -c '<html><body><h1>Hello</h1></body></html>' \
  -o hello.png

# Set the width (fits mobile screen sizes)
./html_to_image.sh \
  -f infographic.html -o output.png -w 750

# Fixed-viewport screenshot (not full page)
./html_to_image.sh \
  -f page.html -o output.png -w 1080 -e 1920 --no-full
```

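### Typical workflow

1. Claude generates polished HTML from the content (infographics, cards, and so on)
2. Use this skill to capture it as a PNG
3. Pass the screenshot to `xhs-publisher` to publish it to Xiaohongshu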

```bash
# Generate the image
./html_to_image.sh -f card.html -o card.png

# Publish to Xiaohongshu
./.claude/skills/xhs-publisher/publish_xhs.sh -t "Title" -c "Body text" -i card.png
```
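
The two commands above can be chained so that publishing only runs once a screenshot has actually been written. The following is a minimal sketch, not part of either skill: `render_and_publish` and the `RENDER`/`PUBLISH` variables are hypothetical names, and the flags are the ones shown in the workflow example.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the two documented commands.
# RENDER and PUBLISH default to the paths used in the workflow example;
# override them if your layout differs.
RENDER="${RENDER:-./html_to_image.sh}"
PUBLISH="${PUBLISH:-./.claude/skills/xhs-publisher/publish_xhs.sh}"

render_and_publish() {
  local html="$1" png="$2" title="$3" body="$4"

  # Step 1: render. Stop here if the screenshot command fails.
  "$RENDER" -f "$html" -o "$png" || { echo "render failed: $html" >&2; return 1; }

  # Step 2: make sure a non-empty image file exists before publishing.
  [ -s "$png" ] || { echo "no image written to $png" >&2; return 1; }

  # Step 3: publish with the flags from the workflow example.
  "$PUBLISH" -t "$title" -c "$body" -i "$png"
}
```

A call such as `render_and_publish card.html card.png "Title" "Body text"` then mirrors the two-step example while failing early when the render step breaks.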

Overview

This skill converts HTML files or inline HTML content into PNG images by rendering them in an agent-connected browser and taking a screenshot. It is designed to turn generated HTML (cards, infographics, visualizations) into publishable images quickly. The tool is script-driven and talks to the browser over a Chrome DevTools Protocol (CDP) connection.

How this skill works

The script loads given HTML (from a file or a content string) into an agent-browser instance connected via CDP, sets a viewport size, and captures a screenshot. You can request a viewport-limited capture or a full-page screenshot. The output is a PNG file written to the specified path.

When to use it

Generate shareable images from programmatically produced HTML
Create social media cards or blog visuals from HTML templates
Capture data visualizations rendered in-browser for reports
Batch-convert HTML snippets into images for publishing workflows
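
For the batch-conversion case, a loop over a directory is usually enough. This is a sketch under assumptions: `batch_convert` and the `CONVERTER` variable are hypothetical, and only the documented `-f`, `-o`, and `-w` options are used.

```bash
#!/usr/bin/env bash
# Hypothetical batch wrapper; CONVERTER defaults to the documented script name.
CONVERTER="${CONVERTER:-./html_to_image.sh}"

# Usage: batch_convert <src_dir> <out_dir> [width]
batch_convert() {
  local src_dir="$1" out_dir="$2" width="${3:-1080}"
  local html name failed=0

  mkdir -p "$out_dir" || return 1

  for html in "$src_dir"/*.html; do
    [ -e "$html" ] || continue            # the glob matched nothing
    name="$(basename "$html" .html)"
    if ! "$CONVERTER" -f "$html" -o "$out_dir/$name.png" -w "$width"; then
      echo "failed: $html" >&2
      failed=$((failed + 1))
    fi
  done

  # Succeed only if every file converted.
  [ "$failed" -eq 0 ]
}
```

The loop keeps going after a failed file and reports the failure through the exit status, so one broken snippet does not block the rest of the batch.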

Best practices

Provide complete HTML including styles and fonts to ensure consistent rendering
Prefer full-page screenshots for long content; use viewport height for fixed-size artboards
Test viewport width and height locally to match target platform aspect ratios
Ensure agent-browser/CDP is running and accessible on the specified port before invoking the script
Use semantic, self-contained HTML so screenshots do not depend on external network resources
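
One rough way to check the self-contained advice before capturing is to list remote URLs in the HTML. The helper below is a hypothetical sketch: it is a grep heuristic rather than an HTML parser, so it also flags ordinary links and misses CSS `url(...)` and `@import` references.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: print src=/href= attributes that point at
# http://, https://, or protocol-relative (//host) URLs.
# Exit status is 0 when at least one remote reference is found, 1 otherwise.
find_external_refs() {
  grep -Eo "(src|href)=[\"']?(https?:)?//[^\"' >]+" "$1"
}
```

For example, `find_external_refs card.html && echo "card.html depends on remote resources" >&2` prints a warning only when something was found.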

Example use cases

Turn Claude-generated HTML templates into PNGs for social posts
Snapshot interactive charts or dashboards into static images for documents
Produce thumbnails or preview images for web content pipelines
Create sized images for mobile feeds by setting a mobile-width viewport
Integrate into a publish workflow: generate HTML → capture PNG → upload to a platform

FAQ

What inputs are supported?

You can provide an HTML file path or pass HTML content as a string; one of these is required.
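
The input rule can be checked before the script is called. The sketch below is hypothetical and is not the script's own code: `check_inputs` is an invented name, and rejecting `-f` and `-c` together is an assumption based on the options being documented as alternatives.

```bash
#!/usr/bin/env bash
# Hypothetical validation of the documented rule:
# -o is required, and exactly one of -f / -c must be given.
check_inputs() {
  local out="" file="" content=""

  while [ $# -gt 0 ]; do
    case "$1" in
      -o|-f|-c)
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        case "$1" in
          -o) out="$2" ;;
          -f) file="$2" ;;
          -c) content="$2" ;;
        esac
        shift 2 ;;
      *) shift ;;                         # other options are ignored here
    esac
  done

  [ -n "$out" ] || { echo "-o <output> is required" >&2; return 2; }
  if [ -n "$file" ] && [ -n "$content" ]; then
    echo "use either -f or -c, not both" >&2; return 2
  fi
  if [ -z "$file" ] && [ -z "$content" ]; then
    echo "one of -f or -c is required" >&2; return 2
  fi
  if [ -n "$file" ] && [ ! -r "$file" ]; then
    echo "cannot read $file" >&2; return 2
  fi
  return 0
}
```

Passing the same arguments to `check_inputs` first, for example `check_inputs "$@" && ./html_to_image.sh "$@"`, surfaces a missing or conflicting input before the browser is involved.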

How do I control output dimensions?

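Use -w to set the viewport width and -e to set the viewport height. Full-page capture (--full) is on by default, so omitting -e captures the full page height; the fixed-viewport example passes -e together with --no-full.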
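
The two sizing modes can be wrapped in a small helper that emits the matching flags. This is a hypothetical sketch: `size_flags` is an invented name, and `--no-full` is taken from the fixed-viewport example rather than from the options table.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the size-related flags for a capture.
# Usage: size_flags <width> [height]
size_flags() {
  local width="$1" height="${2:-}"

  # Reject anything that is not a plain positive integer.
  case "$width" in ''|*[!0-9]*) echo "invalid width: $width" >&2; return 2 ;; esac
  case "$height" in *[!0-9]*) echo "invalid height: $height" >&2; return 2 ;; esac

  if [ -n "$height" ]; then
    # Fixed artboard: set both dimensions and turn full-page capture off.
    printf '%s\n' "-w $width -e $height --no-full"
  else
    # Width only: let the capture extend to the full page height.
    printf '%s\n' "-w $width --full"
  fi
}
```

Because the output contains only digits and flags, it can be expanded unquoted on purpose: `./html_to_image.sh -f page.html -o output.png $(size_flags 1080 1920)`.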

What if the browser is not reachable?

Confirm agent-browser is running and CDP is listening on the port (default 9222). Adjust -p if using a different port.
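
A quick probe can confirm the port before running the script. The sketch below assumes `curl` is installed and that the browser listens on localhost; `cdp_reachable` is a hypothetical name. It relies on the `/json/version` HTTP endpoint that Chromium-based browsers expose when remote debugging is enabled.

```bash
#!/usr/bin/env bash
# Hypothetical reachability probe for the CDP port (default 9222).
# Returns 0 if the browser answers on /json/version within 2 seconds.
cdp_reachable() {
  local port="${1:-9222}"
  curl --silent --fail --max-time 2 \
    "http://127.0.0.1:${port}/json/version" >/dev/null 2>&1
}
```

A guard such as `cdp_reachable 9222 || echo "CDP is not reachable on port 9222" >&2` placed before the screenshot command gives a clearer message than a failed capture.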