
puppeteer skill


This skill automates browser tasks with Puppeteer for web scraping, PDF generation, screenshots, and automated testing.

npx playbooks add skill aidotnet/moyucode --skill puppeteer


SKILL.md
---
name: puppeteer
description: Browser automation and PDF generation with Puppeteer (Google). Supports headless Chrome control for web scraping, screenshots, PDF generation, and automated testing.
metadata:
  short-description: Browser automation and PDF generation
source:
  repository: https://github.com/puppeteer/puppeteer
  license: Apache-2.0
  stars: 89k+
---

# Puppeteer Tool

## Description
Headless Chrome/Chromium automation for PDF generation, screenshots, web scraping, and testing.

## Source
- Repository: [puppeteer/puppeteer](https://github.com/puppeteer/puppeteer)
- License: Apache-2.0
- Maintainer: Google

## Installation

```bash
npm install puppeteer
```

## Usage Examples

### Generate PDF from HTML

```typescript
import puppeteer from 'puppeteer';

async function generatePDF(html: string, outputPath: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // Wait until network activity has stopped so linked assets are loaded
    await page.setContent(html, { waitUntil: 'networkidle0' });

    await page.pdf({
      path: outputPath,
      format: 'A4',
      margin: { top: '20mm', right: '20mm', bottom: '20mm', left: '20mm' },
      printBackground: true,
    });
  } finally {
    // Close the browser even if rendering fails
    await browser.close();
  }
}

// Usage
const html = `
  <html>
    <head><style>body { font-family: Arial; }</style></head>
    <body><h1>Invoice #001</h1><p>Total: $100.00</p></body>
  </html>
`;
await generatePDF(html, 'invoice.pdf');
```

### Take Screenshot

```typescript
import puppeteer from 'puppeteer';

async function takeScreenshot(url: string, outputPath: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // Fix the viewport so captures are reproducible
    await page.setViewport({ width: 1920, height: 1080 });
    await page.goto(url, { waitUntil: 'networkidle2' });

    await page.screenshot({
      path: outputPath,
      fullPage: true,
      type: 'png',
    });
  } finally {
    // Close the browser even if navigation or capture fails
    await browser.close();
  }
}
```

### Web Scraping

```typescript
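import puppeteer from 'puppeteer';

async function scrapeData(url: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // 'domcontentloaded' returns early; for client-rendered pages, wait for
    // the content first, e.g. await page.waitForSelector('.product')
    await page.goto(url, { waitUntil: 'domcontentloaded' });

    // Runs inside the page context and returns serializable data
    return await page.evaluate(() => {
      const items = document.querySelectorAll('.product');
      return Array.from(items).map(item => ({
        title: item.querySelector('h2')?.textContent?.trim(),
        price: item.querySelector('.price')?.textContent?.trim(),
      }));
    });
  } finally {
    // Close the browser even if scraping fails
    await browser.close();
  }
}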
```

### Form Automation

```typescript
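import puppeteer from 'puppeteer';

// formData maps CSS selectors to the text typed into each matching field
async function submitForm(url: string, formData: Record<string, string>) {
  const browser = await puppeteer.launch({ headless: false });
  try {
    const page = await browser.newPage();

    await page.goto(url);

    // Fill form fields
    for (const [selector, value] of Object.entries(formData)) {
      await page.type(selector, value);
    }

    // Submit: start waiting for navigation before clicking, otherwise a
    // fast navigation can finish before waitForNavigation is registered
    await Promise.all([
      page.waitForNavigation(),
      page.click('button[type="submit"]'),
    ]);
  } finally {
    // Close the browser even if a step fails
    await browser.close();
  }
}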
```

## PDF Options

```typescript
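// Simplified summary of commonly used page.pdf() options, not the full type
interface PDFOptions {
  path?: string;
  scale?: number;                    // 0.1 - 2, default 1
  displayHeaderFooter?: boolean;
  headerTemplate?: string;
  footerTemplate?: string;
  printBackground?: boolean;
  landscape?: boolean;
  pageRanges?: string;               // '1-5, 8, 11-13'
  format?: 'Letter' | 'Legal' | 'A4' | 'A3';  // other paper sizes are also accepted
  width?: string | number;
  height?: string | number;
  margin?: {
    top?: string | number;
    right?: string | number;
    bottom?: string | number;
    left?: string | number;
  };
}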
```
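
### Header and Footer Sketch

The header and footer options above are easiest to understand with a concrete value. The following is a minimal sketch, not part of the original skill: `buildHeaderFooterOptions`, `escapeHtml`, and the `HeaderFooterPdfOptions` shape are hypothetical names introduced for illustration. It relies on Puppeteer filling in elements whose class is `pageNumber` or `totalPages` when the template is printed. The explicit font size and the larger top and bottom margins are assumptions that usually help the templates stay visible.

```typescript
// Local shape covering only the options used in this sketch.
interface HeaderFooterPdfOptions {
  path: string;
  format: 'A4';
  printBackground: boolean;
  displayHeaderFooter: boolean;
  headerTemplate: string;
  footerTemplate: string;
  margin: { top: string; right: string; bottom: string; left: string };
}

// Escape text before embedding it in an HTML template.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: options for a PDF with a title header and a
// "Page X of Y" footer.
function buildHeaderFooterOptions(outputPath: string, title: string): HeaderFooterPdfOptions {
  const style = 'font-size: 9px; width: 100%; text-align: center;';
  return {
    path: outputPath,
    format: 'A4',
    printBackground: true,
    displayHeaderFooter: true,
    headerTemplate: `<div style="${style}">${escapeHtml(title)}</div>`,
    // Elements with these class names are filled in at print time.
    footerTemplate:
      `<div style="${style}">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>`,
    margin: { top: '25mm', right: '20mm', bottom: '25mm', left: '20mm' },
  };
}

// Usage (inside an async function with an open page):
//   await page.pdf(buildHeaderFooterOptions('invoice.pdf', 'Invoice #001'));
```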

## Tags
`browser`, `pdf`, `screenshot`, `automation`, `scraping`

## Compatibility
- Codex: ✅
- Claude Code: ✅

Overview

This skill uses Puppeteer to automate Chrome/Chromium for tasks like PDF generation, screenshots, web scraping, and form automation. It targets Node.js/TypeScript projects and supports headless and headed browser flows for reliable, scriptable browser interaction. Use it to produce printable PDFs, capture visual regressions, extract page data, or automate repetitive web tasks.

How this skill works

The skill launches a Chrome/Chromium instance, creates pages, navigates to URLs or sets HTML content, and runs actions inside the page context. It exposes APIs for rendering PDFs (with margins, header/footer and page ranges), taking screenshots, evaluating DOM code for scraping, and interacting with forms and controls. You can run Puppeteer headless for CI or headed for debugging and visual verification.
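
The launch, page, action, and shutdown sequence described above can be sketched as a small wrapper. This is an illustration under stated assumptions, not code from the skill: `withPage` and `BrowserLike` are hypothetical names, and the structural type stands in for Puppeteer's `Browser` so the sketch type-checks on its own.

```typescript
// Minimal structural type; Puppeteer's Browser has both of these methods.
interface BrowserLike<P> {
  newPage(): Promise<P>;
  close(): Promise<void>;
}

// Hypothetical helper: runs `task` against a fresh page and always closes
// the browser afterwards.
async function withPage<P, T>(
  launch: () => Promise<BrowserLike<P>>,
  task: (page: P) => Promise<T>,
): Promise<T> {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    return await task(page);
  } finally {
    // Runs on success and on error, so no browser process is left behind.
    await browser.close();
  }
}

// Usage with Puppeteer (requires `import puppeteer from 'puppeteer'`):
//   const title = await withPage(() => puppeteer.launch(), async (page) => {
//     await page.goto('https://example.com');
//     return page.title();
//   });
```

Passing the launch function in, rather than calling `puppeteer.launch()` inside the helper, keeps the choice between headless and headed mode with the caller.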

When to use it

  • Generate printable PDFs from HTML or live pages for invoices, reports, or certificates.
  • Capture full-page or viewport screenshots for visual testing and monitoring.
  • Scrape dynamic content where server-side requests miss client-rendered data.
  • Automate form submissions, login flows, and end-to-end browser tests.
  • Render complex styles or JavaScript-driven pages before saving or processing.

Best practices

  • Reuse browser and page instances where possible to reduce startup overhead.
  • Use waitUntil options (networkidle0, networkidle2, or domcontentloaded) to ensure the page is ready before capture.
  • Run headless in CI and headed locally for debugging; enable slowMo or devtools when diagnosing flakiness.
  • Sanitize scraped data, throttle request frequency to avoid overloading target sites, and respect robots.txt and site terms.
  • Set explicit viewports, margins and printBackground when generating consistent PDFs or screenshots.
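
One way to limit scraping frequency is to visit URLs one at a time with a fixed pause between requests. The following is a minimal sketch: `scrapeSequentially` and `sleep` are hypothetical helpers, and the delay value is something to tune per target site. The `visit` callback is where a shared page would be reused instead of launching a new browser for every URL.

```typescript
// Resolve after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: visits URLs in order, pausing `delayMs` between visits.
async function scrapeSequentially<T>(
  urls: string[],
  visit: (url: string) => Promise<T>,
  delayMs: number,
): Promise<T[]> {
  const results: T[] = [];
  let first = true;
  for (const url of urls) {
    // No pause is needed before the first request.
    if (!first) await sleep(delayMs);
    first = false;
    results.push(await visit(url));
  }
  return results;
}

// Usage with one shared Puppeteer page (inside an async function):
//   const titles = await scrapeSequentially(urls, async (url) => {
//     await page.goto(url, { waitUntil: 'networkidle2' });
//     return page.title();
//   }, 1000);
```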

Example use cases

  • Generate A4 invoices with branded header/footer and print-ready margins from HTML templates.
  • Take scheduled screenshots of landing pages to monitor visual drift or availability.
  • Scrape product listings that are populated by client-side JavaScript frameworks.
  • Automate login and data entry for legacy systems without APIs to reduce manual work.
  • Run headless browser tests in CI to validate critical user flows and visual snapshots.

FAQ

Do I need a separate Chrome install?

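By default, installing the puppeteer package downloads a compatible browser build (Chrome for Testing in recent versions; older versions downloaded Chromium), so no separate install is required. To use a system Chrome instead, pass its location in the executablePath launch option.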
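
As a rough sketch of the executablePath option, the helper below builds launch options from an environment map. `resolveLaunchConfig`, `LaunchConfig`, and the `CHROME_PATH` variable name are hypothetical and chosen only for illustration; only `headless` and `executablePath` correspond to Puppeteer launch options.

```typescript
// Subset of Puppeteer launch options used in this sketch.
interface LaunchConfig {
  headless: boolean;
  executablePath?: string;
}

// Hypothetical helper: CHROME_PATH is an illustrative variable name,
// not a setting that Puppeteer reads itself.
function resolveLaunchConfig(env: Record<string, string | undefined>): LaunchConfig {
  const chromePath = env['CHROME_PATH']?.trim();
  // Without a path, Puppeteer uses the browser it downloaded at install time.
  return chromePath
    ? { headless: true, executablePath: chromePath }
    : { headless: true };
}

// Usage (requires `import puppeteer from 'puppeteer'`):
//   const browser = await puppeteer.launch(resolveLaunchConfig(process.env));
```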

How do I make PDFs include CSS backgrounds?

Set printBackground: true in the PDF options, and make sure styles have loaded before calling page.pdf, for example by passing a waitUntil option to page.setContent or page.goto.