home / skills / openclaw / skills / peekaboo

peekaboo skill

/skills/steipete/peekaboo

This skill automates macOS UI tasks with the Peekaboo CLI, enabling capture, targeting, and scripted interactions across apps.

npx playbooks add skill openclaw/skills --skill peekaboo

Review the files below or copy the command above to add this skill to your agents.

Files (2)
SKILL.md
5.6 KB
---
name: peekaboo
description: Capture and automate macOS UI with the Peekaboo CLI.
homepage: https://peekaboo.boo
metadata: {"clawdbot":{"emoji":"👀","os":["darwin"],"requires":{"bins":["peekaboo"]},"install":[{"id":"brew","kind":"brew","formula":"steipete/tap/peekaboo","bins":["peekaboo"],"label":"Install Peekaboo (brew)"}]}}
---

# Peekaboo

Peekaboo is a full macOS UI automation CLI: capture/inspect screens, target UI
elements, drive input, and manage apps/windows/menus. Commands share a snapshot
cache and support `--json`/`-j` for scripting. Run `peekaboo` or
`peekaboo <cmd> --help` for flags; `peekaboo --version` prints build metadata.
Tip: run via `polter peekaboo` to ensure fresh builds.

## Features (all CLI capabilities, excluding agent/MCP)

Core
- `bridge`: inspect Peekaboo Bridge host connectivity
- `capture`: live capture or video ingest + frame extraction
- `clean`: prune snapshot cache and temp files
- `config`: init/show/edit/validate, providers, models, credentials
- `image`: capture screenshots (screen/window/menu bar regions)
- `learn`: print the full agent guide + tool catalog
- `list`: apps, windows, screens, menubar, permissions
- `permissions`: check Screen Recording/Accessibility status
- `run`: execute `.peekaboo.json` scripts
- `sleep`: pause execution for a duration
- `tools`: list available tools with filtering/display options

Interaction
- `click`: target by ID/query/coords with smart waits
- `drag`: drag & drop across elements/coords/Dock
- `hotkey`: modifier combos like `cmd,shift,t`
- `move`: cursor positioning with optional smoothing
- `paste`: set clipboard -> paste -> restore
- `press`: special-key sequences with repeats
- `scroll`: directional scrolling (targeted + smooth)
- `swipe`: gesture-style drags between targets
- `type`: text + control keys (`--clear`, delays)

System
- `app`: launch/quit/relaunch/hide/unhide/switch/list apps
- `clipboard`: read/write clipboard (text/images/files)
- `dialog`: click/input/file/dismiss/list system dialogs
- `dock`: launch/right-click/hide/show/list Dock items
- `menu`: click/list application menus + menu extras
- `menubar`: list/click status bar items
- `open`: enhanced `open` with app targeting + JSON payloads
- `space`: list/switch/move-window (Spaces)
- `visualizer`: exercise Peekaboo visual feedback animations
- `window`: close/minimize/maximize/move/resize/focus/list

Vision
- `see`: annotated UI maps, snapshot IDs, optional analysis

Global runtime flags
- `--json`/`-j`, `--verbose`/`-v`, `--log-level <level>`
- `--no-remote`, `--bridge-socket <path>`

## Quickstart (happy path)
```bash
peekaboo permissions
peekaboo list apps --json
peekaboo see --annotate --path /tmp/peekaboo-see.png
peekaboo click --on B1
peekaboo type "Hello" --return
```

## Common targeting parameters (most interaction commands)
- App/window: `--app`, `--pid`, `--window-title`, `--window-id`, `--window-index`
- Snapshot targeting: `--snapshot` (ID from `see`; defaults to latest)
- Element/coords: `--on`/`--id` (element ID), `--coords x,y`
- Focus control: `--no-auto-focus`, `--space-switch`, `--bring-to-current-space`,
  `--focus-timeout-seconds`, `--focus-retry-count`

## Common capture parameters
- Output: `--path`, `--format png|jpg`, `--retina`
- Targeting: `--mode screen|window|frontmost`, `--screen-index`,
  `--window-title`, `--window-id`
- Analysis: `--analyze "prompt"`, `--annotate`
- Capture engine: `--capture-engine auto|classic|cg|modern|sckit`

## Common motion/typing parameters
- Timing: `--duration` (drag/swipe), `--steps`, `--delay` (type/scroll/press)
- Human-ish movement: `--profile human|linear`, `--wpm` (typing)
- Scroll: `--direction up|down|left|right`, `--amount <ticks>`, `--smooth`

## Examples
### See -> click -> type (most reliable flow)
```bash
peekaboo see --app Safari --window-title "Login" --annotate --path /tmp/see.png
peekaboo click --on B3 --app Safari
peekaboo type "[email protected]" --app Safari
peekaboo press tab --count 1 --app Safari
peekaboo type "supersecret" --app Safari --return
```

### Target by window id
```bash
peekaboo list windows --app "Visual Studio Code" --json
peekaboo click --window-id 12345 --coords 120,160
peekaboo type "Hello from Peekaboo" --window-id 12345
```

### Capture screenshots + analyze
```bash
peekaboo image --mode screen --screen-index 0 --retina --path /tmp/screen.png
peekaboo image --app Safari --window-title "Dashboard" --analyze "Summarize KPIs"
peekaboo see --mode screen --screen-index 0 --analyze "Summarize the dashboard"
```

### Live capture (motion-aware)
```bash
peekaboo capture live --mode region --region 100,100,800,600 --duration 30 \
  --active-fps 8 --idle-fps 2 --highlight-changes --path /tmp/capture
```

### App + window management
```bash
peekaboo app launch "Safari" --open https://example.com
peekaboo window focus --app Safari --window-title "Example"
peekaboo window set-bounds --app Safari --x 50 --y 50 --width 1200 --height 800
peekaboo app quit --app Safari
```

### Menus, menubar, dock
```bash
peekaboo menu click --app Safari --item "New Window"
peekaboo menu click --app TextEdit --path "Format > Font > Show Fonts"
peekaboo menu click-extra --title "WiFi"
peekaboo dock launch Safari
peekaboo menubar list --json
```

### Mouse + gesture input
```bash
peekaboo move 500,300 --smooth
peekaboo drag --from B1 --to T2
peekaboo swipe --from-coords 100,500 --to-coords 100,200 --duration 800
peekaboo scroll --direction down --amount 6 --smooth
```

### Keyboard input
```bash
peekaboo hotkey --keys "cmd,shift,t"
peekaboo press escape
peekaboo type "Line 1\nLine 2" --delay 10
```

Notes
- Requires Screen Recording + Accessibility permissions.
- Use `peekaboo see --annotate` to identify targets before clicking.

Overview

This skill provides a macOS UI automation CLI that captures and drives the UI using Peekaboo. It lets you inspect screens, target UI elements, control mouse/keyboard gestures, and manage apps, windows, menus and clipboard for reliable automation. Outputs JSON for scripting and shares a snapshot cache to chain commands.

How this skill works

Peekaboo captures screen or video frames to build snapshots and annotated UI maps, then targets elements by ID, query or coordinates. Interaction commands (click, type, drag, swipe, hotkey, etc.) operate against snapshots or live targets and include smart waits, focus controls, and motion smoothing. Most commands accept --json for machine-readable output and common flags to control targeting, timing, and capture behavior.

When to use it

  • Automating repetitive macOS UI tasks across applications and windows.
  • End-to-end testing that needs visual targeting or interaction with system menus and dialogs.
  • Scripted workflows that require screenshot capture, visual analysis, or annotated UI maps.
  • Remote or CI automation that must validate UI state before performing input.
  • Creating reproducible sequences for demos, onboarding, or troubleshooting GUI issues.

Best practices

  • Run peekaboo see --annotate to identify element IDs before issuing click/type commands.
  • Use --snapshot to target a stable captured frame when working with dynamic content.
  • Enable Screen Recording and Accessibility permissions before running automation.
  • Prefer element IDs or queries over raw coordinates for resilience across resolutions.
  • Combine --json output with scripts to validate responses and retry on transient failures.

Example use cases

  • Log into a web app: capture the login window, click the username field, type credentials, press return.
  • Automated screenshot collection: capture screen or specific windows with --retina and annotate for QA.
  • Move and resize windows by listing windows, selecting by ID, then setting bounds for consistent layouts.
  • Automated menu interactions: list application menus and click deep menu paths like Format > Font > Show Fonts.
  • Live capture of a region to extract frames, highlight changes, and save motion-aware recordings.

FAQ

What permissions are required to use Peekaboo?

Peekaboo needs Screen Recording and Accessibility permissions enabled in System Preferences to capture the UI and send input events.

How do I target a specific element reliably?

Use peekaboo see --annotate to get element IDs, then use --on or --id with snapshot targeting; prefer IDs/queries over coordinates.