This skill enables deterministic web automation with a headless browser, using accessibility tree refs for reliable interactions and isolated sessions.
To add this skill to your agents: `npx playbooks add skill supercent-io/skills-template --skill agent-browser`
---
name: agent-browser
description: Fast headless browser CLI for AI agents. Supports deterministic element selection via accessibility tree snapshots and refs (@e1, @e2).
allowed-tools: [Read, Write, Bash, Grep, Glob]
tags: [browser-automation, headless-browser, ai-agent, playwright, web-scraping]
platforms: [Claude, Gemini, Codex, ChatGPT]
version: 1.0.0
source: vercel-labs/agent-browser
---
# agent-browser - Headless Browser for AI Agents
## When to use this skill
- Web automation and E2E testing
- Scraping data from modern web apps
- Deterministic element interaction using accessibility tree refs
- Isolated browser sessions for different agent tasks
---
## 1. Installation
```bash
npx skills add vercel-labs/agent-browser
# or
npm install -g agent-browser
agent-browser install
```
---
## 2. Core Workflow (Deterministic Interaction)
AI agents should use the snapshot + ref workflow for best results:
1. **Navigate**: `agent-browser open <url>`
2. **Snapshot**: `agent-browser snapshot -i` (Returns tree with refs like @e1, @e2)
3. **Interact**: `agent-browser click @e1` or `agent-browser fill @e2 "text"`
4. **Repeat**: Take a new snapshot whenever the page changes, then interact using the refs it returns
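
The loop above can be sketched as a small shell helper that runs one interaction and then immediately re-snapshots. This is a minimal sketch, not part of agent-browser: `act_and_snapshot` is a hypothetical name, and the refs in the usage comments are placeholders that would come from your own snapshot output.

```bash
# Sketch: run one interaction, then re-snapshot so the next step uses fresh refs.
# act_and_snapshot is a hypothetical helper name, not an agent-browser command.
act_and_snapshot() {
  # "$@" is the interaction, e.g. `click @e1` or `fill @e2 "text"`.
  agent-browser "$@" || return 1
  # Refresh the refs in case the interaction changed the page.
  agent-browser snapshot -i
}

# Usage (refs are placeholders; real ones come from your own snapshot):
#   agent-browser open example.com
#   agent-browser snapshot -i
#   act_and_snapshot fill @e2 "text"
#   act_and_snapshot click @e1
```

Stopping on a failed interaction avoids snapshotting a page that is in an unknown state.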
---
## 3. Key Commands
| Command | Description |
|---------|-------------|
| `open <url>` | Navigate to a URL |
| `snapshot` | Get accessibility tree with refs |
| `click <sel>` | Click element (by ref or CSS) |
| `fill <sel> <text>` | Clear and fill input |
| `screenshot [path]` | Take page screenshot |
| `close` | Quit browser session |
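
As an illustration of how the commands in the table chain together, the sketch below opens a page, saves a screenshot, and always closes the session afterwards. `capture_page` is a hypothetical helper name, and the URL and file path in the usage comment are placeholders.

```bash
# Sketch: open a URL, save a screenshot, then close the browser session.
# capture_page is a hypothetical helper; it only chains commands from the table.
capture_page() {
  local url=$1
  local out_file=$2
  agent-browser open "$url" || return 1
  agent-browser screenshot "$out_file"
  local rc=$?
  # Close the session even if the screenshot failed.
  agent-browser close
  return "$rc"
}

# Usage:
#   capture_page example.com page.png
```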
---
## 4. Advanced Features
- **Isolated Sessions**: Use `--session <name>` to isolate cookies/storage.
- **Persistent Profiles**: Use `--profile <path>` to persist login sessions.
- **Semantic Locators**: `find role button click --name "Submit"`
- **JavaScript Execution**: `eval "window.scrollTo(0, 100)"`
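
A possible way to keep per-task sessions apart is a thin wrapper that adds `--session` to every call. This is a sketch under one assumption: that `--session <name>` is accepted as a global flag before the subcommand. `in_session` and the session names are hypothetical.

```bash
# Sketch: prefix any agent-browser subcommand with a named session.
# in_session is a hypothetical wrapper; it assumes --session may precede
# the subcommand.
in_session() {
  local name=$1
  shift
  agent-browser --session "$name" "$@"
}

# Usage: two tasks on the same site with separate cookies/storage.
#   in_session checkout open example.com
#   in_session checkout snapshot -i
#   in_session admin open example.com
```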
---
## Quick Reference
```bash
# Optimal AI Workflow
agent-browser open example.com
agent-browser snapshot -i --json
# (AI parses refs)
agent-browser click @e2
```
This skill provides a fast headless browser CLI tailored for AI agents, focusing on deterministic element selection using accessibility tree snapshots and refs (e.g., @e1). It enables isolated browser sessions, persistent profiles, and a concise command set for navigation, interaction, and inspection. The design prioritizes reproducible automation across modern web apps and single-page applications.
Agents operate with a snapshot + ref workflow: open a page, capture an accessibility-tree snapshot that returns stable refs, then invoke interactions (click, fill) by ref or CSS. Sessions can be isolated with named sessions and persistent profiles to retain logins across runs. Additional commands allow semantic queries (role/name), screenshots, and running arbitrary JavaScript for advanced control.
How do refs like @e1 stay deterministic?
Refs are assigned from an accessibility-tree snapshot and identify accessibility nodes rather than volatile DOM paths, so interactions do not depend on layout or markup details. Take a new snapshot after the page changes so the refs you use match the current tree.
Can I persist login state between runs?
Yes. Use `--profile <path>` to store and reuse browser profile data so that login sessions, cookies, and local storage persist across agent runs.
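
For example, reusing one profile directory across runs might look like the sketch below. It assumes `--profile <path>` can be placed before the subcommand; `with_profile` and the profile path are hypothetical.

```bash
# Sketch: run any subcommand against a persistent profile directory.
# with_profile is a hypothetical wrapper; it assumes --profile may precede
# the subcommand.
with_profile() {
  local profile_dir=$1
  shift
  agent-browser --profile "$profile_dir" "$@"
}

# First run: open the site and log in through the usual snapshot + ref steps.
#   with_profile "$HOME/.myapp-profile" open example.com
# Later run: pass the same path to reuse the stored login state.
#   with_profile "$HOME/.myapp-profile" open example.com
```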