home / skills / mitsuhiko / agent-stuff / web-browser
This skill allows you to automate web page exploration by controlling a browser to navigate, interact, and extract data efficiently.
npx playbooks add skill mitsuhiko/agent-stuff --skill web-browserReview the files below or copy the command above to add this skill to your agents.
---
name: web-browser
description: "Allows to interact with web pages by performing actions such as clicking buttons, filling out forms, and navigating links. It works by remote controlling Google Chrome or Chromium browsers using the Chrome DevTools Protocol (CDP). When Claude needs to browse the web, it can use this skill to do so."
license: Stolen from Mario
---
# Web Browser Skill
Minimal CDP tools for collaborative site exploration.
## Start Chrome
```bash
./scripts/start.js # Fresh profile
./scripts/start.js --profile # Copy your profile (cookies, logins)
```
Start Chrome on `:9222` with remote debugging.
## Navigate
```bash
./scripts/nav.js https://example.com
./scripts/nav.js https://example.com --new
```
Navigate current tab or open new tab.
## Evaluate JavaScript
```bash
./scripts/eval.js 'document.title'
./scripts/eval.js 'document.querySelectorAll("a").length'
./scripts/eval.js 'JSON.stringify(Array.from(document.querySelectorAll("a")).map(a => ({ text: a.textContent.trim(), href: a.href })).filter(link => !link.href.startsWith("https://")))'
```
Execute JavaScript in active tab (async context). Be careful with string escaping, best to use single quotes.
## Screenshot
```bash
./scripts/screenshot.js
```
Screenshot current viewport, returns temp file path
## Pick Elements
```bash
./scripts/pick.js "Click the submit button"
```
Interactive element picker. Click to select, Cmd/Ctrl+Click for multi-select, Enter to finish.
## Dismiss Cookie Dialogs
```bash
./scripts/dismiss-cookies.js # Accept cookies
./scripts/dismiss-cookies.js --reject # Reject cookies (where possible)
```
Automatically dismisses EU cookie consent dialogs.
Run after navigating to a page:
```bash
./scripts/nav.js https://example.com && ./scripts/dismiss-cookies.js
```
## Background Logging (Console + Errors + Network)
Automatically started by `start.js` and writes JSONL logs to:
```
~/.cache/agent-web/logs/YYYY-MM-DD/<targetId>.jsonl
```
Manually start:
```bash
./scripts/watch.js
```
Tail latest log:
```bash
./scripts/logs-tail.js # dump current log and exit
./scripts/logs-tail.js --follow # keep following
```
Summarize network responses:
```bash
./scripts/net-summary.js
```
This skill provides a lightweight web-browser controller that remote-controls Google Chrome/Chromium via the Chrome DevTools Protocol (CDP). It enables automated navigation, element interaction, JavaScript evaluation, screenshots, and basic logging for collaborative site exploration. The tool is designed for agents to browse, inspect, and interact with pages programmatically.
The skill launches Chrome with remote debugging enabled and connects over CDP to control tabs, execute scripts, click elements, and capture screenshots. It exposes commands to navigate pages, evaluate arbitrary JavaScript in the page context, pick DOM elements interactively, dismiss common cookie dialogs, and record console/network logs. Logs and network summaries are written to JSONL for later analysis.
Do I need Chrome installed to use this skill?
Yes — the skill controls Google Chrome or Chromium via CDP, so a compatible browser must be installed and launched with remote debugging.
How are logs stored and accessed?
Background logs are saved as JSONL files in a cache directory by date and target ID; helper scripts allow tailing or summarizing network responses.