This skill automates browser tasks using minimal CDP tools to start Chrome, navigate pages, run scripts, take screenshots, and extract DOM data.
Add this skill to your agents with `npx playbooks add skill iamzhihuix/happy-claude-skills --skill browser`.
---
name: browser
description: Minimal Chrome DevTools Protocol tools for browser automation and scraping. Use when you need to start Chrome, navigate pages, execute JavaScript, take screenshots, or interactively pick DOM elements. Triggers include "browse website", "scrape page", "take screenshot", "automate browser", "extract DOM", "web scraping".
---
# Browser Tools
Minimal CDP tools for collaborative site exploration and scraping.
**Credits**: Based on [Mario Zechner](https://mariozechner.at)'s article [What if you don't need MCP?](https://mariozechner.at/posts/2025-11-02-what-if-you-dont-need-mcp/), adapted from [Factory.ai](https://docs.factory.ai/guides/skills/browser).
## Setup
Before first use, install dependencies:
```bash
npm install --prefix skills/browser
```
## Start Chrome
```bash
./skills/browser/scripts/start.js # Fresh profile
./skills/browser/scripts/start.js --profile # Copy your profile (cookies, logins)
```
Start Chrome on `:9222` with remote debugging.
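To confirm that Chrome is listening before running the other tools, one option is to query the DevTools HTTP endpoint directly. The sketch below is not part of the skill: `versionEndpoint` and `isChromeListening` are hypothetical helper names, and it assumes Node 18 or later (for the global `fetch`) and that Chrome is reachable on `127.0.0.1`.

```javascript
// Hypothetical helpers, not part of the skill's scripts.
// Chrome's remote-debugging port serves browser metadata as JSON at /json/version.
function versionEndpoint(port = 9222, host = "127.0.0.1") {
  return `http://${host}:${port}/json/version`;
}

// Resolves to true when the endpoint answers with an HTTP 2xx status,
// false on any network error (for example, connection refused).
// Assumes Node 18+, where fetch is a global.
async function isChromeListening(port = 9222) {
  try {
    const res = await fetch(versionEndpoint(port));
    return res.ok;
  } catch {
    return false;
  }
}
```

A caller might use it as `isChromeListening().then((up) => console.log(up ? "Chrome is up" : "run start.js first"))`.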
## Navigate
```bash
./skills/browser/scripts/nav.js https://example.com
./skills/browser/scripts/nav.js https://example.com --new
```
Navigate current tab or open new tab.
## Evaluate JavaScript
```bash
./skills/browser/scripts/eval.js 'document.title'
./skills/browser/scripts/eval.js 'document.querySelectorAll("a").length'
```
Execute JavaScript in active tab (async context).
**IMPORTANT**: The code must be a single expression or use IIFE for multiple statements:
- Single expression: `'document.title'`
- Multiple statements: `'(() => { const x = 1; return x + 1; })()'`
- Avoid newlines in the code string - keep it on one line
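The formatting rules above can be applied mechanically. A minimal sketch, assuming `eval.js` accepts anything that parses as one expression: `isSingleExpression` and `toEvalArg` are hypothetical helpers (not part of the skill) that collapse a snippet onto one line and wrap it in an IIFE only when it is not already a single expression. Line comments (`// ...`) in the input would swallow the rest of the collapsed line, so strip them first.

```javascript
// Hypothetical helpers, not part of the skill's scripts.
const AsyncFunction = (async () => {}).constructor;

// True when `code` parses as one expression inside an async function body.
// The function is only compiled here, never called.
function isSingleExpression(code) {
  try {
    new AsyncFunction(`return (${code}\n)`);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Collapses the snippet to one line, then wraps it in an IIFE if needed.
function toEvalArg(source) {
  const oneLine = source
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean)
    .join(" ");
  return isSingleExpression(oneLine) ? oneLine : `(() => { ${oneLine} })()`;
}

console.log(toEvalArg("document.title"));
// prints: document.title
console.log(toEvalArg("const x = 1;\nreturn x + 1;"));
// prints: (() => { const x = 1; return x + 1; })()
```

The wrapper uses a plain arrow function to match the form shown above; a snippet that needs `await` between statements would need an `async` arrow instead.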
## Screenshot
```bash
./skills/browser/scripts/screenshot.js
```
Screenshot the current viewport; returns the path to a temp file.
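Because the screenshot lands in a temp location, it may be worth copying it somewhere stable before it is cleaned up. The sketch below assumes that `screenshot.js` prints only the file path on stdout; `keepScreenshot` is a hypothetical helper, not part of the skill.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper, not part of the skill's scripts.
// `stdout` is assumed to be the text printed by screenshot.js: a single file path.
function keepScreenshot(stdout, destDir) {
  const tempPath = String(stdout).trim();
  if (!fs.existsSync(tempPath)) {
    throw new Error(`screenshot not found: ${tempPath}`);
  }
  // Create the destination directory if needed, then copy under the same file name.
  fs.mkdirSync(destDir, { recursive: true });
  const dest = path.join(destDir, path.basename(tempPath));
  fs.copyFileSync(tempPath, dest);
  return dest;
}
```

The return value is the path of the copy, so it can be logged or attached to a report.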
## Pick Elements
```bash
./skills/browser/scripts/pick.js "Click the submit button"
```
Interactive element picker. Click to select, Cmd/Ctrl+Click for multi-select, Enter to finish.
## Workflow
1. **Start Chrome** with `start.js --profile` to mirror your authenticated state.
2. **Drive navigation** via `nav.js https://target.app` or open secondary tabs with `--new`.
3. **Inspect the DOM** using `eval.js` for quick counts, attribute checks, or extracting JSON payloads.
4. **Capture artifacts** with `screenshot.js` for visual proof or `pick.js` when you need precise selectors or text snapshots.
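Steps 2 to 4 above can be chained from a small Node script. This is a sketch under stated assumptions, not the skill's own driver: `workflowSteps` and `runWorkflow` are hypothetical names, Chrome is assumed to have been started already with `start.js`, and each script is assumed to exit when done and print its result to stdout.

```javascript
const { execFileSync } = require("node:child_process");

// Script location used throughout this document.
const SCRIPTS = "./skills/browser/scripts";

// Hypothetical helper: builds the command line for each step after Chrome is started.
function workflowSteps(url, expression) {
  return [
    [`${SCRIPTS}/nav.js`, url],
    [`${SCRIPTS}/eval.js`, expression],
    [`${SCRIPTS}/screenshot.js`],
  ];
}

// Runs the steps in order and returns each script's trimmed stdout.
// execFileSync throws if a script exits with a non-zero status,
// so a failed navigation stops the run before the later steps.
function runWorkflow(url, expression) {
  return workflowSteps(url, expression).map(([script, ...args]) =>
    execFileSync(script, args, { encoding: "utf8" }).trim()
  );
}
```

For example, `runWorkflow("https://example.com", 'document.querySelectorAll("a").length')` would return three strings: the output of `nav.js`, the evaluated result, and the screenshot path. Passing arguments as an array avoids shell quoting problems with the JavaScript expression.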
## Usage Notes
- Start Chrome before using the other tools
- The `--profile` flag copies your actual Chrome profile, so existing cookies and logins are available in the started browser
- JavaScript evaluation runs in an async context in the page
- Pick tool allows you to visually select DOM elements by clicking on them
This skill provides minimal Chrome DevTools Protocol (CDP) tools for browser automation, interactive DOM picking, scraping, and screenshots. It is designed for quick setup and hands-on exploration when you need to start Chrome with remote debugging, run page JavaScript, or capture page artifacts. Use it to drive navigation, extract content, and collect visual evidence from a lightweight CLI.
The tools start a Chrome instance with remote debugging enabled and connect via CDP to control tabs. You can navigate pages, evaluate JavaScript expressions in the page context, take viewport screenshots, and interactively pick DOM elements by clicking on the page. JavaScript evaluation runs in an async page context and supports single-expression or IIFE-style code passed as a one-line string.
## FAQ
**How should I format JavaScript passed to the eval tool?**
Provide a single expression, or wrap multiple statements in an immediately invoked function expression (IIFE) on one line, e.g. `'(() => { const x = 1; return x + 1; })()'`.

**What does the `--profile` flag do when starting Chrome?**
It launches Chrome with a copy of your existing profile so cookies, logins, and other session state are available in the started browser.

**Can I pick multiple elements with the picker?**
Yes. Use Cmd/Ctrl+Click to multi-select elements and press Enter to finish and return the selection.