browser skill

This skill automates browser tasks using Chrome DevTools Protocol to launch Chrome, navigate pages, run scripts, capture screenshots, and select elements.

npx playbooks add skill cexll/myclaude --skill browser

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: browser
description: This skill should be used for browser automation tasks using Chrome DevTools Protocol (CDP). Triggers when users need to launch Chrome with remote debugging, navigate pages, execute JavaScript in browser context, capture screenshots, or interactively select DOM elements. No MCP server required.
---

# Browser Automation

Minimal Chrome DevTools Protocol (CDP) helpers for browser automation without MCP server setup.

## Setup

Install dependencies before first use:

```bash
npm install --prefix ~/.claude/skills/browser/browser ws
```

## Scripts

All scripts connect to Chrome on `localhost:9222`.

### start.js - Launch Chrome

```bash
scripts/start.js              # Fresh profile
scripts/start.js --profile    # Use persistent profile (keeps cookies/auth)
```

### nav.js - Navigate

```bash
scripts/nav.js https://example.com        # Navigate current tab
scripts/nav.js https://example.com --new  # Open in new tab
```

### eval.js - Execute JavaScript

```bash
scripts/eval.js 'document.title'
scripts/eval.js '(() => { const x = 1; return x + 1; })()'
```

Pass a single expression, or wrap multiple statements in an IIFE so that its return value becomes the result.
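To illustrate what an eval-style helper does on the wire, here is a minimal sketch of a CDP `Runtime.evaluate` command and of unwrapping its reply. It assumes the helper uses `Runtime.evaluate` with `returnByValue`, which the source does not confirm, and the function names `buildEvaluateCommand` and `unwrapEvaluateReply` are hypothetical.

```javascript
// Sketch only: the kind of message an eval-style helper might send over CDP.
// Function names are illustrative assumptions, not the skill's actual code.

// Build a Runtime.evaluate command. `id` lets the caller match the reply.
function buildEvaluateCommand(id, expression) {
  return JSON.stringify({
    id,
    method: "Runtime.evaluate",
    params: {
      expression,
      returnByValue: true, // ask for a JSON value instead of a remote object handle
      awaitPromise: true, // wait for a returned promise to settle before replying
    },
  });
}

// Extract the evaluated value from a reply, or throw if something went wrong.
function unwrapEvaluateReply(rawMessage) {
  const message = JSON.parse(rawMessage);
  if (message.error) {
    throw new Error(`CDP error: ${message.error.message}`);
  }
  const { result, exceptionDetails } = message.result;
  if (exceptionDetails) {
    throw new Error(`Page threw: ${exceptionDetails.text}`);
  }
  return result.value;
}
```

Under this assumption the expression runs in the page's global scope, so a bare `return` is a syntax error and top-level `const` declarations can collide between calls. An IIFE avoids both problems.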

### screenshot.js - Capture Screenshot

```bash
scripts/screenshot.js
```

Returns `{ path, filename }` for the PNG saved to a temp directory.
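As a rough sketch of how such a result could be produced: CDP's `Page.captureScreenshot` replies with base64-encoded image data, which a helper can decode and write to the OS temp directory. The function name `saveScreenshot` and the file-name pattern are assumptions for illustration; the skill's own script may name files differently.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Sketch only: save the base64 payload of a Page.captureScreenshot reply
// as a PNG in the OS temp directory. The file-name pattern is an assumption.
function saveScreenshot(base64Data, now = new Date()) {
  const png = Buffer.from(base64Data, "base64");

  // Every PNG file starts with the same 8-byte signature; reject anything else.
  const signature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (png.length < signature.length || !png.subarray(0, 8).equals(signature)) {
    throw new Error("Payload is not a PNG image");
  }

  // Colons and dots are replaced so the timestamp is safe in file names.
  const filename = `screenshot-${now.toISOString().replace(/[:.]/g, "-")}.png`;
  const filePath = path.join(os.tmpdir(), filename);
  fs.writeFileSync(filePath, png);
  return { path: filePath, filename };
}
```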

### pick.js - Visual Element Picker

```bash
scripts/pick.js "Click the submit button"
```

Returns element metadata: tag, id, classes, text, href, selector, rect.
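A follow-up script might turn the picker's `rect` into click coordinates. The sketch below assumes `rect` carries DOMRect-style `x`, `y`, `width`, and `height` fields in CSS pixels. The source lists `rect` without describing its shape, so treat the field names and the `clickPointFor` helper as hypothetical.

```javascript
// Sketch only: compute the centre of a picked element for a later click.
// Assumes rect = { x, y, width, height } in CSS pixels (not confirmed by the skill).
function clickPointFor(element) {
  const { x, y, width, height } = element.rect;
  if (width <= 0 || height <= 0) {
    throw new Error(`Element ${element.selector} has no visible area`);
  }
  return {
    x: Math.round(x + width / 2),
    y: Math.round(y + height / 2),
  };
}
```

The resulting point could then be passed to a CDP input command such as `Input.dispatchMouseEvent`, which takes `x` and `y` viewport coordinates.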

## Workflow

1. Launch Chrome: `scripts/start.js --profile` for authenticated sessions
2. Navigate: `scripts/nav.js <url>`
3. Inspect: `scripts/eval.js 'document.querySelector(...)'`
4. Capture: `scripts/screenshot.js` or `scripts/pick.js`
5. Return gathered data

## Key Points

- All operations run locally - credentials never leave the machine
- Use `--profile` flag to preserve cookies and auth tokens
- Scripts return structured JSON for agent consumption

Overview

This skill automates Chrome via the Chrome DevTools Protocol (CDP) for local browser tasks like launching Chrome with remote debugging, navigating pages, running in-browser JavaScript, capturing screenshots, and selecting DOM elements visually. It is designed to run entirely on the local machine with no MCP server required and returns structured JSON suited for multi-agent workflows. Use it when you need reliable, scriptable browser interactions with preserved sessions via a persistent profile.

How this skill works

The skill starts a Chrome instance listening on localhost:9222 and connects over WebSocket to CDP. It exposes small scripts to start Chrome (fresh or persistent profile), navigate tabs, evaluate expressions in page context, capture screenshots, and run a visual element picker that returns DOM metadata. Each script communicates with Chrome directly, runs the requested action, and returns structured JSON describing results such as file paths, element selectors, or evaluation values.
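The connection step described above can be sketched as follows. When Chrome runs with remote debugging enabled, it serves a list of open targets at `/json/list`, each with a `webSocketDebuggerUrl`. The helper names `pickPageTarget` and `findPageWebSocketUrl` are hypothetical, and the sketch assumes Node 18 or later for the global `fetch`. The skill's own scripts may discover targets differently.

```javascript
// Sketch only: discover a tab to attach to via Chrome's remote-debugging
// HTTP endpoint. Function names are illustrative, not the skill's own.

// Pick the first regular page (tab) from a /json/list response.
function pickPageTarget(targets) {
  const page = targets.find(
    (t) => t.type === "page" && typeof t.webSocketDebuggerUrl === "string"
  );
  if (!page) {
    throw new Error("No debuggable page target found");
  }
  return page;
}

// Ask Chrome for its open targets and return the WebSocket URL of one tab.
async function findPageWebSocketUrl(port = 9222) {
  const response = await fetch(`http://localhost:${port}/json/list`);
  if (!response.ok) {
    throw new Error(`Chrome responded with HTTP ${response.status}`);
  }
  const targets = await response.json();
  return pickPageTarget(targets).webSocketDebuggerUrl;
}
```

The returned URL is what a WebSocket client, such as the `ws` package installed during setup, would connect to before sending CDP commands as JSON messages.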

When to use it

  • Automate site navigation and data retrieval where real browser rendering is required
  • Run JavaScript in page context to read or mutate DOM, cookies, or localStorage
  • Capture screenshots of pages or specific states for reporting and visual regression
  • Select UI elements interactively to obtain robust CSS selectors and element metadata
  • Preserve authenticated sessions using a persistent profile to access protected content

Best practices

  • Start Chrome with --profile when you need to reuse cookies or auth tokens between runs
  • Pass the eval script a single expression, or wrap multi-statement code in an IIFE, so that a value is actually returned
  • Keep Chrome and the skill dependencies up to date to avoid CDP protocol mismatches
  • Validate returned selectors and rects from the picker in a headless or CI environment before automating clicks
  • Copy screenshots out of the returned temp path promptly if you need to keep them long term, since temp files can be cleaned up by the OS

Example use cases

  • Log into a web app once with --profile and automate authenticated scraping or form submissions
  • Run eval.js to extract page metadata (title, meta tags, or computed values) and feed those into downstream agents
  • Use screenshot.js to capture visual evidence for bug reports or QA checks during CI runs
  • Use pick.js to identify a stable selector for an automated click or fill action in a follow-up script
  • Open multiple tabs with nav.js --new to parallelize inspection of pages before aggregating results

FAQ

Do credentials ever leave my machine when using this skill?

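No, not through the scripts. Every script connects only to the local Chrome instance on localhost:9222, so the skill itself sends no credentials or cookies to an external service. Chrome still sends cookies to the sites you visit, as in any normal browsing session.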

How do I preserve login sessions between runs?

Launch Chrome with the --profile flag to use a persistent profile directory that retains cookies, localStorage, and other session state, including auth tokens.

What format do scripts return results in?

Scripts return structured JSON objects containing values such as evaluation results, element metadata (tag, id, classes, selector, rect), and screenshot {path, filename} for easy agent consumption.