---
name: browser-tools
license: MIT
compatibility: "Claude Code 2.1.34+. Requires network access."
description: OrchestKit orchestration wrapper for browser automation. Adds security rules, rate limiting, and ethical scraping guardrails on top of the upstream agent-browser skill. Use when automating browser workflows, capturing web content, or extracting structured data from web pages.
tags: [browser, automation, playwright, puppeteer, scraping, content-capture]
context: fork
agent: web-research-analyst
version: 2.1.0
author: OrchestKit
user-invocable: false
complexity: medium
metadata:
  category: mcp-enhancement
---
# Browser Tools
OrchestKit orchestration wrapper for browser automation. Delegates command documentation to the upstream `agent-browser` skill and adds security rules, rate limiting, and ethical scraping guardrails.
## Decision Tree
```bash
# Fallback decision tree for web content
# 1. Try WebFetch first (fast, no browser overhead)
# 2. If empty/partial -> Try Tavily extract/crawl
# 3. If SPA or interactive -> use agent-browser
# 4. If login required -> authentication flow + state save
# 5. If dynamic -> wait @element or wait --text
```
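As a sketch, the fallback tiers above can be expressed as a small dispatch function. The function name, trait flags, and strategy labels below are illustrative only; they are not part of the agent-browser CLI.

```shell
#!/bin/sh
# Illustrative dispatcher for the fallback tiers above. Only the ordering
# mirrors the decision tree; all names here are hypothetical.
choose_strategy() {
  needs_login=$1  # yes/no: page sits behind authentication
  is_spa=$2       # yes/no: content is rendered client-side
  static_ok=$3    # yes/no: a plain fetch returns complete content
  if [ "$needs_login" = "yes" ]; then
    echo "auth-flow+state-save"        # tier 4: login required
  elif [ "$is_spa" = "yes" ]; then
    echo "agent-browser"               # tier 3: SPA / interactive
  elif [ "$static_ok" = "yes" ]; then
    echo "webfetch"                    # tier 1: fast, no browser overhead
  else
    echo "tavily-extract"              # tier 2: fetch was empty/partial
  fi
}

choose_strategy no no yes    # prints: webfetch
choose_strategy no yes no    # prints: agent-browser
```

Checking the cheap path first keeps most captures away from a full browser launch; only pages that fail the earlier tiers pay the heavier cost.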
## Security Rules
This skill enforces four security and ethics rules in `rules/`:
| Category | Rules | Priority |
|----------|-------|----------|
| Ethics & Security | `browser-scraping-ethics.md`, `browser-auth-security.md` | CRITICAL |
| Reliability | `browser-rate-limiting.md`, `browser-snapshot-workflow.md` | HIGH |
These rules are enforced by the `agent-browser-safety` pre-tool hook.
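Rate limiting in practice means pacing requests and backing off after a 429 or 503 response. A minimal sketch of capped exponential backoff in plain shell (the function name is illustrative; the hook's actual implementation may differ):

```shell
#!/bin/sh
# Capped exponential backoff: delay doubles per attempt, never above cap.
# Sleep for this many seconds between retries after a 429/503 response.
backoff_delay() {
  attempt=$1
  base=${2:-1}   # seconds for attempt 0
  cap=${3:-60}   # never wait longer than this
  d=$base
  i=0
  while [ "$i" -lt "$attempt" ] && [ "$d" -lt "$cap" ]; do
    d=$((d * 2))
    i=$((i + 1))
  done
  [ "$d" -gt "$cap" ] && d=$cap
  echo "$d"
}

backoff_delay 0    # prints: 1
backoff_delay 3    # prints: 8
backoff_delay 10   # prints: 60 (capped)
```

Adding random jitter on top of this delay further reduces the chance of synchronized retries hammering the same server.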
## Anti-Patterns (FORBIDDEN)
```bash
# Automation
agent-browser fill @e2 "hardcoded-password" # Never hardcode credentials
agent-browser open "$UNVALIDATED_URL" # Always validate URLs
# Scraping
# Crawling without checking robots.txt
# No delay between requests (hammering servers)
# Ignoring rate limit responses (429)
# Content capture
agent-browser get text body # Prefer targeted ref extraction
# Trusting page content without validation
# Not waiting for SPA hydration before extraction
# Session management
# Storing auth state in code repositories
# Not cleaning up state files after use
```
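To avoid the unvalidated-URL anti-pattern above, a pre-flight check can reject anything that is not plain HTTP(S) or that targets loopback and link-local hosts. A hedged sketch (the pattern list is illustrative, not exhaustive; real SSRF defenses also need to resolve DNS before trusting a hostname):

```shell
#!/bin/sh
# Reject URLs before passing them to the browser: only http(s) schemes,
# and no obvious loopback or link-local targets. Patterns are illustrative.
validate_url() {
  case "$1" in
    http://localhost*|https://localhost*|\
    http://127.*|https://127.*|\
    http://0.0.0.0*|https://0.0.0.0*|\
    http://169.254.*|https://169.254.*)
      echo "blocked" ;;                 # loopback / cloud metadata ranges
    http://*|https://*)
      echo "ok" ;;                      # plain web URL
    *)
      echo "blocked" ;;                 # file://, javascript:, data:, etc.
  esac
}

validate_url "https://example.com/page"    # prints: ok
validate_url "file:///etc/passwd"          # prints: blocked
validate_url "http://127.0.0.1:8080/"      # prints: blocked
```

The loopback patterns are matched before the generic `http://*` arm, so ordering inside the `case` matters.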
## Related Skills
- `agent-browser` (upstream) - Full command reference and usage patterns
- `web-research-workflow` - Unified decision tree for web research
- `testing-patterns` - Comprehensive testing patterns including E2E and webapp testing
- `api-design` - API design patterns for endpoints discovered during scraping
## Overview
This skill provides production-ready browser automation and content-capture patterns for Playwright, Puppeteer, and the agent-browser CLI. It packages proven workflows for automating interactions, extracting structured data from dynamic SPAs, handling authentication, and running respectful multi-page crawls. Use it to reliably capture text, HTML, screenshots, and validated JSON from modern web apps.

The skill exposes concise command and code patterns to open pages, discover interactive elements via snapshots, run targeted extraction by element refs, and persist session state for authenticated flows. It includes strategies for SPA hydration, pagination and recursive crawls, anti-bot handling (rate limits, robots.txt, CAPTCHA guidance), and converting scraped content into clean markdown or JSON. Recommended fallbacks and decision trees guide when to use a fast fetch versus a full browser run.
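For the markdown/JSON conversion step, even a crude tag-stripping pass shows the shape of the pipeline. A sketch in plain shell (real pipelines should use a proper HTML parser; this sed pass is illustrative only and breaks on nested or malformed markup):

```shell
#!/bin/sh
# Crude HTML-to-text pass: drop tags, decode a common entity, squeeze
# whitespace. Illustrates the capture -> clean-text step only; use a real
# parser for production extraction.
html='<h1>Title</h1><p>Body &amp; more</p>'
printf '%s\n' "$html" \
  | sed -e 's/<[^>]*>/ /g' -e 's/&amp;/\&/g' \
  | tr -s ' '
```

Running it yields a single whitespace-normalized line containing `Title Body & more`, ready for further cleanup or markdown conversion.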
## FAQ

**When should I prefer agent-browser over direct Playwright/Puppeteer?**
Use agent-browser for fast, CLI-driven workflows and snapshot-based element discovery. Use Playwright or Puppeteer directly for deep custom scripting or advanced API access.

**How do I handle pages protected by OAuth or SSO?**
Run the flow in headed mode, complete the interactive steps, then save session state to reuse in headless captures. Avoid hardcoding credentials and clean up state files after use.
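State-file hygiene from the answer above can be enforced mechanically: create the state file outside the repository and remove it when the run ends. A minimal sketch (the `--state` flag in the comment is illustrative; check the upstream agent-browser docs for the real option):

```shell
#!/bin/sh
# Keep auth state out of the repo and delete it when the run ends.
STATE_FILE="$(mktemp -t browser-state.XXXXXX)"   # lands in $TMPDIR, not the repo
trap 'rm -f "$STATE_FILE"' EXIT                  # cleanup even on error

# A headed login flow would save/reuse state here, e.g. (illustrative flag):
# agent-browser open "$LOGIN_URL" --state "$STATE_FILE"

[ -f "$STATE_FILE" ] && echo "state file ready: $STATE_FILE"
```

The `trap ... EXIT` runs on both normal and error exits, so a crashed capture never leaves credentials on disk, and `mktemp` keeps the file out of any directory a `git add .` could sweep up.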