
browser-control skill


This skill enables authenticated browser automation with Playwright to access, extract data from, and interact with complex travel sites.

npx playbooks add skill krishagel/geoffrey --skill browser-control

---
name: browser-control
description: Full browser control for authenticated web interactions using Playwright scripts
triggers:
  - "check availability"
  - "search for"
  - "log into"
  - "browse to"
  - "look up prices"
  - "check points"
  - "find deals"
  - "scrape"
  - "get current price"
  - "check hotel"
  - "check flight"
allowed-tools: Bash, Read
version: 0.1.0
---

# Browser Control Skill

Full browser automation for travel research requiring authentication or complex interactions.

## When to Activate

Use this skill when you need to:
- Access authenticated pages (Marriott, Alaska Airlines accounts)
- Check real-time availability and prices
- Scrape forum threads (FlyerTalk, Reddit)
- Interact with JavaScript-heavy travel sites
- Fill forms or perform searches on websites

## Architecture

**Script-based approach** - No MCP overhead. Scripts load only when needed.

### Prerequisites

1. **Geoffrey Chrome Profile** must be running with remote debugging:
   ```bash
   ./scripts/launch-chrome.sh
   ```

2. **Profile must have logins saved** for:
   - Marriott Bonvoy
   - Alaska Airlines Mileage Plan
   - FlyerTalk
   - TripAdvisor
   - Reddit

## Available Scripts

All scripts are in `./scripts/` and use Playwright connecting via CDP.

| Script | Purpose | Usage |
|--------|---------|-------|
| `launch-chrome.sh` | Start Geoffrey Chrome profile | `./scripts/launch-chrome.sh` |
| `navigate.js` | Navigate to URL and get page content | `bun scripts/navigate.js <url>` |
| `screenshot.js` | Take screenshot of page | `bun scripts/screenshot.js <url> [output] [--full]` |
| `extract.js` | Extract text/data from page | `bun scripts/extract.js <url> <selector> [--all]` |
| `interact.js` | Click, type, select on page | `bun scripts/interact.js <url> <action> <selector> [value]` |
| `search.js` | Search travel sites | `bun scripts/search.js <site> <query>` |
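
The usage column follows the common `<required> [optional] [--flag]` convention. As a rough illustration of how such a command line could be interpreted, here is a minimal sketch for the `screenshot.js` pattern. The function name `parseScreenshotArgs` and the `null` default for a missing output path are assumptions made for this example; the actual scripts may parse their arguments differently.

```javascript
// Hypothetical parser for the documented usage:
//   bun scripts/screenshot.js <url> [output] [--full]
// The name and the defaults are illustrative, not taken from the real script.
function parseScreenshotArgs(argv) {
  // Anything starting with "--" is treated as a flag, the rest as positionals.
  const flags = argv.filter((arg) => arg.startsWith("--"));
  const positional = argv.filter((arg) => !arg.startsWith("--"));
  const [url, output = null] = positional;

  if (!url) {
    throw new Error("Usage: bun scripts/screenshot.js <url> [output] [--full]");
  }
  return { url, output, fullPage: flags.includes("--full") };
}

// In a real script the arguments would come from process.argv.slice(2).
console.log(parseScreenshotArgs(["https://example.com", "page.png", "--full"]));
```

Filtering flags out before reading positionals lets `--full` appear anywhere on the command line, including before the output path.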

## Usage Examples

### Check Marriott Points Availability
```bash
# Navigate to Marriott search
bun scripts/navigate.js "https://www.marriott.com/search/default.mi"

# Or use the search script
bun scripts/search.js marriott "Westin Rusutsu February 2026"
```

### Get FlyerTalk Thread Content
```bash
bun scripts/extract.js "https://www.flyertalk.com/forum/thread-url" ".post-content"
```

### Screenshot Hotel Page
```bash
bun scripts/screenshot.js "https://www.marriott.com/hotels/travel/ctswi-the-westin-rusutsu-resort/" rusutsu.png
```

## Screenshot Protection & Lazy-Loading

**Auto-Resize Protection (ALL screenshots):**
- Post-capture resize using Sharp to max 7500px per dimension
- Maintains aspect ratio, prevents Claude Code API crashes
- Every screenshot guaranteed `safeToRead: true`
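
The resize rule above comes down to a small piece of arithmetic: scale both sides by the same factor so that the longer side lands on the 7500px limit, and leave smaller images alone. The sketch below shows only that calculation. The helper name `fitWithin` is hypothetical, and the real script delegates the actual pixel work to Sharp.

```javascript
// Illustrative only: computes target dimensions under a per-side limit
// while preserving aspect ratio. `fitWithin` is a made-up helper name.
const MAX_DIMENSION = 7500; // limit stated in the list above

function fitWithin(width, height, max = MAX_DIMENSION) {
  // Never upscale: the factor is capped at 1.
  const scale = Math.min(1, max / Math.max(width, height));
  if (scale === 1) {
    return { width, height, scaled: false };
  }
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
    scaled: true,
  };
}

console.log(fitWithin(1920, 1080));  // unchanged, scaled: false
console.log(fitWithin(1920, 30000)); // 480 x 7500, scaled: true
```

With Sharp, a `resize` call using `fit: "inside"` and `withoutEnlargement: true` applies the same rule, though whether the script uses those exact options is an assumption.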

**Lazy-Loading Limitation (Airbnb, dynamic sites):**
- Sites with lazy-loading show grey placeholders in fullPage mode
- Images only load when scrolled into viewport
- **Solution**: Use viewport screenshots (no --full flag) or `screenshot-current.js`

```bash
# For lazy-loading sites, screenshot current viewport
bun scripts/screenshot-current.js /tmp/output.png

# Or navigate + viewport screenshot
bun scripts/screenshot.js "https://airbnb.com/..." /tmp/output.png
```

Example output:
```json
{
  "success": true,
  "url": "https://example.com",
  "title": "Example Page",
  "screenshot": "/tmp/screenshot.png",
  "dimensions": { "width": 1920, "height": 1080 },
  "originalDimensions": { "width": 1920, "height": 1080 },
  "scaled": false,
  "safeToRead": true,
  "timestamp": "2025-11-28T..."
}
```

## Connection Details

Scripts connect to Chrome via Chrome DevTools Protocol (CDP):
- **URL**: `http://127.0.0.1:9222`
- **Profile**: `~/.chrome-geoffrey`
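
For orientation, attaching to an already running Chrome with Playwright generally looks like the sketch below. It uses Playwright's `connectOverCDP` and picks up the existing default context so that the profile's saved logins are available. The function name `attachToRunningChrome` and its return shape are assumptions for this example and are not taken from the skill's scripts.

```javascript
// A minimal sketch, not the skill's actual code.
// `chromium` is Playwright's browser type, e.g.:
//   const { chromium } = require("playwright");
const CDP_ENDPOINT = "http://127.0.0.1:9222";

async function attachToRunningChrome(chromium, endpoint = CDP_ENDPOINT) {
  const browser = await chromium.connectOverCDP(endpoint);

  // Over CDP, the running profile is exposed as the first (default) context,
  // which is where its cookies and saved sessions live.
  const [context] = browser.contexts();
  if (!context) {
    throw new Error(`No browser context found at ${endpoint}`);
  }

  const page = await context.newPage();
  return { browser, context, page };
}

// Example use (requires Chrome to be running with remote debugging):
//   const { page } = await attachToRunningChrome(chromium);
//   await page.goto("https://example.com");
```

Passing `chromium` in as a parameter keeps the helper easy to exercise without a live browser.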

## Error Handling

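If scripts fail to connect:
1. Ensure Chrome is running with `./scripts/launch-chrome.sh`
2. Check what is listening on port 9222: `lsof -i :9222` (it should be the Geoffrey Chrome instance, not another process)
3. If a stale debugger is holding the port, kill it with `pkill -f "remote-debugging-port"`, then relaunch Chrome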
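
A quick programmatic check can complement these steps. Chrome's remote debugging port serves a small HTTP API, and its `/json/version` endpoint returns browser metadata when the debugger is up. The sketch below probes that endpoint. The function name `checkDebugger` and the shape of its result are assumptions for this example, and the `fetchImpl` parameter exists only so the function can be exercised without a live browser.

```javascript
// Illustrative probe of the CDP HTTP endpoint; not part of the skill's scripts.
async function checkDebugger(baseUrl = "http://127.0.0.1:9222", fetchImpl = fetch) {
  try {
    const response = await fetchImpl(`${baseUrl}/json/version`);
    if (!response.ok) {
      return { reachable: false, reason: `HTTP ${response.status}` };
    }
    // The version payload includes a "Browser" field such as "Chrome/<version>".
    const info = await response.json();
    return { reachable: true, browser: info.Browser };
  } catch (error) {
    // Connection refused usually means Chrome is not running with debugging.
    return { reachable: false, reason: error.message };
  }
}

// Example use:
//   checkDebugger().then((status) => console.log(status));
```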

## Output Format

All scripts return JSON:
```json
{
  "success": true,
  "url": "https://example.com",
  "title": "Page Title",
  "content": "Extracted content or action result",
  "timestamp": "2025-11-22T..."
}
```
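
Because every script prints one JSON object, a caller can validate the result before using it. The sketch below is one way to do that, under stated assumptions: the helper name `parseScriptResult` is hypothetical, and the `error` field read on failure is a guess, since the failure payload is not documented here.

```javascript
// A minimal sketch for consuming a script's stdout; names are illustrative.
function parseScriptResult(stdout) {
  let result;
  try {
    result = JSON.parse(stdout);
  } catch (error) {
    throw new Error(`Script did not print valid JSON: ${error.message}`);
  }

  if (result.success !== true) {
    // Assumption: failures may carry an `error` field; this is not documented.
    throw new Error(`Script reported failure: ${result.error ?? "unknown error"}`);
  }
  return result;
}

const sample = JSON.stringify({
  success: true,
  url: "https://example.com",
  title: "Page Title",
  content: "Extracted content or action result",
});

console.log(parseScriptResult(sample).title); // Page Title
```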

## Limitations

- Requires Geoffrey Chrome profile to be running
- Cannot solve CAPTCHAs (the real browser fingerprint avoids triggering most of them)
- Heavy sites may be slow
- Some sites block automation despite real browser

## Future Enhancements

- Add cookie/session export for headless runs
- 1Password CLI integration for credential rotation
- Parallel page operations
- Browser-Use (Python) for complex visual tasks

Overview

This skill provides full browser control for authenticated web interactions using Playwright scripts that connect to a running Chrome profile. It’s built for travel research and other tasks that require login, JavaScript-heavy pages, form automation, or reliable screenshots. Scripts return structured JSON and operate via the Chrome DevTools Protocol at http://127.0.0.1:9222.

How this skill works

Scripts run on demand and connect to a local Chrome profile (Geoffrey) via CDP, reusing saved logins and real browser state. Available scripts perform navigation, extraction, interaction, searching, and screenshots; all outputs are JSON and screenshots are post-processed to prevent oversized images. The approach avoids persistent overhead by loading only the required script per job.

When to use it

  • Access authenticated pages (airline, hotel, forum accounts) that require saved credentials.
  • Check real-time availability, pricing, or seat/inventory data behind logins.
  • Scrape or extract content from JavaScript-heavy sites and forum threads.
  • Automate form fills, searches, clicks, and complex multi-step interactions.
  • Capture reliable screenshots while avoiding API crashes from huge images.

Best practices

  • Ensure the Geoffrey Chrome profile is launched with remote debugging before running scripts.
  • Keep required logins saved in the profile for the target sites you will access.
  • Use viewport screenshots for lazy-loading sites or use the provided current-viewport screenshot script.
  • Check JSON output for success and use timestamps to correlate results.
  • Monitor port 9222 and kill stale debugger processes if connection fails.

Example use cases

  • Log in and check Marriott room availability or points-eligible rates.
  • Extract all posts from a FlyerTalk thread or Reddit discussion for research.
  • Take a safe-to-read screenshot of a hotel listing that won’t exceed API size limits.
  • Search travel sites programmatically for a specific hotel and date combination.
  • Automate multi-step booking lookups that require clicking, selecting, and form submission.

FAQ

What must be running before using the scripts?

Start the Geoffrey Chrome profile with remote debugging enabled (scripts/launch-chrome.sh) so Playwright can connect via CDP on port 9222.

Can this bypass CAPTCHAs or sites that block automation?

No. It uses a real browser fingerprint to reduce friction but cannot reliably bypass CAPTCHAs or advanced bot blocks.