This skill offers safe, composable browser automation workflows that click, type, and validate state changes using minimal primitives.
npx playbooks add skill different-ai/agent-bank --skill browser-automation
---
name: browser-automation
description: Reliable, composable browser automation using minimal OpenCode Browser primitives.
license: MIT
compatibility: opencode
metadata:
  audience: agents
  domain: browser
---
## What I do
- Provide a safe, composable workflow for browsing tasks
- Use `browser_query` with `mode=list` and select by `index` to click reliably
- Confirm state changes after each action
## Best-practice workflow
1. Inspect tabs with `browser_get_tabs`
2. Open new tabs with `browser_open_tab` when needed
3. Navigate with `browser_navigate` if needed
4. Wait for UI using `browser_query` with `timeoutMs`
5. Discover candidates using `browser_query` with `mode=list`
6. Click, type, or select using `index`
7. Confirm using `browser_query` or `browser_snapshot`
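Steps 4 to 7 can be sketched as a single discover, act, confirm helper. This is a minimal sketch, not the tools' real API: the skill names `browser_query`, `mode`, `index`, and `timeoutMs`, but the argument and result shapes, the `browser_click` tool name, and the in-memory `makeFakeBrowser` stand-in are all assumptions made so the example runs on its own. It is also written synchronously for brevity; real tool calls would presumably be asynchronous.

```typescript
// Assumed shapes for illustration only; the skill text does not define them.
type Candidate = { index: number; text: string };
type QueryArgs = { selector?: string; mode: "list" | "page_text"; timeoutMs?: number };
type QueryResult = { items: Candidate[]; text: string };

interface BrowserTools {
  browser_query(args: QueryArgs): QueryResult;
  // Hypothetical click tool name; the skill only says to click "using index".
  browser_click(args: { selector: string; index: number }): void;
}

// In-memory fake page: clicking "Save" changes the visible page text.
function makeFakeBrowser(): BrowserTools {
  const buttons = ["Cancel", "Save"];
  let pageText = "Draft";
  return {
    browser_query({ mode }) {
      if (mode === "list") {
        return { items: buttons.map((text, index) => ({ index, text })), text: "" };
      }
      return { items: [], text: pageText };
    },
    browser_click({ index }) {
      if (buttons[index] === "Save") pageText = "Saved";
    },
  };
}

// Discover candidates, act by index, then confirm the state change.
function clickAndConfirm(
  tools: BrowserTools,
  selector: string,
  label: string,
  expected: string,
): boolean {
  const { items } = tools.browser_query({ selector, mode: "list", timeoutMs: 5000 });
  const target = items.find((c) => c.text === label);
  if (!target) return false; // nothing to click; the caller should widen the selector
  tools.browser_click({ selector, index: target.index });
  const { text } = tools.browser_query({ mode: "page_text" });
  return text.includes(expected);
}

const ok = clickAndConfirm(makeFakeBrowser(), "button", "Save", "Saved");
const missing = clickAndConfirm(makeFakeBrowser(), "button", "Delete", "Deleted");
```

Returning a boolean from the confirmation step lets the caller stop or retry instead of assuming the click worked.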
## Selecting options
- Use `browser_select` for native `<select>` elements
- Prefer `value` or `label`; use `optionIndex` when needed
- Example: `browser_select({ selector: "select", value: "plugin" })`
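The preference order above can be applied on the caller side before invoking `browser_select`. The helper below is a hypothetical sketch: `buildSelectArgs` and the `Known` type are illustrative names, and putting `value` ahead of `label` is an arbitrary tie-break, since the skill treats the two as equally preferred.

```typescript
// Argument shapes mirror the fields named in the skill (selector, value,
// label, optionIndex); sending exactly one of the three is an assumption.
type SelectArgs =
  | { selector: string; value: string }
  | { selector: string; label: string }
  | { selector: string; optionIndex: number };

type Known = { value?: string; label?: string; optionIndex?: number };

// Pick the most stable identifier available: value, then label, then index.
function buildSelectArgs(selector: string, known: Known): SelectArgs {
  if (known.value !== undefined) return { selector, value: known.value };
  if (known.label !== undefined) return { selector, label: known.label };
  if (known.optionIndex !== undefined) return { selector, optionIndex: known.optionIndex };
  throw new Error("need a value, label, or optionIndex to select an option");
}

const byValue = buildSelectArgs("select", { value: "plugin", optionIndex: 2 });
const byIndex = buildSelectArgs("select", { optionIndex: 2 });
```

Falling back to `optionIndex` only as a last resort keeps the call robust if the option order changes.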
## Query modes
- `text`: read visible text from a matched element
- `value`: read input values
- `list`: list many matches with text/metadata
- `exists`: check presence and count
- `page_text`: extract visible page text
## Opening tabs
- Use `browser_open_tab` to create a new tab, optionally with `url` and `active`
- Example: `browser_open_tab({ url: "https://example.com", active: false })`
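One way to combine `browser_get_tabs` with `browser_open_tab` is to open a tab only when no existing tab already shows the target URL. This is a sketch under assumptions: the `Tab` shape, the return value of `browser_open_tab`, the `ensureTab` helper, and the in-memory `makeFakeTabs` stand-in are hypothetical, and the calls are synchronous only to keep the example short.

```typescript
// Assumed shapes; the skill names only the `url` and `active` parameters.
type Tab = { id: number; url: string };
type OpenTabArgs = { url?: string; active?: boolean };

interface TabTools {
  browser_get_tabs(): Tab[];
  browser_open_tab(args: OpenTabArgs): Tab;
}

// Reuse a tab that already shows the URL; otherwise open one in the background.
function ensureTab(tools: TabTools, url: string): { tab: Tab; opened: boolean } {
  const existing = tools.browser_get_tabs().find((t) => t.url === url);
  if (existing) return { tab: existing, opened: false };
  return { tab: tools.browser_open_tab({ url, active: false }), opened: true };
}

// In-memory fake with one tab already open.
function makeFakeTabs(): TabTools {
  const tabs: Tab[] = [{ id: 1, url: "https://example.com" }];
  return {
    browser_get_tabs: () => [...tabs],
    browser_open_tab({ url = "about:blank" }) {
      const tab = { id: tabs.length + 1, url };
      tabs.push(tab);
      return tab;
    },
  };
}

const tools = makeFakeTabs();
const first = ensureTab(tools, "https://example.com");
const second = ensureTab(tools, "https://example.org/report");
```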
## Troubleshooting
- If a selector fails, run `browser_query` with `mode=page_text` to confirm the content exists
- Use `mode=list` on broad selectors (`button`, `a`, `*[role="button"]`) and choose by index
- Confirm results after each action
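The troubleshooting steps above can be read as a fallback ladder: check the selector, confirm the content is on the page, then search the broad selectors and pick by index. The sketch below encodes that ladder against a fake page. The `Query` function type, the result fields (`count`, `items`, `text`), the `Diagnosis` type, and `fakeQuery` are assumptions for illustration, not the real `browser_query` contract.

```typescript
// Assumed shapes for illustration only.
type Candidate = { index: number; text: string };
type QueryArgs = { selector?: string; mode: "exists" | "list" | "page_text" };
type QueryResult = { count: number; items: Candidate[]; text: string };
type Query = (args: QueryArgs) => QueryResult;

type Diagnosis =
  | { kind: "selector_ok" }
  | { kind: "content_missing" }
  | { kind: "use_candidate"; selector: string; index: number }
  | { kind: "not_clickable" };

// The broad selectors suggested in the troubleshooting list.
const BROAD_SELECTORS = ["button", "a", '*[role="button"]'];

function diagnose(query: Query, selector: string, wantedText: string): Diagnosis {
  // 1. The original selector may be fine after all.
  if (query({ selector, mode: "exists" }).count > 0) return { kind: "selector_ok" };
  // 2. Confirm the content exists anywhere in the visible page text.
  if (!query({ mode: "page_text" }).text.includes(wantedText)) {
    return { kind: "content_missing" };
  }
  // 3. List broad selectors and choose a candidate by index.
  for (const broad of BROAD_SELECTORS) {
    const hit = query({ selector: broad, mode: "list" }).items.find((c) =>
      c.text.includes(wantedText),
    );
    if (hit) return { kind: "use_candidate", selector: broad, index: hit.index };
  }
  return { kind: "not_clickable" };
}

// Fake page: the text is present, but only as a link, not under "#export-btn".
const fakeQuery: Query = ({ selector, mode }) => {
  const links = ["Home", "Export CSV"];
  const empty: QueryResult = { count: 0, items: [], text: "" };
  if (mode === "page_text") return { ...empty, text: "Home Export CSV" };
  if (mode === "list" && selector === "a") {
    return { ...empty, count: links.length, items: links.map((text, index) => ({ index, text })) };
  }
  return empty;
};

const result = diagnose(fakeQuery, "#export-btn", "Export CSV");
const absent = diagnose(fakeQuery, "#delete-btn", "Delete");
```

Here `result` points at the second link under the broad `a` selector, while `absent` reports that the text is not on the page at all, so widening the selector would not help.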
This skill provides reliable, composable browser automation built on minimal OpenCode Browser primitives. It focuses on safe navigation, deterministic element selection, and explicit state confirmation after each action. The workflow is CLI-first and designed for automation tasks in finance and other data-sensitive domains.
It inspects browser state (tabs, page text, element lists) and performs actions using a small set of primitives: open tabs, navigate, query, click/type/select, and snapshot. Queries support multiple modes (text, value, list, exists, page_text) so you can discover candidates, pick by index, and confirm state changes. Each action is followed by verification to ensure reliability.
What query mode should I use to find buttons reliably?
Use `browser_query` with `mode=list` on broad selectors (`button`, `a`, `*[role="button"]`), pick the correct element by `index`, then confirm the result.
How do I handle failing selectors or unexpected content?
Run `browser_query` with `mode=page_text` to verify the visible content, widen your selector, use `mode=list` to inspect candidates, and confirm each action before continuing.