This skill orchestrates browser automation and direct CDP communication to speed up autonomous agents built on Stagehand V3.
`npx playbooks add skill yuniorglez/gemini-elite-core --skill stagehand-expert`
---
name: stagehand-expert
id: stagehand-expert
version: 4.1.0
description: "Master Architect in Stagehand V3. Expert in Direct CDP Automation, Decision Caching, and Agentic Web Orchestration for 2026."
---
# 🎭 Skill: Stagehand Expert (v4.1.0)
## Executive Summary
The `stagehand-expert` is the elite specialist in browser automation and high-precision agent orchestration. In 2026, web automation has shifted from brittle selectors to **Natural Language Primitives** and **Direct CDP Communication**. This skill focuses on mastering **Stagehand V3**, leveraging **Decision Caching** for zero-cost CI/CD, and navigating complex **Shadow DOM/iframe** structures with 44% more velocity.
---
## 📋 Table of Contents
1. [Proactive Investigation Protocol](#proactive-investigation-protocol)
2. [The "Do Not" List (Anti-Patterns)](#the-do-not-list-anti-patterns)
3. [Core Primitives Mastery](#core-primitives-mastery)
4. [Direct CDP & Performance](./references/cdp-direct-communication.md)
5. [Advanced Decision Caching](#advanced-decision-caching)
6. [Autonomous Agents & CUA](#autonomous-agents--cua)
7. [Reference Library](#reference-library)
---
## 🔍 Proactive Investigation Protocol
Before writing a single test, the expert MUST perform a **Deep Discovery**:
1. **Route Mapping**: Identify the user flow from `page.tsx` or router configs.
2. **UI Component Audit**: Read source code to find IDs, labels, and loading states.
3. **Vibe Check**: Measure layout stability using the CDP "Vibe Score."
4. **Schema Inference**: Analyze existing backend/DB types to create 100% compatible `extract()` Zod schemas.
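Step 4 can be sketched roughly as follows. This is a minimal illustration, not the skill's actual tooling: the `Invoice` type is a hypothetical backend type, and the hand-written `isInvoice` guard stands in for a Zod schema so the example runs with no dependencies. The idea is that whatever `extract()` returns is checked against the same shape the backend already expects.

```typescript
// Hypothetical backend type, standing in for whatever the DB layer exports.
interface Invoice {
  id: string;
  totalCents: number;
  paid: boolean;
}

// Hand-written guard used in place of a Zod schema (assumption: a real
// project would express the same rules with Zod and pass them to extract()).
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const id = record["id"];
  const totalCents = record["totalCents"];
  const paid = record["paid"];
  return (
    typeof id === "string" &&
    typeof totalCents === "number" &&
    isFinite(totalCents) &&
    typeof paid === "boolean"
  );
}

// Rejects extracted data that would not fit the backend type.
function parseInvoice(extracted: unknown): Invoice {
  if (!isInvoice(extracted)) {
    throw new Error("extracted data does not match the Invoice shape");
  }
  return extracted;
}
```

Deriving the guard from the backend type, rather than writing it from the page, means a mismatch between what the UI shows and what the database stores surfaces as a test failure instead of silently passing.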
---
## 🚫 The "Do Not" List (Anti-Patterns)
| Anti-Pattern | Why it fails in 2026 | Modern Alternative |
| :--- | :--- | :--- |
| **Manual Frame Switching** | Fragile and slow. | Use **DeepLocator (>>) & CDP**. |
| **Hardcoded Wait(2000)** | Unreliable and causes jank. | Use **`domSettleTimeout`**. |
| **Missing `finally { close() }`** | Leaves zombie processes. | **Mandatory `try...finally`**. |
| **LLM Calls in CI** | Slow and expensive. | Use **Persistent Decision Caches**. |
| **Ignoring CSS Animations** | Interactions fail during transitions. | Use **Reanimated-aware Waiters**. |
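The `try...finally` row above can be sketched as a small wrapper. This is an illustration under assumptions: `BrowserSession` is a hypothetical minimal interface (a real Stagehand instance exposes more, and its method names may differ), and `FakeSession` is an in-memory stand-in so the sketch runs without launching a browser.

```typescript
// Hypothetical minimal shape of a browser session.
interface BrowserSession {
  init(): Promise<void>;
  close(): Promise<void>;
}

// Runs `task` against a session and always closes it, even if the task throws.
async function withSession<S extends BrowserSession, T>(
  session: S,
  task: (session: S) => Promise<T>,
): Promise<T> {
  await session.init();
  try {
    return await task(session);
  } finally {
    // Runs on success and on failure, so no browser process is left behind.
    await session.close();
  }
}

// In-memory stand-in that records the order of lifecycle calls.
class FakeSession implements BrowserSession {
  events: string[] = [];
  async init(): Promise<void> {
    this.events.push("init");
  }
  async close(): Promise<void> {
    this.events.push("close");
  }
}
```

Wrapping the lifecycle once keeps individual tests from having to remember the cleanup; an error thrown by the task still propagates to the caller after `close()` has run.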
---
## ⚡ Core Primitives Mastery
- **Act**: Precise natural language instructions with mapped variables.
- **Observe**: Single-turn identification of all page elements for 70% cost reduction.
- **Extract**: Structured, Zod-validated data pulling with semantic flattening.
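The "mapped variables" idea in the **Act** bullet can be illustrated with a small substitution helper. This is a sketch, not Stagehand's implementation: `fillVariables` is a hypothetical function, and the `%name%` placeholder syntax is an assumption. The point is that the instruction seen by the model carries only placeholders, while real values are filled in locally just before the action runs.

```typescript
// Replaces %name% placeholders with values from `variables`.
// Hypothetical helper; throws if a placeholder has no matching value.
function fillVariables(
  instruction: string,
  variables: Record<string, string>,
): string {
  return instruction.replace(/%(\w+)%/g, (_match: string, name: string) => {
    const value = variables[name];
    if (value === undefined) {
      throw new Error(`missing variable: ${name}`);
    }
    return value;
  });
}

// The template is what a model would see; the filled string is what gets typed.
const template = "type %email% into the email field";
const filled = fillVariables(template, { email: "a@b.co" });
```

Failing loudly on a missing variable is a deliberate choice here: typing the literal text `%email%` into a form would otherwise look like a successful action.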
---
## 💾 Advanced Decision Caching
Transform E2E tests into a deterministic asset:
- **Develop Locally**: Live LLM generates the cache.
- **Commit Cache**: Store DOM snapshots and results in Git.
- **Zero-Cost CI**: Run tests in "Cached-Only" mode.
*See [References: Agent Caching](./references/advanced-agent-caching.md) for details.*
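The three-step workflow above can be sketched as a tiny cache. This is a simplified model under assumptions, not Stagehand's cache format: `Decision` and `DecisionCache` are hypothetical names, and entries are keyed by the raw instruction and DOM snapshot (a real cache would more likely hash them). `record` is what a local run would call after a live LLM call, `serialize` produces the file to commit, and `replay` is the cached-only path for CI, which fails on a miss instead of calling a model.

```typescript
// Hypothetical shape of a resolved decision.
interface Decision {
  selector: string;
  method: "click" | "fill";
}

class DecisionCache {
  private entries: Record<string, Decision | undefined> = {};

  // The key covers the instruction and the DOM snapshot, so a changed page misses.
  private static key(instruction: string, domSnapshot: string): string {
    return JSON.stringify([instruction, domSnapshot]);
  }

  // Local development: store the decision produced by a live model call.
  record(instruction: string, domSnapshot: string, decision: Decision): void {
    this.entries[DecisionCache.key(instruction, domSnapshot)] = decision;
  }

  // CI ("Cached-Only"): replay a stored decision and never fall back to a model.
  replay(instruction: string, domSnapshot: string): Decision {
    const hit = this.entries[DecisionCache.key(instruction, domSnapshot)];
    if (hit === undefined) {
      throw new Error(`cache miss in cached-only mode: ${instruction}`);
    }
    return hit;
  }

  // Text form suitable for committing to Git next to the tests.
  serialize(): string {
    return JSON.stringify(this.entries, null, 2);
  }

  static deserialize(json: string): DecisionCache {
    const cache = new DecisionCache();
    cache.entries = JSON.parse(json) as Record<string, Decision | undefined>;
    return cache;
  }
}
```

Treating a miss as a hard failure is what keeps CI deterministic: a UI change shows up as a failing test that prompts a local re-record, not as a silent live inference.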
---
## 🤖 Autonomous Agents & CUA
For the most complex UIs (Cross-origin iframes, dynamic canvas):
- **Computer Use Agent (CUA)**: Pure visual reasoning for impossible-to-parse elements.
- **Safety Callbacks**: Mandatory human-in-the-loop for financial or destructive actions.
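A human-in-the-loop gate of the kind described in the **Safety Callbacks** bullet might look like the sketch below. All names here (`ProposedAction`, `Approver`, `runWithSafetyGate`, and the list of sensitive action kinds) are hypothetical and chosen for illustration; they are not part of any agent API. The approver callback is where a human prompt would be wired in.

```typescript
type ActionKind = "click" | "type" | "submit_payment" | "delete_record";

interface ProposedAction {
  kind: ActionKind;
  description: string;
}

// Returns true if a human approved the action.
type Approver = (action: ProposedAction) => boolean;

// Hypothetical policy: financial and destructive actions need sign-off.
const SENSITIVE_KINDS: ActionKind[] = ["submit_payment", "delete_record"];

// Executes the action unless it is sensitive and the approver declines.
function runWithSafetyGate(
  action: ProposedAction,
  approve: Approver,
  execute: (action: ProposedAction) => void,
): "executed" | "blocked" {
  const sensitive = SENSITIVE_KINDS.indexOf(action.kind) !== -1;
  if (sensitive && !approve(action)) {
    return "blocked";
  }
  execute(action);
  return "executed";
}
```

Routine actions skip the approver entirely, so the human is only interrupted for the cases the policy marks as sensitive.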
---
## 📖 Reference Library
Detailed deep-dives into Stagehand Excellence:
- [**Direct CDP Communication**](./references/cdp-direct-communication.md): Velocity and deep access.
- [**Agent Caching**](./references/advanced-agent-caching.md): Determinism and cost savings.
- [**Shadow DOM Mastery**](./references/shadow-dom-iframe-mastery.md): Jumping document boundaries.
- [**Installation & Setup**](./references/setup-guide-v3.md): The Bun/Playwright stack.
---
*Updated: January 22, 2026 - 21:20*
This skill is the Stagehand Expert: a master architect for Stagehand V3 focused on high-velocity browser automation, Direct CDP communication, and deterministic Decision Caching. It packages tactical primitives and protocols that convert flaky end-to-end suites into fast, reproducible agentic workflows. The emphasis is on Natural Language Primitives, robust Shadow DOM/iframe handling, and zero-cost CI via cached decisions.
The skill inspects application routing, UI component code, and backend schemas to build reliable interaction maps before any test runs. It uses Direct CDP calls and DeepLocator primitives to interact with frames and shadow roots, applies Observe/Act/Extract primitives for precise actions and structured data extraction, and produces decision caches (DOM snapshots + outcomes) that can be committed and replayed in CI. Safety callbacks and visual agents handle edge cases like cross-origin iframes or dynamic canvases.
**How does Decision Caching reduce CI costs?**
Decision Caching stores DOM snapshots and action outcomes generated locally; CI runs in Cached-Only mode and replays results without live LLM calls, eliminating per-run inference costs.
**When should I use a Computer Use Agent (CUA)?**
Use a CUA for elements that defy DOM parsing, such as dynamic canvases, visual-only controls, or cross-origin frames, and pair it with human-in-the-loop safety callbacks for destructive actions.