home / skills / openclaw / skills / webmcp

webmcp skill

safe

This skill helps you discover and invoke web pages tools exposed via WebMCP, enabling safe automation without DOM scraping.

npx playbooks add skill openclaw/skills --skill webmcp

Review the files below or copy the command above to add this skill to your agents.

Files (2)

SKILL.md

4.8 KB

---
name: webmcp
description: This skill should be used when browsing or automating web pages that expose tools via the WebMCP API (window.navigator.modelContext). It teaches agents how to discover, inspect, and invoke WebMCP tools on websites instead of relying on DOM scraping or UI actuation.
---

# WebMCP — Discover and Use Website Tools

## What is WebMCP

WebMCP is a browser API that lets websites expose JavaScript functions as structured tools for AI agents. Pages register tools via `window.navigator.modelContext`, each with a name, description, JSON Schema input, and an `execute` callback. Think of it as an MCP server running inside the web page itself.

Spec: https://github.com/webmachinelearning/webmcp

## Detecting WebMCP Support

Before interacting with tools, check whether the current page supports WebMCP:

```js
const supported = "modelContext" in window.navigator;
```

If `false`, the page does not expose WebMCP tools — fall back to DOM interaction or actuation.

## Discovering Available Tools

Tools are registered by the page via `provideContext()` or `registerTool()`. The browser mediates access. To list available tools from an agent's perspective, evaluate:

```js
// Browser-specific — the exact discovery API depends on the agent runtime.
// Typically the browser exposes registered tools to connected agents automatically.
// From page-script perspective, tools are registered like this:
window.navigator.modelContext.provideContext({
  tools: [
    {
      name: "tool-name",
      description: "What this tool does",
      inputSchema: { type: "object", properties: { /* ... */ }, required: [] },
      execute: (params, agent) => { /* ... */ }
    }
  ]
});
```

Key points:
- Each tool has `name`, `description`, `inputSchema` (JSON Schema), and `execute`.
- `provideContext()` replaces all previously registered tools (useful for SPA state changes).
- `registerTool()` / `unregisterTool()` add/remove individual tools without resetting.
- Tools may change as the user navigates or as SPA state updates — re-check after page transitions.

## Tool Schema Format

Tool input schemas follow JSON Schema (aligned with MCP SDK and Prompt API tool use):

```js
{
  name: "add-stamp",
  description: "Add a new stamp to the collection",
  inputSchema: {
    type: "object",
    properties: {
      name: { type: "string", description: "The name of the stamp" },
      year: { type: "number", description: "Year the stamp was issued" },
      imageUrl: { type: "string", description: "Optional image URL" }
    },
    required: ["name", "year"]
  },
  execute({ name, year, imageUrl }, agent) {
    // Implementation — updates UI and app state
    return {
      content: [{ type: "text", text: `Stamp "${name}" added.` }]
    };
  }
}
```

## Invoking Tools

When connected as an agent, send a tool call by name with parameters matching `inputSchema`. The `execute` callback runs on the page's main thread, can update the UI, and returns a structured response:

```js
// Response format from execute():
{
  content: [
    { type: "text", text: "Result description" }
  ]
}
```

- Tools run sequentially on the main thread (one at a time).
- `execute` may be async (returns a Promise).
- The second parameter `agent` provides `agent.requestUserInteraction()` for user confirmation flows.

## User Interaction During Tool Execution

Tools can request user confirmation before sensitive actions:

```js
async function buyProduct({ product_id }, agent) {
  const confirmed = await agent.requestUserInteraction(async () => {
    return confirm(`Buy product ${product_id}?`);
  });
  if (!confirmed) throw new Error("Cancelled by user.");
  executePurchase(product_id);
  return { content: [{ type: "text", text: `Product ${product_id} purchased.` }] };
}
```

Always respect user denials — do not retry cancelled tool calls.

## Agent Workflow

1. Navigate to the target website.
2. Check `"modelContext" in window.navigator` to confirm WebMCP support.
3. Discover registered tools (names, descriptions, schemas).
4. Select the appropriate tool based on the user's goal and the tool description.
5. Invoke with correct parameters matching `inputSchema`.
6. Read the structured response and relay results to the user.
7. After SPA navigation or state changes, re-discover tools — the set may have changed.
8. If no WebMCP tool fits the task, fall back to DOM-based interaction.

## Important Constraints

- **Browser context required** — tools only exist in a live browsing context (tab/webview), not headlessly.
- **Sequential execution** — tool calls run one at a time on the main thread.
- **No cross-origin tool sharing** — tools are scoped to the page that registered them.
- **Permission-gated** — the browser may prompt the user before allowing tool access.
- **Tools are dynamic** — SPAs may register/unregister tools based on UI state.

Overview

This skill teaches agents how to discover, inspect, and invoke WebMCP tools exposed by web pages via window.navigator.modelContext. It replaces brittle DOM scraping and UI actuation with structured tool discovery and typed calls. Use it when browsing sites that expose machine-readable tools and when you need reliable, intent-driven interactions with page logic.

How this skill works

The skill checks for WebMCP support by testing for modelContext on window.navigator, then lists registered tools including name, description, and JSON Schema input. Agents select a tool that matches the user goal, call it with parameters that satisfy the tool's inputSchema, and handle the structured response returned by the page. Tools execute on the page main thread, may be async, and can request user interaction via the agent parameter.

When to use it

Interacting with websites that explicitly expose window.navigator.modelContext tools
Performing actions where DOM scraping is fragile or insufficient
When you need typed inputs and predictable structured responses
During automation where the page offers tool-level capabilities (e.g., submit, create, query)
When user confirmation flows are required and the tool can prompt the user

Best practices

Always check "modelContext" in window.navigator before attempting tool discovery
Validate parameters against the tool's JSON Schema before invoking execute
Re-discover tools after SPA navigation or state changes — registration can change
Respect user interaction requests and abort if the user denies confirmation
Fall back to DOM-based interaction if no suitable WebMCP tool exists

Example use cases

Listing available site tools and choosing the best one for a user command
Calling an on-page tool to create or update app state (e.g., add item, post comment)
Triggering purchase flows that require a user confirmation prompt via agent.requestUserInteraction
Invoking a query tool that returns structured content to present to the user
Switching from scraping to tool calls for more reliable automation on SPAs

FAQ

How do I detect whether a page supports WebMCP?

Check if "modelContext" exists on window.navigator; if absent, the page does not expose WebMCP tools.

Can I call multiple tools at the same time?

No. Tool execute callbacks run sequentially on the page main thread; calls should be made one at a time.