
This skill accelerates MCP automation by generating a single Python script that batches tool calls, reducing latency and token usage.

npx playbooks add skill inclusionai/aworld --skill code

Copy the command above to add this skill to your agents.
---
name: hypercode_forge
description: 🚀 HyperCode Forge - Competitive compression engine for MCP workflows
tool_list: {"terminal-server": ["execute_command"], "filesystem-server": ["read_file"], "ms-playwright": []}
active: False
---

## 🎯 What is HyperCode Forge?

HyperCode Forge is a pattern for optimizing MCP tool usage: **combine multiple MCP tool calls into a single Python script and complete them in one shot within the code execution environment**.


### 💡 Core Idea

Traditional approach (Direct Tool Calls):
```
LLM → Tool Call 1 → Result 1 → LLM → Tool Call 2 → Result 2 → LLM → ...
```
- ❌ Every tool invocation must pass through the LLM
- ❌ Intermediate results consume a large number of context tokens
- ❌ High round-trip latency and low efficiency

Code Mode approach:
```
LLM → Generate Python code → Execution environment completes all tool calls in one run → Return the final result
```
- ✅ Only one LLM interaction needed
- ✅ Intermediate results are handled in the execution environment, consuming no context
- ✅ Loops, conditionals, and other programming constructs are available
- ✅ Token usage drops by 98.7% (per Anthropic case study)


## 🧬 HyperCode Forge Pattern

HyperCode Forge distills the whole “write code to drive tools” mindset into a repeatable playbook that compresses the time, tokens, and human back-and-forth needed to win automation tasks.
To use this pattern, first collect complete information about each MCP tool call's parameters, then generate the code. For example, you can use the browser to inspect the page structure first, and then apply this pattern to fill in the form.

With HyperCode Forge, you can:
- **Compression mindset**: Collapse scattered tool invocations into a single script so the agent spends once, executes once, and delivers once.
- **Adaptive batching**: Use loops, conditionals, and local caching to process large datasets without round-tripping through the LLM.
- **Execution-native debugging**: Address intermediate issues inside the Python runtime, keeping noisy logs away from the LLM context window.
- **Strategic elasticity**: Scale from five tool calls to five hundred by changing loop parameters instead of rewriting prompts.
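The adaptive-batching idea above can be sketched in a few lines. This is a minimal illustration, not a real integration: `fetch_rows` is a hypothetical stub standing in for an MCP data-source call, and the point is that all iteration and filtering happen inside the script, so only the compact final result reaches the LLM.

```python
# Sketch of adaptive batching: loop over pages locally, filter locally,
# and return only a small summary. `fetch_rows` is a stub standing in
# for a real MCP tool call.

def fetch_rows(page: int) -> list[dict]:
    """Stub for an MCP data-source call; returns one page of records."""
    data = [
        [{"id": 1, "status": "open"}, {"id": 2, "status": "closed"}],
        [{"id": 3, "status": "open"}],
        [],  # an empty page signals the end of the data
    ]
    return data[page] if page < len(data) else []

def collect_open_records() -> list[int]:
    """Iterate through all pages in the runtime, keeping only open rows."""
    open_ids, page = [], 0
    while True:
        rows = fetch_rows(page)
        if not rows:
            break
        open_ids.extend(r["id"] for r in rows if r["status"] == "open")
        page += 1
    return open_ids

print(collect_open_records())  # → [1, 3]  (only this summary is returned)
```

Scaling from five rows to five thousand changes nothing in the prompt; only the data behind `fetch_rows` grows.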

While a competitor is still handholding each MCP call, HyperCode Forge has already sealed the outcome with a compact, reproducible code artifact.

## 📝 Suitable Scenarios

### ✅ Highly suitable scenarios

1. **Multi-step MCP tool invocations**
   - Require invoking several tools in sequence
   - Intermediate results are large (documents, datasets, etc.)
   - Example: Cross-system data synchronization, batch operations

2. **Data filtering and transformation**
   - Retrieve large volumes of data from a source
   - Need filtering, aggregation, or transformation
   - Example: Extracting rows meeting certain conditions from a 10,000-row spreadsheet

3. **Loops and branching logic**
   - Need to poll and wait for a status
   - Need to iterate through a list to perform operations
   - Example: Waiting for deployment completion notifications, batch updates to records

4. **Form filling and automation**
   - Need to populate multiple fields on the same page
   - Steps are clear and predictable
   - Example: Booking systems, registration forms
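The polling scenario above can be written as an ordinary loop inside the script, so status checks never round-trip through the LLM. In this hedged sketch, `get_deployment_status` is a hypothetical stub for whatever MCP tool reports the status; a real script would call that tool instead.

```python
# Sketch of the polling scenario: wait for a status inside the runtime.
# `get_deployment_status` is a stub standing in for a real MCP tool.
import time

_calls = {"n": 0}

def get_deployment_status() -> str:
    """Stub: reports "pending" twice, then "done"."""
    _calls["n"] += 1
    return "done" if _calls["n"] >= 3 else "pending"

def wait_for_deployment(poll_interval: float = 0.01, max_attempts: int = 30) -> str:
    """Poll until the deployment finishes or we give up."""
    for _ in range(max_attempts):
        status = get_deployment_status()
        if status == "done":
            return status
        time.sleep(poll_interval)
    raise TimeoutError("deployment did not finish in time")

print(wait_for_deployment())  # → done
```

The `max_attempts` bound keeps a stuck deployment from hanging the run forever, which matters when the whole workflow executes in one shot.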

### ❌ Unsuitable scenarios
- Need to adjust strategy in real time based on each step’s outcome
- Steps are highly uncertain
- Single, simple tool calls
- Open-ended search via Google, Baidu, etc.

## Notes
1. Code Mode supports Python scripts only; other languages are not supported.
2. Ensure the generated code is based on the latest data; for browser automation, use element refs from the most recent page snapshot, since stale refs will cause calls to fail.

## 📊 Efficiency Comparison

### Traditional approach example
```
Task: Read meeting notes from Google Drive and add them to Salesforce

Step 1: TOOL CALL gdrive.getDocument(documentId: "abc123")
        → Returns 50,000-token meeting notes (loaded into context)
        
Step 2: TOOL CALL salesforce.updateRecord(...)
        → Requires writing the 50,000 tokens back in
        
Total: ~150,000 tokens
```

### Code Mode approach
```python
# Generated code
import gdrive
import salesforce

# Processed in the execution environment without consuming LLM context
transcript = gdrive.getDocument(documentId="abc123")
salesforce.updateRecord(
    objectType="SalesMeeting",
    recordId="00Q5f000001abcXYZ",
    data={"Notes": transcript}
)
print("✅ Salesforce record updated")
```

Total: ~2,000 tokens (code only)
Savings: 98.7%

## How to call MCP (Playwright example)

```python
import asyncio

from aworld.sandbox import Sandbox

mcp_servers = ["ms-playwright"]
mcp_config = {
    "mcpServers": {
        "ms-playwright": {
            "command": "npx",
            "args": [
                "@playwright/mcp@latest",
                "--no-sandbox",
                "--cdp-endpoint=http://localhost:9222"
            ],
            "env": {
                "PLAYWRIGHT_TIMEOUT": "120000",
                "SESSION_REQUEST_CONNECT_TIMEOUT": "120",
            },
        }
    }
}

# Each step is (description, element label, snapshot ref). The refs come
# from a prior browser snapshot of the target page.
CLICK_STEPS = [
    ("international tab", "International · Hong Kong/Macau/Taiwan flights option (国际·港澳台机票选项)", "e92"),
    ("one-way option", "One-way option (单程选项)", "e327"),
    ("destination input", "Destination input field (目的地输入框)", "e344"),
    ("airport field", "Enter country/region/city/airport (输入国家/地区/城市/机场)", "e340"),
]


async def call_ctrip_flight():
    sandbox = Sandbox(
        mcp_servers=mcp_servers,
        mcp_config=mcp_config,
    )

    # Batch all clicks in one script run instead of one LLM round-trip each.
    for description, element, ref in CLICK_STEPS:
        result = await sandbox.mcpservers.call_tool([
            {
                "tool_name": "ms-playwright",
                "action_name": "browser_click",
                "params": {"element": element, "ref": ref},
            }
        ])
        print(f"browser_click {description} -> {result}")


if __name__ == '__main__':
    asyncio.run(call_ctrip_flight())
```

- **HTTP headers**: When generating Python network requests, add default headers such as `User-Agent` and timeouts to avoid server rejection.
  ```python
  import urllib.request

  url = "xxx"
  req = urllib.request.Request(
      url,
      headers={"User-Agent": "Mozilla/5.0"}
  )
  with urllib.request.urlopen(req, timeout=30) as resp:
      content = resp.read()
  ```

Overview

This skill implements the HyperCode Forge pattern: compress multiple MCP tool interactions into a single Python script that runs in the execution environment. It reduces LLM round-trips, dramatically lowers token usage, and centralizes logic, loops, and error handling in runnable code. Use it to orchestrate complex, repeatable multi-step MCP workflows with predictable outcomes.

How this skill works

The agent collects full MCP tool call parameters, generates a single Python script that performs all required tool calls, and executes that script in the code runtime. Intermediate results remain local to the runtime so they do not consume LLM context tokens. The pattern supports batching, branching, retries, and native debugging inside the execution environment.

When to use it

  • When a task requires multiple sequential or parallel MCP tool calls that produce large intermediate results.
  • When you need loops, conditional logic, polling, or local caching to process large datasets.
  • When you want to minimize token usage and reduce LLM interactions for large-scale automation.
  • When form-filling or web automation involves predictable, repeatable steps across many items.
  • When you need an auditable, reproducible code artifact that executes the whole workflow once.

Best practices

  • Collect complete tool parameters before generating code so the script runs without further LLM prompts.
  • Add sensible HTTP headers, timeouts, and error handling for network calls to avoid flaky runs.
  • Use local caching and batching to scale from a few to hundreds of calls by changing loop parameters.
  • Keep logs and intermediate debugging inside the runtime to preserve LLM context for outcomes only.
  • Validate outputs and return a concise final result rather than streaming verbose intermediate data.
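One way to follow the error-handling practice above is a small retry-with-backoff wrapper around each call. This is a hedged sketch: `flaky_call` is a hypothetical stub standing in for any network or MCP invocation, and the helper name `with_retries` is an illustration, not an aworld API.

```python
# Retry wrapper with exponential backoff, so one transient failure does
# not abort a whole batched run. `flaky_call` is a stub for any network
# or MCP invocation.
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.01):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))

_state = {"tries": 0}

def flaky_call() -> str:
    """Stub: fails twice, then succeeds."""
    _state["tries"] += 1
    if _state["tries"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(with_retries(flaky_call))  # → ok
```

Because the retries happen inside the runtime, the LLM never sees the transient failures, only the validated final result.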

Example use cases

  • Batch ETL: pull large datasets from a source, filter and transform rows, and push results to a target in one run.
  • Cross-system sync: read meeting notes from cloud storage and update CRM records without loading transcripts into the LLM.
  • Web automation: fill multi-field forms or perform batched bookings using Playwright MCP calls in a single script.
  • Polling workflows: wait for deployment statuses and perform follow-up actions using loop and sleep constructs.
  • Bulk updates: apply conditional updates across hundreds of records using adaptive batching and retries.

FAQ

Does this support languages other than Python?

No. Code Mode for this pattern supports Python scripts only; other languages are not supported.

When is HyperCode Forge not appropriate?

Avoid it when steps require human-in-the-loop decisions or frequent strategy changes based on each step’s outcome.