
This skill executes Python code in the OSWorld environment and returns execution results including status, duration, and I/O streams.

npx playbooks add skill bdambrosio/cognitive_workbench --skill osworld-execute

---
name: osworld-execute
type: python
description: "Execute Python code (typically pyautogui commands) in the OSWorld environment. Returns execution result with success status, return code, duration, stdout, and stderr."
schema_hint:
  value: "string (Python code)"
  python: "string (Python code, alternative to value)"
  return_observation: "bool (default: false)"
  out: "$variable"
examples:
  - '{"type":"osworld-execute","python":"pyautogui.click(100,200)","out":"$result"}'
  - '{"type":"osworld-execute","value":"pyautogui.typewrite(\"hello\")","return_observation":true,"out":"$result"}'
---

# OSWorld Execute Tool (Level 4)

## Input
- `python` or `value`: string (required) - Python code to execute, typically pyautogui commands; `value` is accepted as an alternative to `python`
- `return_observation`: bool (default: false) - include an observation in the response
- `out`: variable that receives the resulting Note ID (for example `$result`)

## Output
- Note ID (bound to `out` variable) containing:
  - `text`: formatted execution result
  - `format`: "text"
  - `metadata`: execution data including:
    - `success`: boolean - whether execution succeeded
    - `returncode`: integer - return code (0 = success)
    - `duration_ms`: integer - execution duration in milliseconds
    - `step_counter`: integer - step counter after execution
    - `stdout`: string - standard output
    - `stderr`: string - standard error
    - `python_code`: string - the executed code
    - `observation`: dict (if return_observation=true) - observation after execution
    - `timestamp`: float (if return_observation=true) - observation timestamp
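
As a rough illustration of how a caller might read these fields, the sketch below assumes the note is available as a Python dict shaped like the list above. The helper name `summarize_execution` and the sample values are hypothetical and not part of the skill.

```python
def summarize_execution(note):
    """Build a one-line summary from an osworld-execute result note (assumed to be a dict)."""
    meta = note.get("metadata", {})
    # Treat the run as successful only if both indicators agree.
    ok = bool(meta.get("success")) and meta.get("returncode") == 0
    line = (
        f"{'ok' if ok else 'failed'} in {meta.get('duration_ms', '?')} ms "
        f"(step {meta.get('step_counter', '?')})"
    )
    stderr = (meta.get("stderr") or "").strip()
    if stderr:
        # The last stderr line usually carries the exception message.
        line += f": {stderr.splitlines()[-1]}"
    return line


# Hypothetical note, made up for illustration only.
example_note = {
    "text": "Execution succeeded",
    "format": "text",
    "metadata": {
        "success": True,
        "returncode": 0,
        "duration_ms": 42,
        "step_counter": 3,
        "stdout": "",
        "stderr": "",
        "python_code": "pyautogui.click(100,200)",
    },
}

print(summarize_execution(example_note))
```

Checking `success` and `returncode` together is a conservative choice; the source does not say whether the two can ever disagree.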

## Configuration
- `OSWORLD_URL` environment variable (defaults to `http://localhost:3002`)
- Or pass `osworld_url` in character config's `osworld_config` section
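
The two options above could be resolved along these lines. This is a minimal sketch, not the skill's actual code: the function name `resolve_osworld_url` is hypothetical, and the precedence (character config first, then the environment variable, then the default) is an assumption, since the source does not state which setting wins.

```python
import os

DEFAULT_OSWORLD_URL = "http://localhost:3002"


def resolve_osworld_url(character_config=None, env=None):
    """Pick the OSWorld URL from config, environment, or the documented default."""
    env = os.environ if env is None else env
    osworld_config = (character_config or {}).get("osworld_config") or {}
    # ASSUMPTION: config overrides the environment variable; the real order may differ.
    return (
        osworld_config.get("osworld_url")
        or env.get("OSWORLD_URL")
        or DEFAULT_OSWORLD_URL
    )


print(resolve_osworld_url())
```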

## Common Workflow
```json
{"type":"osworld-observe","out":"$obs"}
{"type":"osworld-execute","python":"pyautogui.click(100,200)","out":"$result"}
{"type":"osworld-observe","out":"$obs2"}
```
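
When the Python snippet contains quotes, hand-writing the JSON becomes error-prone. One way to generate the same observe/execute/observe sequence is to let `json.dumps` handle the escaping. The helper names `execute_action` and `observe_action` are hypothetical; only the action fields come from the schema above.

```python
import json


def execute_action(code, out="$result", return_observation=False):
    """Serialize an osworld-execute action; json.dumps escapes quotes in the code."""
    action = {"type": "osworld-execute", "python": code, "out": out}
    if return_observation:
        action["return_observation"] = True
    return json.dumps(action, separators=(",", ":"))


def observe_action(out):
    """Serialize an osworld-observe action bound to the given variable."""
    return json.dumps({"type": "osworld-observe", "out": out}, separators=(",", ":"))


workflow = [
    observe_action("$obs"),
    execute_action('pyautogui.typewrite("hello")'),
    observe_action("$obs2"),
]
print("\n".join(workflow))
```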

## Notes
- Python code is executed directly in the OSWorld environment
- Common commands: `pyautogui.click(x, y)`, `pyautogui.typewrite(text)`, `pyautogui.press(key)`
- No retries or corrections - Jill owns error handling
- Execution is synchronous and blocking
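
Because the tool itself does not retry, any retry policy belongs to the caller. The sketch below shows one possible shape for that, assuming a hypothetical `execute` callable that takes a code string and returns the result metadata as a dict. How the skill is actually invoked from Python is not specified in the source.

```python
import time


def run_with_retries(execute, code, attempts=3, delay_s=0.0):
    """Call execute(code) until it reports success or attempts run out.

    `execute` is a hypothetical stand-in for whatever invokes osworld-execute
    and returns its metadata dict.
    """
    last = None
    for attempt in range(attempts):
        last = execute(code)
        if last.get("success") and last.get("returncode") == 0:
            break
        if attempt < attempts - 1:
            time.sleep(delay_s)  # give the UI a moment before trying again
    return last
```

GUI actions are generally not idempotent (a repeated click may toggle a control back), so a retry like this is only reasonable for snippets that are safe to run twice.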

Overview

This skill executes Python code inside an OSWorld runtime, typically to run pyautogui commands that control a virtual desktop. It returns a structured execution result that includes success status, return code, duration, stdout, and stderr. Use it when an agent needs synchronous, blocking control over the OSWorld environment, so that each step finishes before the next one starts.

How this skill works

You send a Python code string (or use the value parameter) to be executed directly in the OSWorld environment. The tool runs the code synchronously and returns a note containing execution metadata: success boolean, return code, duration in ms, stdout, stderr, and the executed Python code. Optionally it can include an observation snapshot and timestamp when return_observation is true.

When to use it

  • Trigger GUI actions in OSWorld using pyautogui (click, typewrite, press).
  • Run short, deterministic Python snippets that must complete before the next step.
  • Collect precise execution diagnostics (stdout, stderr, return code, duration).
  • Verify side effects in the environment by pairing execute with observe.
  • Automate UI test steps or scripted interactions in the OSWorld instance.

Best practices

  • Keep code snippets small and focused; execution is synchronous and blocking.
  • Capture and handle errors in the calling agent; this tool does not perform retries or corrections.
  • Use return_observation=true when you need the post-execution environment snapshot.
  • Log or persist stdout/stderr and return code for debugging complex flows.
  • Prefer explicit pyautogui timing and waits to avoid race conditions in the UI.
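
To illustrate the point about explicit timing, the following sketch builds a small snippet string that clicks, waits, and then types with a per-key interval. The helper `build_click_and_type` and the default delay are hypothetical. The snippet imports `pyautogui` itself because the source does not say whether the module is pre-imported in the OSWorld environment.

```python
import ast


def build_click_and_type(x, y, text, settle_seconds=0.5):
    """Return Python source that clicks, pauses, then types with a per-key delay."""
    lines = [
        "import time",
        "import pyautogui",
        f"pyautogui.click({int(x)}, {int(y)})",
        # Explicit pause so the field can take focus before typing starts.
        f"time.sleep({float(settle_seconds)})",
        f"pyautogui.typewrite({text!r}, interval=0.05)",
    ]
    code = "\n".join(lines)
    ast.parse(code)  # fail early on a syntax error instead of inside OSWorld
    return code


snippet = build_click_and_type(100, 200, "hello")
print(snippet)
```

The resulting string can be passed as the `python` parameter. Parsing it locally only checks syntax; it does not run anything or confirm that the click landed.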

Example use cases

  • Click a button and then observe the screen to confirm the expected UI change.
  • Type a string into a text field: pyautogui.click(x,y); pyautogui.typewrite('text').
  • Press a sequence of keys to navigate menus and capture the resulting stdout.
  • Run a short diagnostic script, get duration_ms and stderr to debug flakiness.
  • Automate a test step that must finish before the agent continues.

FAQ

How do I provide the Python code?

Pass the code as the python parameter or use value as an alias; the string is executed directly in OSWorld.

Can I get the environment observation after execution?

Yes. Set return_observation to true to include observation and its timestamp in the result metadata.