
examples-auto-run skill

/.codex/skills/examples-auto-run

This skill runs Python examples in auto mode with logging, rerun support, and background control to streamline multi-agent experiments.

npx playbooks add skill openai/openai-agents-python --skill examples-auto-run


Files (2)
SKILL.md
---
name: examples-auto-run
description: Run python examples in auto mode with logging, rerun helpers, and background control.
---

# examples-auto-run

## What it does

- Runs `uv run examples/run_examples.py` with:
  - `EXAMPLES_INTERACTIVE_MODE=auto` (auto-input/auto-approve).
  - Per-example logs under `.tmp/examples-start-logs/`.
  - Main summary log path passed via `--main-log` (also under `.tmp/examples-start-logs/`).
  - A rerun list of failures written to `.tmp/examples-rerun.txt` when `--write-rerun` is set.
- Provides start/stop/status/logs/tail/collect/rerun helpers via `run.sh`.
- Background option keeps the process running with a pidfile; `stop` cleans it up.
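The background option described above is the conventional nohup-plus-pidfile pattern. The sketch below is illustrative only: the pidfile path is hypothetical, and `sleep 30` stands in for `uv run examples/run_examples.py` so the sketch runs standalone.

```shell
# Sketch of background start/stop with a pidfile (assumed structure;
# the real run.sh may differ).
LOG_DIR=.tmp/examples-start-logs
PIDFILE=.tmp/examples-auto-run.pid   # hypothetical path
mkdir -p "$LOG_DIR"
STAMP=$(date +%Y%m%d-%H%M%S)

# `sleep 30` stands in for `uv run examples/run_examples.py`.
EXAMPLES_INTERACTIVE_MODE=auto \
  nohup sleep 30 > "$LOG_DIR/stdout_${STAMP}.log" 2>&1 &
echo $! > "$PIDFILE"

# `stop` then reduces to killing the recorded pid and removing the pidfile:
kill "$(cat "$PIDFILE")" 2>/dev/null
rm -f "$PIDFILE"
```

Because the pid is recorded at launch, `stop` and `status` can manage the job without searching the process table.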

## Usage

```bash
# Start (auto mode; interactive examples included by default)
.codex/skills/examples-auto-run/scripts/run.sh start [extra args to run_examples.py]
# Examples:
.codex/skills/examples-auto-run/scripts/run.sh start --filter basic
.codex/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio

# Check status
.codex/skills/examples-auto-run/scripts/run.sh status

# Stop running job
.codex/skills/examples-auto-run/scripts/run.sh stop

# List logs
.codex/skills/examples-auto-run/scripts/run.sh logs

# Tail latest log (or specify one)
.codex/skills/examples-auto-run/scripts/run.sh tail
.codex/skills/examples-auto-run/scripts/run.sh tail main_20260113-123000.log

# Collect rerun list from a main log (defaults to latest main_*.log)
.codex/skills/examples-auto-run/scripts/run.sh collect

# Rerun only failed entries from rerun file (auto mode)
.codex/skills/examples-auto-run/scripts/run.sh rerun
```

## Defaults (overridable via env)

- `EXAMPLES_INTERACTIVE_MODE=auto`
- `EXAMPLES_INCLUDE_INTERACTIVE=1`
- `EXAMPLES_INCLUDE_SERVER=0`
- `EXAMPLES_INCLUDE_AUDIO=0`
- `EXAMPLES_INCLUDE_EXTERNAL=0`
- Auto-approvals in auto mode: `APPLY_PATCH_AUTO_APPROVE=1`, `SHELL_AUTO_APPROVE=1`, `AUTO_APPROVE_MCP=1`
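Because these are ordinary environment variables, a one-off override can be prefixed to a single invocation without changing the defaults. A minimal demonstration of that shell behavior:

```shell
# A var prefixed to a command applies only to that command's environment:
EXAMPLES_INCLUDE_SERVER=1 sh -c 'echo "server=$EXAMPLES_INCLUDE_SERVER"'
# → server=1

# The calling shell still sees the default:
echo "server=${EXAMPLES_INCLUDE_SERVER:-0}"
# → server=0 (default)
```

The same prefix form works in front of `run.sh start` to flip any of the defaults above for one run.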

## Log locations

- Main logs: `.tmp/examples-start-logs/main_*.log`
- Per-example logs (from `run_examples.py`): `.tmp/examples-start-logs/<module_path>.log`
- Rerun list: `.tmp/examples-rerun.txt`
- Stdout logs: `.tmp/examples-start-logs/stdout_*.log`
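The helpers default to the newest `main_*.log`; the same "latest" selection can be done by modification time. A self-contained sketch (the two log names are fabricated fixtures so the snippet runs standalone):

```shell
# Fixture logs for illustration:
mkdir -p .tmp/examples-start-logs
touch .tmp/examples-start-logs/main_20260113-123000.log
sleep 1  # ensure distinct modification times
touch .tmp/examples-start-logs/main_20260113-123100.log

# ls -t sorts newest-first by mtime; take the first entry:
latest=$(ls -t .tmp/examples-start-logs/main_*.log | head -n 1)
echo "$latest"   # → .tmp/examples-start-logs/main_20260113-123100.log
```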

## Notes

- The runner delegates to `uv run examples/run_examples.py`, which already writes per-example logs and supports `--collect`, `--rerun-file`, and `--print-auto-skip`.
- `start` uses `--write-rerun` so failures are captured automatically.
- If `.tmp/examples-rerun.txt` exists and is non-empty, invoking the skill with no args runs `rerun` by default.
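That no-args dispatch reduces to POSIX's non-empty-file test; a minimal sketch of the decision (the real script's logic may differ):

```shell
RERUN_FILE=.tmp/examples-rerun.txt

# `-s` is true only if the file exists AND has size greater than zero:
if [ -s "$RERUN_FILE" ]; then
  cmd=rerun
else
  cmd=start
fi
echo "$cmd"
```

An empty rerun file (all failures cleared) therefore falls through to a normal `start`.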

## Behavioral validation (Codex/LLM responsibility)

The runner does not perform any automated behavioral validation. After every foreground `start` or `rerun`, **Codex must manually validate** all exit-0 entries:

1. Read the example source (and comments) to infer intended flow, tools used, and expected key outputs.
2. Open the matching per-example log under `.tmp/examples-start-logs/`.
3. Confirm the intended actions/results occurred; flag omissions or divergences.
4. Do this for **all passed examples**, not just a sample.
5. Report immediately after the run with concise citations to the exact log lines that justify the validation.
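A small helper can speed up steps 2 and 3 by surfacing lines worth citing. The grep pattern below is illustrative, not a substitute for reading each log, and the fixture log is fabricated so the sketch runs standalone:

```shell
# Fixture per-example log for illustration:
mkdir -p .tmp/examples-start-logs
printf 'tool call ok\nfinal answer: 42\n' \
  > .tmp/examples-start-logs/basic.hello_world.log

for log in .tmp/examples-start-logs/*.log; do
  case "$log" in */main_*|*/stdout_*) continue ;; esac  # skip summary/stdout logs
  echo "== $log =="
  # Surface obviously suspicious lines, with line numbers for citation:
  grep -nEi 'error|exception|traceback|skip' "$log" || echo "(no obvious anomalies)"
done
```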

Overview

This skill runs Python examples in an automated interactive mode with integrated logging, rerun support, and background process control. It wraps uv run examples/run_examples.py with sensible defaults to auto-approve interactive prompts and capture per-example and summary logs. Helpers are provided to start, stop, inspect, tail, collect rerun lists, and rerun failures. Background mode uses a pidfile so the process can be managed cleanly.

How this skill works

The skill invokes uv run examples/run_examples.py with EXAMPLES_INTERACTIVE_MODE=auto and environment defaults that enable auto-approvals. It writes a main summary log and per-example logs under .tmp/examples-start-logs/, and can generate a failures rerun list at .tmp/examples-rerun.txt when --write-rerun is enabled. A shell helper script provides start/stop/status/logs/tail/collect/rerun commands and supports running the job in the background with pidfile management.
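The status check implied here is the standard pidfile liveness probe. A sketch, assuming a pidfile at a path like .tmp/examples-auto-run.pid (the actual path is internal to the script):

```shell
PIDFILE=.tmp/examples-auto-run.pid   # assumed path

# kill -0 sends no signal; it only tests whether the process exists
# and is signalable by us.
if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
  status=running
else
  status="not running"
fi
echo "$status"
```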

When to use it

  • Run the full example suite non-interactively while capturing detailed logs.
  • Automatically collect a list of failed examples for targeted reruns.
  • Keep long-running example runs in the background and check status later.
  • Tail or inspect specific per-example logs after a run to debug failures.
  • Run only failed examples quickly using the generated rerun file.

Best practices

  • Start runs via the provided run.sh helper to ensure logs and rerun files are created consistently.
  • Check .tmp/examples-start-logs/main_*.log first for a summary, then open per-example logs for detail.
  • Use --filter or include flags to limit runs to relevant subsets when iterating.
  • After foreground start or rerun, manually validate all exit-0 examples by reading source and corresponding logs.
  • Archive .tmp/examples-rerun.txt between iterations if you need persistent rerun lists; .tmp directories are typically gitignored and the file is regenerated by later runs.

Example use cases

  • Nightly automated runs that auto-approve prompts and produce a summary log for review.
  • Local debugging: run a filtered subset and tail the per-example log for quick iteration.
  • CI step: run in background on a builder machine and collect logs later for artifact storage.
  • Failure triage: collect a rerun list, rerun only failures, and inspect per-example logs to diagnose regressions.
  • Regression verification: after code changes, run all examples and manually validate pass entries against logs.

FAQ

Where are the logs and rerun file stored?

Main and per-example logs go under .tmp/examples-start-logs/, stdout logs are named stdout_*.log, and the rerun list is .tmp/examples-rerun.txt.

How do I rerun only failures?

Use the helper: .codex/skills/examples-auto-run/scripts/run.sh rerun. The rerun file is written automatically (start passes --write-rerun), or can be produced from an existing main log via the collect helper.
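For illustration, the collect step amounts to filtering failure entries out of a main log into the rerun file. The `FAIL <path>` line format below is an assumption made for the sketch; `run_examples.py --collect` is the authoritative implementation, and the log contents are fabricated fixtures:

```shell
# Fixture main log (assumed `PASS/FAIL <path>` format, for illustration only):
mkdir -p .tmp/examples-start-logs
main=.tmp/examples-start-logs/main_demo.log
printf 'PASS examples/basic/hello_world.py\nFAIL examples/tools/file_search.py\n' > "$main"

# Keep only the failed module paths:
awk '/^FAIL/ {print $2}' "$main" > .tmp/examples-rerun.txt
cat .tmp/examples-rerun.txt   # → examples/tools/file_search.py
```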