
pythonista-async skill


This skill helps you write robust async code with safe_gather, proper timeouts, and clean cancellation for concurrent tasks.

npx playbooks add skill gigaverse-app/skillet --skill pythonista-async


SKILL.md
---
name: pythonista-async
description: Use when writing async code, using asyncio.gather, handling concurrent operations. Triggers on "asyncio", "async", "await", "gather", "concurrent", "parallel", "fail-fast", "timeout", "race condition", "cancellation", "coroutine", "task", "CancelledError", or when writing async functions.
---

# Async Patterns

## Core Rules

1. **Prefer `safe_gather`** for all new async code - fail-fast, timeout support, cleaner cancellation
2. **Use `return_exceptions=True`** for partial-results patterns
3. **Always consider timeout** for long-running operations
4. **Don't migrate** existing cleanup code (low priority, both work fine)

## safe_gather vs asyncio.gather

**`safe_gather` is better in ALL cases** because it provides:
- Fail-fast cancellation (when not using `return_exceptions=True`)
- Timeout support with automatic cleanup
- Cleaner cancellation handling

See [references/safe-gather.md](references/safe-gather.md) for implementation.
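For orientation, here is a minimal sketch of what a `safe_gather`-style helper can look like. The canonical implementation lives in references/safe-gather.md and may differ in details; this sketch only illustrates the fail-fast and timeout semantics described above.

```python
import asyncio


async def safe_gather(*aws, return_exceptions=False, timeout=None):
    """Sketch of a safe_gather-style helper (not the canonical version).

    Unlike plain asyncio.gather, a failure or timeout cancels the
    remaining tasks and waits for their cancellation to finish before
    re-raising, so no sibling tasks are left running in the background.
    """
    tasks = [asyncio.ensure_future(aw) for aw in aws]
    gathered = asyncio.gather(*tasks, return_exceptions=return_exceptions)
    try:
        if timeout is not None:
            return await asyncio.wait_for(gathered, timeout)
        return await gathered
    except (Exception, asyncio.CancelledError):
        # Fail-fast or timeout: cancel whatever is still running.
        for task in tasks:
            task.cancel()
        # Wait for cancellation to complete before propagating the error.
        await asyncio.gather(*tasks, return_exceptions=True)
        raise
```

Note that `asyncio.wait_for` cancels the gather on timeout and raises `asyncio.TimeoutError`, so the timeout path goes through the same cleanup branch as an ordinary failure.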

## Pattern: All Tasks Must Succeed (fail-fast)

```python
# Initialization - all workers must start
await safe_gather(*[worker.pre_run() for worker in workers])

# Data fetching - need all pieces
channel_info, front_row, participants = await safe_gather(
    fetch_channel_info(channel_id),
    fetch_front_row(channel_id),
    fetch_participants(channel_id),
)

# With timeout
results = await safe_gather(*tasks, timeout=30.0)
```

## Pattern: Partial Results Acceptable

```python
# Use safe_gather with return_exceptions=True
results = await safe_gather(*batch_tasks, return_exceptions=True)
for result in results:
    if isinstance(result, Exception):
        logger.error(f"Task failed: {result}")
    else:
        process(result)

# With timeout
results = await safe_gather(*batch_tasks, return_exceptions=True, timeout=30.0)
```

## Pattern: Cleanup/Shutdown

```python
# With timeout for cleanup (don't wait forever)
await safe_gather(
    service1.shutdown(),
    service2.shutdown(),
    return_exceptions=True,
    timeout=10.0
)

# OK to keep existing asyncio.gather for cleanup
await asyncio.gather(*cancelled_tasks, return_exceptions=True)
```

## Migration Decision Tree

```
Is this new code?
├─ Yes -> Use safe_gather
└─ No (existing code)
   └─ Is it cleanup with return_exceptions=True?
      ├─ Yes -> Keep asyncio.gather (optional to migrate)
      └─ No -> Would fail-fast or timeout help?
         ├─ Yes -> Migrate to safe_gather
         └─ No -> Low priority
```

## Key Principles

1. **Fail-fast by default**: If one task fails, cancel the rest
2. **Always consider timeout**: Long-running operations need timeouts
3. **Clean cancellation**: Handle CancelledError properly
4. **Don't wait forever**: Especially for cleanup/shutdown
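Principle 3 in practice: a coroutine that owns resources should catch `CancelledError`, do bounded cleanup, and re-raise so the caller still observes the cancellation. A small sketch with a hypothetical worker:

```python
import asyncio


async def worker(log):
    """Hypothetical worker that cleans up on cancellation."""
    try:
        while True:
            await asyncio.sleep(0.01)
            log.append("tick")
    except asyncio.CancelledError:
        # Do bounded cleanup here (close connections, flush buffers),
        # then re-raise so the cancellation propagates to the caller.
        log.append("cleaned up")
        raise


async def main():
    log = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log


log = asyncio.run(main())
```

Swallowing `CancelledError` (catching it without re-raising) makes tasks uncancellable and is a common cause of hung shutdowns.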

## Reference Files

- [references/safe-gather.md](references/safe-gather.md) - safe_gather implementation

## Related Skills

- [/pythonista-testing](../pythonista-testing/SKILL.md) - Testing async code
- [/pythonista-typing](../pythonista-typing/SKILL.md) - Async type annotations
- [/pythonista-debugging](../pythonista-debugging/SKILL.md) - Debugging async issues

Overview

This skill helps you write robust asyncio code by promoting safe concurrency patterns and a drop-in safe_gather helper. It focuses on fail-fast behavior, timeout handling, and clean cancellation to avoid common pitfalls like leaked tasks and hung shutdowns. Use it when writing new async code or deciding whether to migrate existing gather usage.

How this skill works

The skill recommends using safe_gather instead of asyncio.gather for new code because safe_gather supports fail-fast cancellation, timeouts, and cleaner cancellation semantics. For partial-results workflows it shows using return_exceptions=True. It also provides explicit patterns for initialization, parallel data fetching, batch processing, and cleanup/shutdown with concrete timeout guidance.
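To see why fail-fast matters: plain asyncio.gather (without return_exceptions) raises the first exception but leaves sibling tasks running in the background, which is exactly the leak safe_gather is meant to close. A small demonstration:

```python
import asyncio


async def slow():
    await asyncio.sleep(10)


async def boom():
    raise RuntimeError("fail")


async def main():
    tasks = [asyncio.create_task(slow()), asyncio.create_task(boom())]
    try:
        # No return_exceptions: the first failure propagates immediately...
        await asyncio.gather(*tasks)
    except RuntimeError:
        pass
    # ...but the slow task is still running: a leaked task.
    leaked = not tasks[0].done()
    tasks[0].cancel()  # with plain gather, cleanup is manual
    return leaked


leaked = asyncio.run(main())
```

A safe_gather-style helper performs that cancellation and the subsequent wait automatically.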

When to use it

  • When composing multiple coroutines that should cancel on first failure (fail-fast).
  • When running batches where some results may fail and you want partial results (use return_exceptions=True).
  • When adding timeouts to long-running or network-bound concurrent operations.
  • When implementing shutdown/cleanup tasks and you need bounded wait time.
  • When deciding whether to migrate existing asyncio.gather usage.

Best practices

  • Prefer safe_gather for all new async code to get fail-fast, timeout, and cleanup benefits.
  • Use return_exceptions=True when you need partial results and will handle individual failures.
  • Always consider and set a sensible timeout for long-running operations, including shutdown.
  • Handle CancelledError explicitly in coroutines that perform cleanup to ensure resources are released.
  • Keep existing asyncio.gather for legacy cleanup code only if it already uses return_exceptions=True.

Example use cases

  • Start multiple worker services and require all to initialize successfully before proceeding (fail-fast).
  • Fetch several independent pieces of data in parallel and combine them, cancelling all if one fetch fails.
  • Process a batch of user requests where some tasks may fail but you still want to collect successes (return_exceptions=True).
  • Run shutdown hooks for multiple services with an overall timeout to avoid blocking process exit.
  • Migrate critical concurrent operations to safe_gather while leaving low-priority cleanup code unchanged.

FAQ

When should I still use asyncio.gather?

Only keep asyncio.gather for existing cleanup code that already uses return_exceptions=True and where migration is low priority. For new code prefer safe_gather.

How do I handle timeouts with safe_gather?

Pass a timeout value to safe_gather; it will cancel outstanding tasks and perform automatic cleanup so you won't hang waiting indefinitely.

What if I want partial results but also timeouts?

Use safe_gather(..., return_exceptions=True, timeout=...) and inspect results for Exception instances to log or retry individual failures.