This skill helps you implement robust error handling with validation, recovery, and clear feedback to improve system stability.
npx playbooks add skill dasien/claudemultiagenttemplate --skill error-handling
---
name: "Error Handling Strategies"
description: "Implement robust error handling with proper validation, recovery mechanisms, and clear feedback for system stability"
category: "implementation"
required_tools: ["Read", "Write", "Edit", "Grep"]
---
# Error Handling Strategies
## Purpose
Implement robust error handling that gracefully manages failures, provides clear feedback, and maintains system stability.
## When to Use
- Writing any code that can fail
- Handling external API calls
- Processing user input
- Managing file I/O operations
- Dealing with network requests
## Key Capabilities
1. **Error Detection** - Identify potential failure points
2. **Error Recovery** - Implement fallback strategies
3. **Error Communication** - Provide clear, actionable messages
## Approach
1. Identify what can go wrong (invalid input, network failure, etc.)
2. Validate inputs before processing
3. Use try-catch or error returns appropriately
4. Provide context in error messages
5. Log errors with sufficient debugging information
6. Implement retry logic for transient failures
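The validation, context, and logging steps above can be sketched as follows. This is a minimal illustration, not part of the skill itself: `ValidationError`, `parse_port`, `load_port`, and the default port `8080` are hypothetical names and values chosen for the example.

```python
import logging

logger = logging.getLogger(__name__)


class ValidationError(ValueError):
    """Hypothetical error type for rejected input."""


def parse_port(raw):
    """Validate a user-supplied port before using it."""
    try:
        port = int(raw)
    except (TypeError, ValueError) as exc:
        # Include the offending value so the message is actionable.
        raise ValidationError(f"Port must be an integer, got {raw!r}") from exc
    if not 1 <= port <= 65535:
        raise ValidationError(f"Port must be between 1 and 65535, got {port}")
    return port


def load_port(settings):
    """Return the configured port, falling back to a default on bad input."""
    try:
        return parse_port(settings.get("port"))
    except ValidationError as exc:
        # Log with context, then recover with an assumed default of 8080.
        logger.warning("Invalid port in settings, using 8080: %s", exc)
        return 8080
```

Validation lives in one function that raises a specific error, and the caller decides whether to recover or propagate. That keeps the "what went wrong" message separate from the "what to do about it" policy.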
## Example
**Context**: File reading operation
````python
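import json
import logging

logger = logging.getLogger(__name__)

class ConfigurationError(Exception):
    """Config exists but is unusable (placeholder; not defined in the original)."""

def get_default_config():
    return {}  # Placeholder defaults; not defined in the original example.

def read_config(filepath):
    try:
        with open(filepath, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        # A missing file is recoverable: fall back to defaults.
        logger.error(f"Config file not found: {filepath}")
        return get_default_config()
    except json.JSONDecodeError as e:
        logger.error(f"Invalid JSON in {filepath}: {e}")
        raise ConfigurationError(f"Config file is malformed at line {e.lineno}") from e
    except PermissionError as e:
        logger.error(f"Cannot read {filepath}: Permission denied")
        raise ConfigurationError(f"Insufficient permissions for {filepath}") from e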
````
## Best Practices
- ✅ Fail fast for programming errors
- ✅ Recover gracefully from external failures
- ✅ Include context in error messages
- ❌ Avoid: Silent failures or generic error messages

This skill teaches practical error handling strategies to improve reliability and observability in multi-agent systems. It focuses on validation, recovery mechanisms, and clear feedback so failures are managed predictably. The goal is to maintain system stability while giving developers and users actionable information.
The skill inspects code paths for likely failure points, enforces input validation, and applies structured exception handling or error returns. It integrates logging, contextual error messages, and retry or fallback logic for transient faults. For programming errors it recommends fail-fast behavior; for external failures it prescribes graceful recovery and clear user-facing feedback.
**How do I choose between retrying and failing immediately?**
Retry transient failures (network blips, timeouts) with limited attempts and backoff; fail fast for deterministic programming errors or invalid inputs.
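One way to express this rule in code is a small retry helper that retries only a designated transient error type and lets everything else propagate immediately. This is a sketch under assumptions: `TransientError`, `retry`, and the default attempt count and delay are hypothetical, and real code would map library-specific timeout or connection errors onto the retryable type.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def retry(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying only on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last failure.
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Any exception other than `TransientError` (for example a `ValueError` from invalid input) is not caught, so it fails on the first attempt. The `sleep` parameter exists only so tests can substitute a stub instead of waiting.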
**What information should I include in logs without exposing secrets?**
Include operation name, input identifiers (not raw secrets), timestamps, stack traces, and correlation IDs; redact or omit sensitive fields.
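A minimal sketch of such a log call, using the standard `logging` module, might look like this. The `SENSITIVE_KEYS` set, `redact`, and `log_failure` are hypothetical names for illustration. Timestamps are assumed to come from the logging formatter (for example `%(asctime)s`) rather than from the message itself.

```python
import logging
import uuid

logger = logging.getLogger(__name__)

# Hypothetical set of field names treated as sensitive; adjust to your data.
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization"}


def redact(fields):
    """Return a copy of fields with sensitive values masked."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in fields.items()
    }


def log_failure(operation, fields, correlation_id=None):
    """Log the exception being handled with operation name, safe fields, and a correlation ID."""
    correlation_id = correlation_id or uuid.uuid4().hex
    # logger.exception logs at ERROR level and appends the current stack trace.
    logger.exception(
        "operation=%s correlation_id=%s fields=%s",
        operation, correlation_id, redact(fields),
    )
    return correlation_id
```

`log_failure` is meant to be called from inside an `except` block, so that `logger.exception` has an active exception to record. Returning the correlation ID lets the caller include it in a user-facing message without exposing internal details.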