/frontend/.github-skills/error-recovery-and-resilience
This skill helps you build resilient UI with layered error boundaries, retries, and fallback orchestration to improve user experience.
To add this skill to your agents, run:
npx playbooks add skill harborgrid-justin/lexiflow-premium --skill error-recovery-and-resilience
---
name: error-recovery-and-resilience
description: Engineer resilient UI systems with layered error boundaries, retries, and fallback orchestration.
---
# Error Recovery and Resilience (React 18)
## Summary
Engineer resilient UI systems with layered error boundaries, retries, and fallback orchestration.
## Key Capabilities
- Define error containment strategies for nested UI regions.
- Implement retry policies with exponential backoff and jitter.
- Integrate error telemetry and automated recovery flows.
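
The retry capability above could be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the names `RetryOptions`, `backoffDelayMs`, and `retryWithBackoff` are hypothetical, and the "full jitter" variant (a uniform random delay below an exponentially growing ceiling) is one assumed choice among several jitter strategies.

```typescript
// Hypothetical names throughout; the source does not prescribe an API.
interface RetryOptions {
  maxAttempts: number; // total tries including the first (assumed >= 1)
  baseDelayMs: number; // delay ceiling before the first retry
  maxDelayMs: number;  // upper bound on any single delay ceiling
}

// Full jitter: a uniform delay in [0, min(maxDelayMs, baseDelayMs * 2^retryIndex)).
function backoffDelayMs(
  retryIndex: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** retryIndex);
  return Math.floor(random() * ceiling);
}

// Runs `task` until it succeeds or maxAttempts is reached, sleeping between tries.
// `sleep` and `random` are injectable so the policy can be tested deterministically.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  random: () => number = Math.random,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      const isLastAttempt = attempt === opts.maxAttempts - 1;
      if (!isLastAttempt) await sleep(backoffDelayMs(attempt, opts, random));
    }
  }
  throw lastError; // every attempt failed; surface the most recent error
}
```

Capping the ceiling with `maxDelayMs` keeps late retries from waiting unboundedly, and the random component spreads out clients that failed at the same moment so they do not all retry together.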
## PhD-Level Challenges
- Prove containment of error cascades across boundaries.
- Model user impact of fallback UI pathways.
- Evaluate resilience improvements with chaos testing.
## Acceptance Criteria
- Demonstrate isolated error recovery without full app reload.
- Provide telemetry of error boundaries and recovery paths.
- Include chaos-test results and mitigation strategies.
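
One way to produce chaos-test results is to inject faults into a data source on purpose and compare how often the user-visible call still fails with and without a recovery layer. The sketch below assumes that approach; `withInjectedFaults` and `measureFailureRate` are hypothetical helpers, not part of any library.

```typescript
// Hypothetical helper: wraps an async task so that a fraction of calls fail on purpose.
function withInjectedFaults<T>(
  task: () => Promise<T>,
  failureRate: number, // 0 = never fail, 1 = always fail
  random: () => number = Math.random,
): () => Promise<T> {
  return async () => {
    if (random() < failureRate) throw new Error("chaos: injected fault");
    return task();
  };
}

// Hypothetical helper: runs `run` repeatedly and returns the fraction of
// trials in which the caller still observed a failure.
async function measureFailureRate(
  run: () => Promise<unknown>,
  trials: number,
): Promise<number> {
  let failures = 0;
  for (let i = 0; i < trials; i++) {
    try {
      await run();
    } catch {
      failures++;
    }
  }
  return failures / trials;
}
```

Measuring the same faulty task once directly and once behind the recovery logic gives a before-and-after failure rate that can be reported alongside the mitigation being evaluated.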
This skill teaches how to engineer resilient UI systems using layered error boundaries, retry policies, and fallback orchestration. It focuses on isolating failures so user-facing regions can recover independently without a full app reload. The guidance covers telemetry integration and chaos-testing strategies to validate improvements.
The skill inspects UI component hierarchies and recommends placement of nested error boundaries to contain failures. It defines retry policies (exponential backoff with jitter) and coordinates fallback UIs and automated recovery flows. Telemetry hooks record boundary activations, retries, and final outcomes to drive further tuning and alerting.
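The telemetry hooks described above might record three kinds of events: boundary activations, retries, and final outcomes. The following is a minimal sketch under that assumption; the event shapes, the `RecoveryTelemetry` class, and the `sink` callback are hypothetical. In a React 18 class-based error boundary, `componentDidCatch` would be a natural place to call `record` for activations.

```typescript
// Hypothetical event shapes; the source names the categories but not the fields.
type RecoveryEvent =
  | { kind: "boundary_activated"; region: string; message: string }
  | { kind: "retry"; region: string; attempt: number }
  | { kind: "outcome"; region: string; recovered: boolean };

interface RegionSummary {
  activations: number;
  retries: number;
  recovered: number;
  unrecovered: number;
}

class RecoveryTelemetry {
  private readonly log: RecoveryEvent[] = [];
  private readonly sink: (e: RecoveryEvent) => void;

  // `sink` forwards each event elsewhere (for example to an alerting backend);
  // it defaults to a no-op so the recorder also works standalone.
  constructor(sink: (e: RecoveryEvent) => void = () => {}) {
    this.sink = sink;
  }

  record(e: RecoveryEvent): void {
    this.log.push(e);
    this.sink(e);
  }

  // Aggregates the recorded events for one UI region.
  summarize(region: string): RegionSummary {
    const summary: RegionSummary = { activations: 0, retries: 0, recovered: 0, unrecovered: 0 };
    for (const e of this.log) {
      if (e.region !== region) continue;
      if (e.kind === "boundary_activated") summary.activations++;
      else if (e.kind === "retry") summary.retries++;
      else if (e.recovered) summary.recovered++;
      else summary.unrecovered++;
    }
    return summary;
  }
}
```

A per-region summary like this is one possible input for tuning retry limits or deciding where alerting thresholds should sit.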
How granular should error boundaries be?
Start with boundaries around independently meaningful UI regions (widgets, panels) and adjust granularity based on failure patterns and telemetry.
When should I prefer fallback UI over automatic retry?
Use retries for transient failures that are likely to clear quickly; show a fallback UI once the retry limit is reached or when user action is required.
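The choice between retrying and showing a fallback can be modelled as a small state machine for one UI region. The sketch below is illustrative only: the `RegionState` and `RegionSignal` types and the `nextRegionState` function are hypothetical, and it assumes that a non-transient failure is the case that needs user action.

```typescript
// Hypothetical state model for a single UI region.
type RegionState =
  | { phase: "loading"; attempt: number } // attempt counts from 1
  | { phase: "ready" }
  | { phase: "fallback"; reason: "retries_exhausted" | "needs_user_action" };

type RegionSignal =
  | { type: "succeeded" }
  | { type: "failed"; transient: boolean }
  | { type: "user_retry" };

// Pure transition function: retry transient failures up to maxAttempts total
// tries, otherwise move to the fallback UI until the user asks to retry.
function nextRegionState(
  state: RegionState,
  signal: RegionSignal,
  maxAttempts: number,
): RegionState {
  switch (signal.type) {
    case "succeeded":
      return state.phase === "loading" ? { phase: "ready" } : state;
    case "failed":
      if (state.phase !== "loading") return state;
      // Assumption: non-transient failures cannot be fixed by retrying.
      if (!signal.transient) return { phase: "fallback", reason: "needs_user_action" };
      return state.attempt < maxAttempts
        ? { phase: "loading", attempt: state.attempt + 1 }
        : { phase: "fallback", reason: "retries_exhausted" };
    case "user_retry":
      return state.phase === "fallback" ? { phase: "loading", attempt: 1 } : state;
  }
}
```

Because the function is pure, it could back a `useReducer` hook in React 18 or be unit-tested without rendering anything.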