npx playbooks add skill giuseppe-trisciuoglio/developer-kit --skill spring-boot-resilience4j

---
name: spring-boot-resilience4j
description: This skill should be used when implementing fault tolerance and resilience patterns in Spring Boot applications using the Resilience4j library. Apply this skill to add circuit breaker, retry, rate limiter, bulkhead, time limiter, and fallback mechanisms to prevent cascading failures, handle transient errors, and manage external service dependencies gracefully in microservices architectures.
allowed-tools: Read, Write, Edit, Bash
category: backend
tags: [spring-boot, resilience4j, circuit-breaker, fault-tolerance, retry, bulkhead, rate-limiter]
version: 1.1.0
---

# Spring Boot Resilience4j Patterns

## Overview

Resilience4j is a lightweight fault tolerance library designed for Java 8+ and functional programming. It provides patterns for handling failures in distributed systems including circuit breakers, rate limiters, retry mechanisms, bulkheads, and time limiters. This skill demonstrates how to integrate Resilience4j with Spring Boot 3.x to build resilient microservices that can gracefully handle external service failures and prevent cascading failures across the system.

## When to Use

To implement resilience patterns in Spring Boot applications, use this skill when:
- Preventing cascading failures from external service unavailability with circuit breaker pattern
- Retrying transient failures with exponential backoff
- Rate limiting to protect services from overload or downstream service capacity constraints
- Isolating resources with bulkhead pattern to prevent thread pool exhaustion
- Adding timeout controls to async operations with time limiter
- Combining multiple patterns for comprehensive fault tolerance

Resilience4j is a lightweight, composable library for adding fault tolerance without requiring external infrastructure. It provides annotation-based patterns that integrate seamlessly with Spring Boot's AOP and Actuator.

## Instructions

### 1. Setup and Dependencies

Add Resilience4j dependencies to your project. For Maven, add to `pom.xml`:

```xml
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
    <version>2.2.0</version> <!-- Use the latest stable version -->
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
```

For Gradle, add to `build.gradle`:

```gradle
implementation "io.github.resilience4j:resilience4j-spring-boot3:2.2.0"
implementation "org.springframework.boot:spring-boot-starter-aop"
implementation "org.springframework.boot:spring-boot-starter-actuator"
```

No explicit `@EnableAspectJAutoProxy` is needed: with `spring-boot-starter-aop` on the classpath, Spring Boot auto-configures the AOP proxying that the Resilience4j annotations rely on.

### 2. Circuit Breaker Pattern

Apply `@CircuitBreaker` annotation to methods calling external services:

```java
@Service
public class PaymentService {
    private final RestTemplate restTemplate;

    public PaymentService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
    public PaymentResponse processPayment(PaymentRequest request) {
        return restTemplate.postForObject("http://payment-api/process",
            request, PaymentResponse.class);
    }

    private PaymentResponse paymentFallback(PaymentRequest request, Exception ex) {
        return PaymentResponse.builder()
            .status("PENDING")
            .message("Service temporarily unavailable")
            .build();
    }
}
```

Configure in `application.yml`:

```yaml
resilience4j:
  circuitbreaker:
    configs:
      default:
        registerHealthIndicator: true
        slidingWindowSize: 10
        minimumNumberOfCalls: 5
        failureRateThreshold: 50
        waitDurationInOpenState: 10s
    instances:
      paymentService:
        baseConfig: default
```

See @references/configuration-reference.md for complete circuit breaker configuration options.
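To make the interaction between `slidingWindowSize`, `minimumNumberOfCalls`, and `failureRateThreshold` concrete, here is a minimal plain-Java sketch of a count-based sliding window. It is a simplified illustration, not Resilience4j's actual implementation; the class and method names (`SlidingWindowSketch`, `recordAndCheck`) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified count-based sliding window; an illustration only, not the library's code. */
final class SlidingWindowSketch {
    private final int windowSize;          // corresponds to slidingWindowSize
    private final int minimumCalls;        // corresponds to minimumNumberOfCalls
    private final double failureThreshold; // corresponds to failureRateThreshold (percent)
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private int failures = 0;

    SlidingWindowSketch(int windowSize, int minimumCalls, double failureThreshold) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.failureThreshold = failureThreshold;
    }

    /** Records one call outcome and reports whether the breaker should open. */
    boolean recordAndCheck(boolean failed) {
        outcomes.addLast(failed);
        if (failed) {
            failures++;
        }
        // Once the window is full, evict the oldest outcome.
        if (outcomes.size() > windowSize && outcomes.removeFirst()) {
            failures--;
        }
        // The failure rate is not evaluated until enough calls were recorded.
        if (outcomes.size() < minimumCalls) {
            return false;
        }
        double failureRate = 100.0 * failures / outcomes.size();
        return failureRate >= failureThreshold;
    }
}
```

With the values configured above (10, 5, 50), four failed calls alone do not open the breaker because fewer than five calls have been recorded; the fifth call triggers the first evaluation, and a failure rate of 80% is above the threshold.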

### 3. Retry Pattern

Apply `@Retry` annotation for transient failure recovery:

```java
@Service
public class ProductService {
    private final RestTemplate restTemplate;

    public ProductService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Retry(name = "productService", fallbackMethod = "getProductFallback")
    public Product getProduct(Long productId) {
        return restTemplate.getForObject(
            "http://product-api/products/" + productId,
            Product.class);
    }

    private Product getProductFallback(Long productId, Exception ex) {
        return Product.builder()
            .id(productId)
            .name("Unavailable")
            .available(false)
            .build();
    }
}
```

Configure retry in `application.yml`:

```yaml
resilience4j:
  retry:
    configs:
      default:
        maxAttempts: 3
        waitDuration: 500ms
        enableExponentialBackoff: true
        exponentialBackoffMultiplier: 2
    instances:
      productService:
        baseConfig: default
        maxAttempts: 5
```

See @references/configuration-reference.md for retry exception configuration.
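The effect of `waitDuration` and `exponentialBackoffMultiplier` is easiest to see as a schedule of waits. The following plain-Java sketch computes that schedule under the assumption that the wait before retry *n* is `waitDuration × multiplier^(n-1)` and that `maxAttempts` counts the initial call; `BackoffSketch` and `waits` are hypothetical names used only for illustration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Illustrative backoff schedule; not part of the Resilience4j API. */
final class BackoffSketch {
    /** Returns the wait applied before each retry, in order. */
    static List<Duration> waits(int maxAttempts, Duration waitDuration, double multiplier) {
        List<Duration> result = new ArrayList<>();
        double millis = waitDuration.toMillis();
        // maxAttempts includes the first call, so there are maxAttempts - 1 waits.
        for (int retry = 1; retry < maxAttempts; retry++) {
            result.add(Duration.ofMillis(Math.round(millis)));
            millis *= multiplier;
        }
        return result;
    }
}
```

For the `productService` instance above (`maxAttempts: 5`, `waitDuration: 500ms`, multiplier 2), `BackoffSketch.waits(5, Duration.ofMillis(500), 2.0)` yields waits of 500 ms, 1 s, 2 s, and 4 s, so a call that fails on every attempt spends 7.5 s waiting before the fallback runs.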

### 4. Rate Limiter Pattern

Apply `@RateLimiter` to control request rates:

```java
@Service
public class NotificationService {
    private final EmailClient emailClient;

    public NotificationService(EmailClient emailClient) {
        this.emailClient = emailClient;
    }

    @RateLimiter(name = "notificationService",
        fallbackMethod = "rateLimitFallback")
    public void sendEmail(EmailRequest request) {
        emailClient.send(request);
    }

    private void rateLimitFallback(EmailRequest request, Exception ex) {
        throw new RateLimitExceededException(
            "Too many requests. Please try again later.");
    }
}
```

Configure in `application.yml`:

```yaml
resilience4j:
  ratelimiter:
    configs:
      default:
        registerHealthIndicator: true
        limitForPeriod: 10
        limitRefreshPeriod: 1s
        timeoutDuration: 500ms
    instances:
      notificationService:
        baseConfig: default
        limitForPeriod: 5
```

### 5. Bulkhead Pattern

Apply `@Bulkhead` to isolate resources. Use `type = SEMAPHORE` for synchronous methods:

```java
@Service
public class ReportService {
    private final ReportGenerator reportGenerator;

    public ReportService(ReportGenerator reportGenerator) {
        this.reportGenerator = reportGenerator;
    }

    @Bulkhead(name = "reportService", type = Bulkhead.Type.SEMAPHORE)
    public Report generateReport(ReportRequest request) {
        return reportGenerator.generate(request);
    }
}
```

Use `type = THREADPOOL` for async/CompletableFuture methods:

```java
@Service
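public class AnalyticsService {
    private final AnalyticsEngine analyticsEngine;

    public AnalyticsService(AnalyticsEngine analyticsEngine) {
        this.analyticsEngine = analyticsEngine;
    }
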
    @Bulkhead(name = "analyticsService", type = Bulkhead.Type.THREADPOOL)
    public CompletableFuture<AnalyticsResult> runAnalytics(
            AnalyticsRequest request) {
        return CompletableFuture.supplyAsync(() ->
            analyticsEngine.analyze(request));
    }
}
```

Configure in `application.yml`:

```yaml
resilience4j:
  bulkhead:
    configs:
      default:
        maxConcurrentCalls: 10
        maxWaitDuration: 100ms
    instances:
      reportService:
        baseConfig: default
        maxConcurrentCalls: 5

  thread-pool-bulkhead:
    instances:
      analyticsService:
        maxThreadPoolSize: 8
```

### 6. Time Limiter Pattern

Apply `@TimeLimiter` to async methods to enforce timeout boundaries:

```java
@Service
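public class SearchService {
    private final SearchEngine searchEngine;

    public SearchService(SearchEngine searchEngine) {
        this.searchEngine = searchEngine;
    }
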
    @TimeLimiter(name = "searchService", fallbackMethod = "searchFallback")
    public CompletableFuture<SearchResults> search(SearchQuery query) {
        return CompletableFuture.supplyAsync(() ->
            searchEngine.executeSearch(query));
    }

    private CompletableFuture<SearchResults> searchFallback(
            SearchQuery query, Exception ex) {
        return CompletableFuture.completedFuture(
            SearchResults.empty("Search timed out"));
    }
}
```

Configure in `application.yml`:

```yaml
resilience4j:
  timelimiter:
    configs:
      default:
        timeoutDuration: 2s
        cancelRunningFuture: true
    instances:
      searchService:
        baseConfig: default
        timeoutDuration: 3s
```

### 7. Combining Multiple Patterns

Stack multiple patterns on a single method for comprehensive fault tolerance:

```java
@Service
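public class OrderService {
    private final OrderClient orderClient;

    public OrderService(OrderClient orderClient) {
        this.orderClient = orderClient;
    }
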
    @CircuitBreaker(name = "orderService")
    @Retry(name = "orderService")
    @RateLimiter(name = "orderService")
    @Bulkhead(name = "orderService")
    public Order createOrder(OrderRequest request) {
        return orderClient.createOrder(request);
    }
}
```

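Execution order (Resilience4j's default aspect order, outermost first, regardless of the order in which the annotations are written): Retry → CircuitBreaker → RateLimiter → Bulkhead → Method

Use the same instance name across patterns so configuration and metrics are easy to correlate. Each annotation resolves that name in its own section of `application.yml` (`circuitbreaker`, `retry`, `ratelimiter`, `bulkhead`).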
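The execution order above can be pictured as nested wrappers around the method. The plain-Java sketch below builds such a chain from `Supplier` decorators and records the order in which each layer is entered; the layers do nothing except log their name, and `AspectOrderSketch`, `layer`, and `traceOneCall` are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustrates wrapper nesting only; the layers carry no real resilience logic. */
final class AspectOrderSketch {
    /** Wraps a call so that entering the named layer is recorded before delegating. */
    static <T> Supplier<T> layer(String name, List<String> trace, Supplier<T> inner) {
        return () -> {
            trace.add(name);
            return inner.get();
        };
    }

    /** Builds the chain with Retry outermost and returns the order of entry for one call. */
    static List<String> traceOneCall() {
        List<String> trace = new ArrayList<>();
        Supplier<String> method = () -> {
            trace.add("Method");
            return "order-created";
        };
        Supplier<String> chain =
            layer("Retry", trace,
                layer("CircuitBreaker", trace,
                    layer("RateLimiter", trace,
                        layer("Bulkhead", trace, method))));
        chain.get();
        return trace;
    }
}
```

Because Retry is the outermost layer, every retry attempt passes through the circuit breaker, rate limiter, and bulkhead again, so each attempt is counted by them as a separate call.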

### 8. Exception Handling and Monitoring

Create a global exception handler using `@RestControllerAdvice`:

```java
@RestControllerAdvice
public class ResilienceExceptionHandler {

    @ExceptionHandler(CallNotPermittedException.class)
    @ResponseStatus(HttpStatus.SERVICE_UNAVAILABLE)
    public ErrorResponse handleCircuitOpen(CallNotPermittedException ex) {
        return new ErrorResponse("SERVICE_UNAVAILABLE",
            "Service currently unavailable");
    }

    @ExceptionHandler(RequestNotPermitted.class)
    @ResponseStatus(HttpStatus.TOO_MANY_REQUESTS)
    public ErrorResponse handleRateLimited(RequestNotPermitted ex) {
        return new ErrorResponse("TOO_MANY_REQUESTS",
            "Rate limit exceeded");
    }

    @ExceptionHandler(BulkheadFullException.class)
    @ResponseStatus(HttpStatus.SERVICE_UNAVAILABLE)
    public ErrorResponse handleBulkheadFull(BulkheadFullException ex) {
        return new ErrorResponse("CAPACITY_EXCEEDED",
            "Service at capacity");
    }
}
```

Enable Actuator endpoints for monitoring resilience patterns in `application.yml`:

```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,circuitbreakers,retries,ratelimiters
  endpoint:
    health:
      show-details: always
  health:
    circuitbreakers:
      enabled: true
    ratelimiters:
      enabled: true
```

Access monitoring endpoints:
- `GET /actuator/health` - Overall health including resilience patterns
- `GET /actuator/circuitbreakers` - Circuit breaker states
- `GET /actuator/metrics` - Custom resilience metrics

## Best Practices

- **Always provide fallback methods**: Ensure graceful degradation with meaningful responses rather than exceptions
- **Use exponential backoff for retries**: Increase the wait between attempts (for example `exponentialBackoffMultiplier: 2`) so retries do not overwhelm a service that is still recovering
- **Choose appropriate failure thresholds**: Set `failureRateThreshold` between 50-70% depending on acceptable error rates
- **Use constructor injection exclusively**: Never use field injection for Resilience4j dependencies
- **Enable health indicators**: Set `registerHealthIndicator: true` for all patterns to integrate with Spring Boot health
- **Separate failure vs. client errors**: Retry only transient errors (network timeouts, 5xx); skip 4xx and business exceptions
- **Size bulkheads based on load**: Calculate thread pool and semaphore sizes from expected concurrent load and latency
- **Monitor and adjust**: Continuously review metrics and adjust timeouts/thresholds based on production behavior
- **Document fallback behavior**: Make fallback logic clear and predictable to users and maintainers
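The guidance above on separating transient failures from client errors can be expressed as a predicate over the thrown exception. The sketch below is one possible shape, under stated assumptions: `HttpStatusException` is a hypothetical exception type standing in for whatever your HTTP client throws, and `TransientFailurePredicate` is an illustrative name, not a Resilience4j class.

```java
import java.io.IOException;
import java.util.function.Predicate;

/** Decides whether a failure looks transient and is therefore worth retrying. */
final class TransientFailurePredicate implements Predicate<Throwable> {

    /** Hypothetical exception carrying an HTTP status code. */
    static final class HttpStatusException extends RuntimeException {
        final int status;

        HttpStatusException(int status) {
            super("HTTP " + status);
            this.status = status;
        }
    }

    @Override
    public boolean test(Throwable ex) {
        if (ex instanceof IOException) {
            return true; // network-level problems such as timeouts or resets
        }
        if (ex instanceof HttpStatusException) {
            int status = ((HttpStatusException) ex).status;
            return status >= 500 && status <= 599; // server errors only, never 4xx
        }
        return false; // business exceptions and anything unknown are not retried
    }
}
```

In a real project the same split is usually expressed through the retry instance's exception settings (see the configuration reference for the exact options) rather than hand-written checks at each call site.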

## Constraints and Warnings

- Fallback methods must be declared in the same class as the protected method, take the same parameters followed by one exception parameter (or the exception parameter alone), and return a compatible type.
- Circuit breaker state is maintained per-instance; ensure proper bean scoping in multi-tenant scenarios.
- Retry operations should be idempotent as they may execute multiple times.
- Do not use circuit breakers for operations that must always complete; use appropriate timeouts instead.
- Rate limiters can cause thread blocking; configure appropriate wait durations.
- Bulkhead isolation may lead to rejected requests under load; ensure proper fallback handling.
- Be cautious with `@Retry` on non-idempotent operations like POST requests.
- Monitor memory usage when using thread pool bulkheads with high concurrency settings.

## Examples

### Input: External Service Call Without Resilience

```java
@Service
public class PaymentService {
    public PaymentResponse processPayment(PaymentRequest request) {
        return restTemplate.postForObject("http://payment-api/process",
            request, PaymentResponse.class);
    }
}
```

### Output: Circuit Breaker Protected Service

```java
@Service
public class PaymentService {
    @CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
    public PaymentResponse processPayment(PaymentRequest request) {
        return restTemplate.postForObject("http://payment-api/process",
            request, PaymentResponse.class);
    }

    private PaymentResponse paymentFallback(PaymentRequest request, Exception ex) {
        return PaymentResponse.builder()
            .status("PENDING")
            .message("Service temporarily unavailable")
            .build();
    }
}
```

### Input: Service Without Retry

```java
public Order getOrder(Long orderId) {
    return orderRepository.findById(orderId)
        .orElseThrow(() -> new OrderNotFoundException(orderId));
}
```

### Output: Retry with Exponential Backoff

```java
@Retry(name = "orderService", fallbackMethod = "getOrderFallback")
public Order getOrder(Long orderId) {
    return orderRepository.findById(orderId)
        .orElseThrow(() -> new OrderNotFoundException(orderId));
}

private Order getOrderFallback(Long orderId, Exception ex) {
    return Order.cachedOrder(orderId);
}
```

### Input: Unbounded Rate

```java
@RestController
public class ApiController {
    @GetMapping("/api/data")
    public Data fetchData() {
        return dataService.processLargeDataset();
    }
}
```

### Output: Rate Limited Endpoint

```java
@RestController
public class ApiController {
    @RateLimiter(name = "dataService", fallbackMethod = "rateLimitFallback")
    @GetMapping("/api/data")
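    public ResponseEntity<?> fetchData() {
        return ResponseEntity.ok(dataService.processLargeDataset());
    }

    // The fallback's return type must be compatible with fetchData()'s,
    // so both methods return ResponseEntity.
    private ResponseEntity<?> rateLimitFallback(RequestNotPermitted ex) {
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
            .body(new ErrorResponse("TOO_MANY_REQUESTS", "Rate limit exceeded"));
    }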
}
```

### Input: Blocking Thread Pool Operation

```java
@Service
public class ReportService {
    public Report generateReport(ReportRequest request) {
        return reportGenerator.generate(request);
    }
}
```

### Output: Bulkhead Protected Service

```java
@Service
public class ReportService {
    @Bulkhead(name = "reportService", type = Bulkhead.Type.SEMAPHORE)
    public Report generateReport(ReportRequest request) {
        return reportGenerator.generate(request);
    }
}
```

## References

- [Complete property reference and configuration patterns](references/configuration-reference.md)
- [Unit and integration testing strategies](references/testing-patterns.md)
- [Real-world e-commerce service example using all patterns](references/examples.md)
- [Resilience4j Documentation](https://resilience4j.readme.io/)
- [Spring Boot Actuator Skill](../spring-boot-actuator/SKILL.md) - Monitoring resilience patterns with Actuator

## Overview

This skill shows how to implement fault tolerance and resilience patterns in Spring Boot applications using Resilience4j. It explains where to apply circuit breaker, retry, rate limiter, bulkhead, time limiter, and fallback mechanisms so microservices handle external failures gracefully. The content focuses on practical configuration, annotations, and monitoring tips for Spring Boot 3.x.

## How this skill works

The skill integrates Resilience4j annotations (`@CircuitBreaker`, `@Retry`, `@RateLimiter`, `@Bulkhead`, `@TimeLimiter`) with Spring Boot AOP to wrap service calls and enforce policies. Configuration is provided via `application.yml` to define named instances and shared configs. Fallback methods, Actuator endpoints, and a global exception handler provide graceful degradation and observability.

## When to use it

- Protect downstream HTTP or RPC calls from cascading failures with a circuit breaker
- Automatically retry transient errors with exponential backoff for network hiccups and 5xx responses
- Limit request rates to avoid overwhelming downstream services or your own endpoints
- Isolate resources using bulkheads to prevent thread pool exhaustion or shared resource saturation
- Enforce timeouts on async processing with time limiters to bound latency
- Combine patterns on critical flows (retry + circuit breaker + bulkhead + rate limiter) for comprehensive resilience

## Best practices

- Always supply meaningful fallback methods to return predictable degraded responses
- Retry only idempotent or safe operations; prefer exponential backoff to reduce strain on recovering services
- Tune failure thresholds and window sizes based on production traffic and acceptable error rates
- Use constructor injection for services and clients; avoid field injection for testability
- Enable health indicators and Actuator endpoints to monitor circuit breakers, retries, and rate limiters
- Document fallback behavior and ensure monitoring alerts reflect degraded service modes

## Example use cases

- Wrap payment or billing client calls with a circuit breaker and a fallback that queues requests for later processing
- Apply `@Retry` with exponential backoff to product catalog reads that occasionally return transient 503 errors
- Rate limit notification or email endpoints to protect third-party providers from burst traffic
- Use a thread-pool bulkhead for heavy analytics jobs to isolate CPU-bound tasks from request handling threads
- Add `@TimeLimiter` to search endpoints backed by slow third-party search engines and return an empty result fallback

## FAQ

### How should I choose thresholds for a circuit breaker?

Start with conservative defaults (`slidingWindowSize` 10, `failureRateThreshold` 50%) and adjust using production metrics. Tune according to error patterns and acceptable failure modes.

### Can I stack multiple Resilience4j annotations on one method?

Yes. A common stack is Retry → CircuitBreaker → RateLimiter → Bulkhead. Reference the same named configuration for consistency and ensure the order matches the expected behavior.