
This skill guides you through load balancing strategies to improve availability, performance, and resilience across architectures.

npx playbooks add skill amnadtaowsoam/cerebraskills --skill load-balancing


---
name: Load Balancing Strategies
description: Comprehensive guide to load balancing algorithms, health checks, and high-availability patterns.
---

# Load Balancing Strategies

## Overview

Load balancing distributes traffic across multiple targets to improve
availability, performance, and resilience. This guide covers algorithms,
health checks, and operational best practices.

## Table of Contents

1. [Fundamentals](#fundamentals)
2. [Layer 4 vs Layer 7](#layer-4-vs-layer-7)
3. [Algorithms](#algorithms)
4. [Health Checks](#health-checks)
5. [Session Persistence](#session-persistence)
6. [TLS Termination](#tls-termination)
7. [Connection Draining](#connection-draining)
8. [Global Load Balancing](#global-load-balancing)
9. [Cloud Load Balancers](#cloud-load-balancers)
10. [Software Load Balancers](#software-load-balancers)
11. [Service Mesh Load Balancing](#service-mesh-load-balancing)
12. [Autoscaling Integration](#autoscaling-integration)
13. [Monitoring](#monitoring)
14. [Troubleshooting](#troubleshooting)

---

## Fundamentals

Core goals:
- Spread traffic evenly
- Avoid single points of failure
- Improve latency by routing to healthy targets

## Layer 4 vs Layer 7

- **Layer 4 (TCP/UDP)**: Fast, protocol-agnostic, no HTTP awareness.
- **Layer 7 (HTTP/HTTPS)**: Route by path/host/headers, supports TLS termination.
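
As a rough illustration of what Layer 7 awareness makes possible, the sketch below picks a backend pool from the request host and path. The function name `route_l7`, the rule format, and the pool names are hypothetical and not taken from any particular load balancer; a Layer 4 balancer sees only addresses and ports, so it could not make this decision.

```python
def route_l7(host, path, rules, default_pool):
    """Return the pool of the first rule matching host and path prefix.

    rules is a list of (host, path_prefix, pool) tuples; a host of "*"
    matches any host. This rule format is an assumption for illustration.
    """
    for rule_host, path_prefix, pool in rules:
        if rule_host in (host, "*") and path.startswith(path_prefix):
            return pool
    return default_pool


# Hypothetical rules: first match wins, so order them most specific first.
rules = [
    ("api.example.com", "/", "api-pool"),
    ("*", "/static/", "static-pool"),
]
```

Evaluating rules in order keeps the behavior predictable, at the cost of requiring specific rules to be listed before general ones.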

## Algorithms

Common strategies:
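- **Round Robin**: Rotate through targets in order.
- **Weighted Round Robin**: Send proportionally more requests to higher-capacity targets.
- **Least Connections**: Route to the target with the fewest active connections.
- **Weighted Least Connections**: Compare active connections relative to each target's weight.
- **IP Hash**: Hash the client IP so the same client reaches the same target.
- **Random**: Pick a target at random; low overhead and no shared state.
- **Least Response Time**: Prefer the target with the lowest observed latency.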
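
To make a few of these strategies concrete, here is a minimal sketch of round robin, a naive weighted round robin, least connections, and IP hash. All class and function names are hypothetical, and the weighted variant simply repeats each target by its weight, which production balancers usually smooth out.

```python
import hashlib
import itertools


class RoundRobin:
    """Rotate through targets in a fixed order."""

    def __init__(self, targets):
        self._cycle = itertools.cycle(targets)

    def pick(self):
        return next(self._cycle)


def weighted_round_robin(weights):
    """Naive weighting: repeat each target `weight` times per rotation."""
    expanded = [t for t, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


class LeastConnections:
    """Pick the target with the fewest active connections."""

    def __init__(self, targets):
        self.active = {t: 0 for t in targets}

    def pick(self):
        target = min(self.active, key=self.active.get)
        self.active[target] += 1
        return target

    def release(self, target):
        # Call when the connection closes so the count stays accurate.
        self.active[target] -= 1


def ip_hash(targets, client_ip):
    """Map a client IP to a stable target index via a hash."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return targets[int.from_bytes(digest[:8], "big") % len(targets)]
```

Note that this simple modulo hash remaps most clients whenever the target list changes; consistent hashing is the usual way to limit that churn.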

## Health Checks

Types:
- **Active**: Probes at intervals.
- **Passive**: Detect failures from live traffic.

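Use both: active probes catch failures on idle targets and confirm recovery, while passive checks react to errors in live traffic between probes.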
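
One way to combine the two signals is to feed every observation, whether from a probe or from a live request, into the same per-target state machine. The sketch below assumes consecutive-count thresholds named `fall` and `rise`; the class name and defaults are illustrative only.

```python
class TargetHealth:
    """Track health from active probe results and passive request outcomes."""

    def __init__(self, fall=3, rise=2):
        self.fall = fall    # consecutive failures before marking unhealthy
        self.rise = rise    # consecutive successes before marking healthy
        self.healthy = True
        self._failures = 0
        self._successes = 0

    def record(self, ok):
        """Record one observation and return the current health state."""
        if ok:
            self._failures = 0
            self._successes += 1
            if not self.healthy and self._successes >= self.rise:
                self.healthy = True
        else:
            self._successes = 0
            self._failures += 1
            if self.healthy and self._failures >= self.fall:
                self.healthy = False
        return self.healthy
```

Requiring several consecutive results in each direction keeps a single slow probe or one failed request from flapping a target in and out of rotation.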

## Session Persistence

Sticky sessions route a client to the same target:
- Cookie-based affinity
- IP hash

Use only when state cannot be externalized.
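
Cookie-based affinity can be sketched as follows: the balancer reads a cookie naming the target, falls back to normal selection when the cookie is missing or stale, and returns the cookie to set on the response. The cookie name `lb_target` and the class are hypothetical, and a real deployment would typically sign or obfuscate the value rather than expose target names.

```python
import itertools


class CookieAffinity:
    COOKIE = "lb_target"  # hypothetical cookie name

    def __init__(self, targets):
        self.targets = set(targets)
        self._fallback = itertools.cycle(targets)  # round robin for new clients

    def route(self, cookies):
        """Return (target, cookies_to_set) for a request's cookie dict."""
        target = cookies.get(self.COOKIE)
        if target not in self.targets:
            # No cookie, or it names a target that no longer exists.
            target = next(self._fallback)
        return target, {self.COOKIE: target}
```

The stale-cookie branch matters in practice: without it, clients pinned to a removed target would fail instead of being reassigned.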

## TLS Termination

Terminate TLS at the load balancer for:
- Centralized cert management
- Offloading TLS handshake CPU work from backends
- Easier observability

Optionally re-encrypt to backend for end-to-end security.

## Connection Draining

Allow in-flight requests to finish during scale-down or deploy:
- Stop routing new connections to the draining target
- Set a drain timeout at least as long as the longest expected request
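
The draining behavior can be sketched as a small state holder per target: once draining starts, new connections are refused, and the target becomes removable when in-flight work reaches zero or the timeout expires. The class name, method names, and the 30-second default are assumptions for illustration.

```python
import time


class DrainingTarget:
    def __init__(self, drain_timeout=30.0):
        self.drain_timeout = drain_timeout  # seconds; illustrative default
        self.in_flight = 0
        self._drain_started = None

    @property
    def draining(self):
        return self._drain_started is not None

    def try_accept(self):
        """Accept a new connection unless the target is draining."""
        if self.draining:
            return False
        self.in_flight += 1
        return True

    def finish(self):
        """Mark one in-flight request as completed."""
        self.in_flight -= 1

    def start_drain(self, now=None):
        self._drain_started = time.monotonic() if now is None else now

    def can_remove(self, now=None):
        """True once in-flight work is done or the drain timeout has passed."""
        if not self.draining:
            return False
        now = time.monotonic() if now is None else now
        elapsed = now - self._drain_started
        return self.in_flight == 0 or elapsed >= self.drain_timeout
```

The optional `now` parameter exists only to make the timing testable; normal callers would omit it and rely on the monotonic clock.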

## Global Load Balancing

GSLB routes across regions:
- Geo-based routing
- Latency-based routing
- Failover routing
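
A minimal sketch of how latency-based and failover routing might combine: choose the lowest-latency healthy region, and fall back to an ordered failover list when no measured region is healthy. The function name and region names are hypothetical; real GSLB is usually implemented through DNS or anycast rather than a per-request function call.

```python
def pick_region(latency_ms, healthy, failover_order):
    """Pick a region for a client.

    latency_ms: measured latency from the client to each candidate region
    healthy: set of regions currently passing health checks
    failover_order: ordered fallback regions if no measured region is healthy
    """
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    if candidates:
        return min(candidates, key=candidates.get)
    for region in failover_order:
        if region in healthy:
            return region
    return None  # nothing healthy anywhere
```

Returning `None` leaves the total-outage policy to the caller, for example serving a static error page.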

## Cloud Load Balancers

- **AWS**: ALB (L7), NLB (L4)
- **GCP**: HTTP(S) Load Balancer (L7), TCP/UDP Load Balancer (L4)
- **Azure**: Application Gateway (L7), Azure Load Balancer (L4)

## Software Load Balancers

- **NGINX**: Popular L7 proxy with health checks.
- **HAProxy**: High performance L4/L7.
- **Envoy**: Modern proxy with rich telemetry.

## Service Mesh Load Balancing

Service meshes (Istio, Linkerd) provide client-side load balancing with
retry policies, circuit breaking, and telemetry.

## Autoscaling Integration

Combine with autoscaling:
- Scale on CPU, latency, or queue depth
- Pre-warm nodes to reduce cold starts
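
As a sketch of metric-driven scaling, the function below applies a proportional rule: scale the replica count by the ratio of the observed metric to its target, then clamp to configured bounds. This resembles the target-tracking approach used by common autoscalers, but the function name, parameters, and default bounds here are assumptions.

```python
import math


def desired_replicas(current, metric, target, min_replicas=1, max_replicas=10):
    """Proportional scaling: replicas grow with metric / target, within bounds."""
    raw = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, raw))


# Example: 4 replicas at 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # prints 6
```

The same function works for latency or queue depth as long as the metric rises with load; new replicas should still pass health checks before they receive traffic.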

## Monitoring

Track:
- Request rate and latency
- Backend error rates
- Health check failures
- Uneven traffic distribution
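
Uneven distribution can be reduced to a single number for alerting. The sketch below uses the ratio of the busiest target's request count to the mean; this particular metric and the function name are one possible choice, not a standard.

```python
def traffic_skew(requests_per_target):
    """Return max/mean request count; 1.0 means perfectly even traffic."""
    counts = list(requests_per_target.values())
    if not counts:
        return 0.0
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0
```

A value well above 1.0 that persists across intervals suggests a hot spot, for instance from sticky sessions or a skewed hash.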

## Troubleshooting

Common issues:
- Misconfigured health checks (healthy targets marked down, or failed targets left in rotation)
- Sticky sessions causing hot spots
- TLS mismatch or SNI routing errors
- Drain timeout too short for long-running requests

## Related Skills
- `09-microservices/api-gateway`
- `09-microservices/service-mesh`
- `15-devops-infrastructure/kubernetes-helm`

Overview

This skill is a practical guide to load balancing algorithms, health checks, and high-availability patterns. It summarizes Layer 4 vs Layer 7 tradeoffs, common balancing algorithms, session persistence, TLS termination, connection draining, global routing, and integration points with cloud and software load balancers. The focus is actionable choices to improve availability, performance, and resilience.

How this skill works

The guide compares balancing algorithms (round robin, least connections, IP hash, weighted variants, least response time) and explains when to apply each. It covers active and passive health checks, session affinity options, TLS termination patterns, connection draining, and global load balancing approaches. Tooling and platform examples (NGINX, HAProxy, Envoy, AWS/GCP/Azure offerings) are included, along with monitoring and autoscaling integration advice.

When to use it

  • Distribute traffic across multiple backend instances to avoid single points of failure.
  • Route HTTP requests by host/path or use TCP-level routing for performance-sensitive services.
  • Apply session persistence only when state cannot be externalized to a shared store.
  • Use global load balancing for multi-region failover, geo-routing, or latency-based routing.
  • Integrate health checks and draining during deploys and scale-downs to maintain user experience.

Best practices

  • Combine active and passive health checks to detect failures quickly while avoiding false positives.
  • Prefer Layer 7 routing when you need path/host-based decisions or TLS termination; use Layer 4 for lower latency and protocol-agnostic traffic.
  • Externalize session state (Redis, database) to avoid sticky-session hotspots unless unavoidable.
  • Set connection drain time to cover the longest expected request and stop new connections during drain.
  • Monitor request rate, latency, backend error rates, and distribution skew to detect imbalance and misconfiguration.

Example use cases

  • Web application behind an ALB with cookie-based affinity disabled and session state stored centrally in Redis.
  • API mesh using Envoy with least-connections and retry/circuit-breaker policies for service-to-service traffic.
  • Global service using latency-based GSLB with health checks per region and automatic failover.
  • Cloud-native autoscaling where new instances are pre-warmed and registered only after passing health probes.
  • Edge TLS termination at the load balancer with optional re-encryption to backends for end-to-end security.

FAQ

When should I terminate TLS at the load balancer?

Terminate TLS for centralized certificate management, improved observability, and offloading CPU work; re-encrypt to backends if end-to-end encryption is required.

How do I choose between round robin and least connections?

Use round robin when targets have similar capacity and requests have similar cost; use least connections when instances carry variable load or serve long-running connections.

Are sticky sessions recommended?

Avoid sticky sessions when possible; externalize session state instead. Use cookie affinity or IP hash only when state cannot be moved off the backend.