What Backpressure Actually Means
Backpressure is the mechanism by which a system signals to its upstream that it's overwhelmed. Without it, a traffic spike causes unbounded queue growth, memory exhaustion, and eventually a crash.
The goal isn't to handle infinite traffic — it's to degrade gracefully and recover quickly.
Pattern 1: Token Bucket Rate Limiting
The token bucket algorithm allows bursts while enforcing an average rate. Tokens are added at a fixed rate, up to the bucket's capacity, and that capacity sets the largest burst the limiter will admit. Each request consumes a token. If no tokens are available, the request is rejected.
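// Requires golang.org/x/time/rate. Create the limiter once and share it across requests.
limiter := rate.NewLimiter(rate.Limit(1000), 100) // 1000 req/s, burst of 100

// Inside the handler:
if !limiter.Allow() {
	http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
	return
}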
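This is the right choice for external-facing APIs where you want to protect against individual clients overwhelming the system. Note that a single shared limiter only caps aggregate traffic; to hold each client to its own budget, keep one limiter per client key (for example, per API key or per publisher).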
Pattern 2: Semaphore-Based Concurrency Limiting
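Rate limiting controls requests per second. Concurrency limiting controls simultaneous in-flight requests. For CPU-bound or DB-bound operations, concurrency limiting is often more effective, because the scarce resource (cores, connection-pool slots) is consumed per in-flight request rather than per arrival.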
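sem := make(chan struct{}, 100) // max 100 concurrent requests

// Inside the handler:
sem <- struct{}{}        // acquire a slot; blocks while all 100 are in use
defer func() { <-sem }() // release the slot when the handler returns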
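The snippet above blocks when every slot is taken, so excess requests wait rather than fail. If you would rather reject them, a non-blocking acquire is one option. The sketch below is a possible shape, not a prescribed one: `Semaphore`, `TryAcquire`, and `LimitConcurrency` are hypothetical names, and the choice of a 503 status is an assumption.

```go
package main

import "net/http"

// Semaphore is a hypothetical counting semaphore backed by a buffered channel.
type Semaphore chan struct{}

// NewSemaphore returns a semaphore with n slots.
func NewSemaphore(n int) Semaphore { return make(Semaphore, n) }

// TryAcquire takes a slot if one is free and reports whether it succeeded.
// It never blocks.
func (s Semaphore) TryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees a slot previously taken with TryAcquire.
func (s Semaphore) Release() { <-s }

// LimitConcurrency wraps next so that at most cap(sem) requests run at once.
// Requests that cannot get a slot are rejected immediately (503 is an
// assumption; pick the status that suits your API).
func LimitConcurrency(sem Semaphore, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !sem.TryAcquire() {
			http.Error(w, "too many concurrent requests", http.StatusServiceUnavailable)
			return
		}
		defer sem.Release()
		next.ServeHTTP(w, r)
	})
}
```

Blocking and rejecting are both valid; blocking smooths short bursts, while rejecting keeps latency bounded when the backlog would otherwise grow.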
Pattern 3: Load Shedding
When the system is overloaded, it's better to reject some requests immediately than to queue them all and serve them slowly. A slow response is often worse than a fast rejection — it ties up client connections and downstream resources.
We implemented load shedding based on queue depth: if the worker queue exceeds a threshold, new requests get a 503 immediately.
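A queue-depth check like this can be sketched with a bounded channel and a non-blocking send. This is not the production code described above; `WorkQueue`, `Job`, and `ErrOverloaded` are hypothetical names, and the sketch assumes the threshold equals the channel's capacity.

```go
package main

import "errors"

// ErrOverloaded is returned when the queue is at its depth threshold.
var ErrOverloaded = errors.New("worker queue full, request shed")

// Job is a hypothetical unit of work.
type Job struct{ ID int }

// WorkQueue is a bounded queue whose capacity acts as the shedding threshold.
type WorkQueue struct{ jobs chan Job }

// NewWorkQueue returns a queue that sheds once threshold jobs are waiting.
func NewWorkQueue(threshold int) *WorkQueue {
	return &WorkQueue{jobs: make(chan Job, threshold)}
}

// Submit enqueues j without blocking. If the queue is already at the
// threshold it returns ErrOverloaded so the caller can answer 503 right away.
func (q *WorkQueue) Submit(j Job) error {
	select {
	case q.jobs <- j:
		return nil
	default:
		return ErrOverloaded
	}
}

// Jobs exposes the receive side for worker goroutines.
func (q *WorkQueue) Jobs() <-chan Job { return q.jobs }
```

An HTTP handler would call `Submit` and translate `ErrOverloaded` into `http.StatusServiceUnavailable`, ideally with a `Retry-After` header so well-behaved clients back off.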
Pattern 4: Circuit Breaker
For downstream dependencies, a circuit breaker prevents cascading failures. If a downstream service starts failing, the circuit opens and requests fail fast instead of waiting for timeouts. After a cooldown, the breaker lets a trial request through and closes again if it succeeds.
The Combination That Works
In our ad serving system, we use all four in layers:
1. Token bucket at the edge (per-publisher rate limit)
2. Concurrency semaphore at the auction evaluator (protects Redis)
3. Load shedding at the impression logger (protects Kafka)
4. Circuit breaker on the DSP bid request (protects against slow DSPs)
Each layer handles a different failure mode. Together, they make the system resilient to traffic spikes, slow dependencies, and cascading failures.