The False Dichotomy
Architecture discussions often frame event-driven vs request-driven as a binary choice. In practice, most systems use both, and the real question is where to draw the boundary.
Request-Driven: When Synchrony Matters
Request-driven (RPC/REST) is the right choice when:
- The caller needs the result to proceed
- Failure should be immediately visible to the caller
- The operation is idempotent and fast
In our ad serving system, the auction evaluation is request-driven. The publisher is waiting for a bid response. We can't say "we'll get back to you."
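To make the synchronous contract concrete, here is a minimal sketch of a bid handler that either answers within a deadline or fails visibly. The names `evaluate_auction` and `handle_bid_request`, the bid dictionary shape, and the 100 ms budget are all assumptions for illustration, not details of the real serving system.

```python
import concurrent.futures
import time

# Hypothetical names and values throughout: nothing here is taken from the
# real serving system. The 100 ms default budget is an assumed figure.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def evaluate_auction(bids):
    """Pick the highest bid; a stand-in for the real auction logic."""
    if not bids:
        return None
    return max(bids, key=lambda b: b["price"])


def handle_bid_request(bids, evaluate=evaluate_auction, timeout_s=0.1):
    """Synchronous handler: the caller blocks until it gets a result or an error."""
    future = _POOL.submit(evaluate, bids)
    try:
        winner = future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The failure surfaces to the caller right away instead of being deferred.
        return {"status": "error", "reason": "timeout"}
    if winner is None:
        return {"status": "no_bid"}
    return {"status": "ok", "winner": winner["bidder"], "price": winner["price"]}
```

The point of the sketch is the shape of the contract: every path returns something to the waiting caller, including the failure path.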
Event-Driven: When Decoupling Matters
Event-driven is the right choice when:
- The producer doesn't need to know who consumes the event
- The operation can tolerate latency
- You want to decouple scaling of producer and consumer
Impression logging is event-driven. The serving path publishes an event to Kafka and returns immediately. The consumer handles persistence, billing aggregation, and fraud detection independently.
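The publish-and-return pattern can be sketched as follows. This is an illustration only: an in-memory `queue.Queue` stands in for the Kafka topic, and `serve_impression`, `consume`, `billing_counts`, and the event fields are hypothetical names rather than the real pipeline.

```python
import queue
import threading

# Assumption: an in-memory queue stands in for a Kafka topic. A real system
# would publish through a Kafka client instead.
impression_topic = queue.Queue()

# Hypothetical downstream state, owned by the consumer only.
billing_counts = {}


def serve_impression(ad_id, user_id):
    """Serving path: publish the event and return without waiting for consumers."""
    impression_topic.put({"event_id": f"{ad_id}:{user_id}", "ad_id": ad_id})
    return "served"


def consume(max_events):
    """Consumer: drains events at its own pace, independent of the serving path."""
    for _ in range(max_events):
        event = impression_topic.get()
        ad_id = event["ad_id"]
        billing_counts[ad_id] = billing_counts.get(ad_id, 0) + 1
        impression_topic.task_done()
```

A possible way to drive it: call `serve_impression` a few times, start `consume` in a `threading.Thread`, then call `impression_topic.join()` to wait until every published event has been processed. The serving calls return before any consumer exists, which is the decoupling the pattern is meant to provide.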
The Failure Mode Analysis
Request-driven failure modes:
- Downstream downtime blocks the caller
- Cascading failures through synchronous call chains
- Latency spikes propagate upstream
Event-driven failure modes:
- Consumer lag: events pile up if consumers are slow
- At-least-once delivery requires idempotent consumers
- Debugging is harder: no single request trace
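Two of the event-driven failure modes above lend themselves to a short sketch: deduplicating redelivered events, and measuring lag as the gap between the newest offset in the log and the consumer's committed offset. The class name, event fields, and in-memory `seen_ids` set are assumptions for illustration; a production consumer would need a durable deduplication store.

```python
class IdempotentConsumer:
    """Applies each event at most once, even if the broker redelivers it."""

    def __init__(self):
        # Assumption: an in-memory set. Real deployments need durable storage
        # so that deduplication survives a consumer restart.
        self.seen_ids = set()
        self.totals = {}

    def handle(self, event):
        """Apply the event; return False if it was a duplicate and was skipped."""
        if event["event_id"] in self.seen_ids:
            return False
        campaign = event["campaign"]
        self.totals[campaign] = self.totals.get(campaign, 0.0) + event["cost"]
        self.seen_ids.add(event["event_id"])
        return True


def consumer_lag(log_end_offset, committed_offset):
    """Number of events published but not yet processed by the consumer."""
    return max(0, log_end_offset - committed_offset)
```

With this shape, a redelivered impression leaves the billing total unchanged, and a steadily growing `consumer_lag` value is the signal that consumers are falling behind.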
The Decision Framework
I use this heuristic: if the answer to "what happens if the downstream is down?" is "the user sees an error," use request-driven. If the answer is "we process it later," use event-driven.
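Written as code, the heuristic is a single branch. The enum and function names here are hypothetical, and the sketch deliberately covers only the two answers named above.

```python
from enum import Enum


class DownstreamDown(Enum):
    """Possible answers to: what happens if the downstream is down?"""
    USER_SEES_ERROR = "the user sees an error"
    PROCESS_LATER = "we process it later"


def choose_style(answer):
    """Map the answer to an interaction style, following the heuristic."""
    if answer is DownstreamDown.USER_SEES_ERROR:
        return "request-driven"
    if answer is DownstreamDown.PROCESS_LATER:
        return "event-driven"
    raise ValueError(f"unrecognized answer: {answer!r}")
```

For the two examples in this post, the auction maps to `USER_SEES_ERROR` and impression logging maps to `PROCESS_LATER`.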
The key insight is that event-driven doesn't eliminate failure — it changes when and how you handle it.