JoulesLabs · 2024
Redesigning an Ad Server for 50k RPS
p99 Latency: 120ms → 18ms
Throughput: 8k RPS → 50k RPS
Over-delivery: 4.2% → 0.3%
Infra Cost: baseline → -35%
Problem
A legacy PHP ad server was failing under load: throughput topped out at 8k RPS, p99 latency sat at 120ms, and unreliable budget pacing caused over-delivery above 4%. The business needed 6x throughput headroom within 6 months.
Architecture
Replaced the monolithic PHP serving path with a Go pipeline. Redis holds real-time budget state, updated through atomic Lua scripts. Kafka decouples impression logging from the hot path. The service scales horizontally through the Kubernetes Horizontal Pod Autoscaler (HPA).
[ Architecture Diagram ]
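One way to picture keeping impression logging off the hot path is a bounded in-process buffer drained by a background worker. The sketch below is an illustration under stated assumptions, not the project's code: `Impression`, `Sink`, `AsyncLogger`, and `memSink` are hypothetical names, `Sink` stands in for a Kafka producer, and the drop-when-full policy is one possible choice that the case study does not specify.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Impression is a hypothetical event shape; the real schema is not given.
type Impression struct {
	AdID      string
	CostMicro int64
}

// Sink stands in for a Kafka producer. Implementations must not retain
// the batch slice, because the logger reuses its backing array.
type Sink interface {
	WriteBatch(batch []Impression)
}

// AsyncLogger buffers impressions in a channel so that the serving path
// never waits on the sink.
type AsyncLogger struct {
	events  chan Impression
	done    chan struct{}
	dropped atomic.Int64
}

// NewAsyncLogger starts one background worker that groups events into
// batches of batchSize and flushes any remainder on shutdown.
func NewAsyncLogger(sink Sink, buffer, batchSize int) *AsyncLogger {
	l := &AsyncLogger{
		events: make(chan Impression, buffer),
		done:   make(chan struct{}),
	}
	go func() {
		defer close(l.done)
		batch := make([]Impression, 0, batchSize)
		for ev := range l.events {
			batch = append(batch, ev)
			if len(batch) == batchSize {
				sink.WriteBatch(batch)
				batch = batch[:0]
			}
		}
		if len(batch) > 0 {
			sink.WriteBatch(batch)
		}
	}()
	return l
}

// Log never blocks. If the buffer is full, the event is counted as
// dropped and false is returned (an assumed policy for this sketch).
func (l *AsyncLogger) Log(ev Impression) bool {
	select {
	case l.events <- ev:
		return true
	default:
		l.dropped.Add(1)
		return false
	}
}

// Close stops intake, waits for the worker to flush, and returns the
// number of dropped events. Log must not be called after Close.
func (l *AsyncLogger) Close() int64 {
	close(l.events)
	<-l.done
	return l.dropped.Load()
}

// memSink is an in-memory stand-in that only counts what it receives.
type memSink struct{ total int }

func (s *memSink) WriteBatch(batch []Impression) { s.total += len(batch) }

func main() {
	sink := &memSink{}
	logger := NewAsyncLogger(sink, 16, 2)
	for i := 0; i < 5; i++ {
		logger.Log(Impression{AdID: fmt.Sprintf("ad-%d", i), CostMicro: 1500})
	}
	dropped := logger.Close()
	fmt.Printf("written=%d dropped=%d\n", sink.total, dropped)
}
```

The non-blocking `select` is the point of the sketch: a slow or unavailable sink costs dropped log events rather than added request latency. A real deployment would have to decide whether dropping, blocking briefly, or spilling to disk is acceptable for billing data.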
Technical Challenges
- 01 Race conditions in budget decrement under concurrent auction evaluation: solved with atomic Redis Lua scripts
- 02 Hot-key contention on high-spend advertisers: solved by sharding budget keys across Redis slots
- 03 Zero-downtime migration with live traffic: solved with feature-flag-based traffic splitting at the load balancer
- 04 Eventual consistency in impression counting: accepted bounded inconsistency with a reconciliation job
Leadership & Influence
- Authored the RFC and got buy-in from product, finance, and infrastructure teams
- Led a team of 3 engineers through a 6-month rewrite with zero production incidents
- Defined Go coding standards and review guidelines adopted across the engineering org
- Introduced the observability stack (Prometheus + Grafana) as part of this project