
Rate Limiter

VERY COMMON

Rate limiters come up in nearly every system design interview at FAANG companies. They are how Cloudflare blocks 72 billion threats per day and how Stripe enforces API quotas across millions of merchants. You will compare four rate limiting algorithms, design a distributed counter in Redis, and handle the fail-open vs fail-closed trade-off that trips up most candidates.

  • Compare four rate limiting algorithms and pick the right one
  • Design atomic check-and-increment with Redis Lua scripts
  • Handle the fail-open vs fail-closed decision under pressure
Asked at: Google, Meta, Amazon, Stripe, Cloudflare, Uber

8 concepts (deep dives) · 10 cheat items (quick ref)
Elevator Pitch: 3-minute interview summary

I would design a rate limiter for 500M DAU using the sliding window counter algorithm in Redis. Each user record takes roughly 400 bytes (60 sub-window counters at 6 bytes each, plus per-key overhead), totaling 200 GB across a Redis cluster. Every request triggers a Lua script that atomically reads, checks, and increments in under 5ms. Rules live in MySQL, synced to a local cache on each gateway. The limiter sits at the API gateway so rejected requests never reach app servers. For burst tolerance, token bucket runs as a secondary algorithm. On Redis failure, the system fails open with a local in-memory fallback.
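A quick sanity check on the pitch's capacity math. The per-user overhead constant below is an assumption added to reconcile the 60 × 6 = 360 bytes of counters with the 400-byte figure in the pitch; the other numbers come straight from the pitch.

```python
# Back-of-envelope check for the elevator pitch numbers.
SUB_WINDOWS = 60           # sliding window split into 60 sub-windows
BYTES_PER_COUNTER = 6      # per-sub-window counter size from the pitch
KEY_OVERHEAD = 40          # assumed Redis key/metadata overhead per user
DAU = 500_000_000

per_user_bytes = SUB_WINDOWS * BYTES_PER_COUNTER + KEY_OVERHEAD  # 400 bytes
total_gb = DAU * per_user_bytes / 10**9                          # 200.0 GB
print(per_user_bytes, total_gb)
```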

Concepts Unlocked: 8 concepts in this topic

Token Bucket Algorithm

EASY

We chose token bucket (not leaky bucket) because it allows controlled bursts up to B tokens while maintaining an average rate of R tokens/sec. No tokens left? 429.

Core Feature Design
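The burst-up-to-B, refill-at-R behavior can be sketched in a few lines. This is an illustrative in-process version only; a production limiter would keep this state in Redis, as the rest of this topic describes.

```python
import time

class TokenBucket:
    """Token bucket sketch: bursts up to capacity B, average rate R tokens/sec."""

    def __init__(self, rate_r, capacity_b):
        self.rate = rate_r
        self.capacity = capacity_b
        self.tokens = capacity_b          # start full: permits an initial burst
        self.last = time.monotonic()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill lazily from elapsed time instead of running a background timer.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                      # no tokens left: caller returns 429
```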

Fixed Window Counter

EASY

We considered fixed window first because it costs only 20 bytes per key in Redis. We chose sliding window instead because at window boundaries, 2x the limit can pass.

Core Feature Design
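The boundary-burst flaw is easy to demonstrate. The sketch below (a toy dict standing in for Redis) shows 200 requests passing a 100/min limit when they straddle a window edge.

```python
from collections import defaultdict

def fixed_window_allow(counters, key, limit, now, window=60):
    """Fixed window counter: one small counter per key per window."""
    bucket = int(now // window)            # which fixed window this falls in
    k = (key, bucket)
    if counters[k] >= limit:
        return False
    counters[k] += 1
    return True

# Boundary burst: 100 requests at t=59s (window 0) and 100 more at t=60s
# (window 1) all pass, so 2x the 100/min limit lands within two seconds.
counters = defaultdict(int)
passed = sum(fixed_window_allow(counters, "u1", 100, t)
             for t in [59.0] * 100 + [60.0] * 100)
```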

Sliding Window Counter

STANDARD

We chose sliding window counter (not sliding log) because it weights two adjacent fixed windows by overlap, giving 86% less memory while eliminating the boundary burst problem.

Core Feature Design
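The overlap weighting works like this: scale the previous window's count by how much of it still falls inside the sliding window, then add the current count. A minimal sketch:

```python
def sliding_window_allow(prev_count, curr_count, elapsed, limit, window=60):
    """Sliding window counter: weight the previous fixed window by its
    remaining overlap with the sliding window, then add the current count."""
    overlap = (window - elapsed) / window          # fraction still in view
    estimated = prev_count * overlap + curr_count  # two counters, not a log
    return estimated < limit

# 10s into the current minute, 5/6 of the previous window still counts:
# prev=90, curr=20 -> 90 * 50/60 + 20 = 95 < 100, so the request passes.
```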

Redis Lua Scripting

STANDARD

We chose Lua scripts (not distributed locks, not MULTI/EXEC) because they read, check, increment, and set TTL in one atomic operation. No race window, 0.1ms execution time.

High Level Design
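The atomicity argument can be made concrete. Below is a Python emulation of the script's read-check-increment-TTL logic against a plain dict; the Lua sketch in the comment is my own illustration of the shape such a script takes, not a verbatim production script.

```python
# -- equivalent Lua sketch (illustrative), run atomically via EVALSHA:
# -- local c = tonumber(redis.call('GET', KEYS[1]) or '0')
# -- if c >= tonumber(ARGV[1]) then return 0 end
# -- redis.call('INCR', KEYS[1])
# -- if c == 0 then redis.call('EXPIRE', KEYS[1], ARGV[2]) end
# -- return 1

def check_and_increment(store, key, limit, ttl, now):
    """Emulates the Lua script: read, check, increment, and set TTL as one
    step, so no other request can sneak in between the read and the write."""
    count, expires = store.get(key, (0, 0))
    if now >= expires:                    # window expired: start a fresh one
        count, expires = 0, now + ttl
    if count >= limit:
        return False                      # over limit: reject with 429
    store[key] = (count + 1, expires)
    return True
```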

Distributed Rate Counting

STANDARD

We use a shared Redis counter (not per-server local memory) because with 10 servers and round-robin load balancing, local counters give the user 10x the allowed rate.

High Level Design
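A tiny simulation shows the 10x failure mode: with round-robin routing, each server sees only a tenth of one user's traffic, so every local counter stays under the limit while the user blows past it.

```python
SERVERS, LIMIT = 10, 100

local_counts = [0] * SERVERS   # per-server in-memory counters (broken design)
shared_count = 0               # single shared counter (the Redis approach)
allowed_local = allowed_shared = 0

for i in range(1000):          # 1000 requests from one user, round-robin
    s = i % SERVERS
    if local_counts[s] < LIMIT:         # each server's local view
        local_counts[s] += 1
        allowed_local += 1
    if shared_count < LIMIT:            # one global view of the same user
        shared_count += 1
        allowed_shared += 1
```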

Fail-Open vs Fail-Closed

STANDARD

We chose fail-open (not fail-closed) when Redis is down because availability beats perfect throttling. Fail-closed turns a cache blip into a full API outage.

Fault Tolerance
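The fail-open path can be sketched as a simple fallback around the Redis call. The wiring below (function names, local-limiter fallback) is an assumed illustration of the decision, not a prescribed implementation.

```python
def allow_request(redis_check, local_limiter, key):
    """Fail open: if the shared Redis check errors out, fall back to a
    coarse in-memory limiter rather than rejecting all traffic."""
    try:
        return redis_check(key)        # normal path: shared Redis counter
    except ConnectionError:
        # Availability beats perfect throttling; the local limiter still
        # caps per-gateway abuse for the duration of the outage.
        return local_limiter(key)

def broken_redis(key):
    raise ConnectionError("redis down")   # simulate the outage
```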

Rate Limit HTTP Headers

EASY

We include X-RateLimit-Remaining on all responses (not only 429) so clients self-throttle proactively. Retry-After on 429 prevents thundering herd.

System APIs
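A small helper shows the header contract: the X-RateLimit-* trio on every response, plus Retry-After only when rejecting. The function name and reset-as-epoch choice are illustrative assumptions.

```python
def rate_limit_headers(limit, remaining, reset_epoch, retry_after=None):
    """Build rate-limit headers for every response, not just 429s, so
    clients can self-throttle before they ever hit the limit."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_epoch),   # when the window resets
    }
    if retry_after is not None:                  # only on 429 responses
        # Real systems often add jitter here to avoid a synchronized
        # retry stampede when many clients back off at once.
        headers["Retry-After"] = str(retry_after)
    return headers
```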

Hot Key Splitting

TRICKY

We split hot keys into 10 sub-keys (not one key per user+endpoint) because one enterprise key doing 50K RPS saturates a single Redis shard. Splitting drops shard CPU from 100% to 10%.

Fault Tolerance
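The splitting trick above can be sketched as: append a random sub-key suffix so requests spread across shards, and divide the limit across the sub-keys. The key format and division strategy here are illustrative assumptions.

```python
import random

SHARDS = 10   # number of sub-keys one hot key is split into

def split_key(base_key, limit):
    """Spread one hot key across SHARDS sub-keys so no single Redis shard
    absorbs all the traffic; each sub-key enforces 1/SHARDS of the limit."""
    sub = random.randrange(SHARDS)          # pick a sub-key uniformly
    return f"{base_key}:{sub}", limit // SHARDS

# A 50K RPS enterprise key becomes ten 5K sub-keys on different shards.
key, sub_limit = split_key("enterprise-42:/v1/charges", 50_000)
```

Note the trade-off: with per-sub-key limits, a user can be rejected slightly early or late depending on how their requests happen to spread, which is acceptable for abuse protection.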