
Rate Limit Headers


Our rate limiter works on the server, but clients have no visibility into how close they are to the limit. Without feedback, they slam into HTTP 429 Too Many Requests errors and retry blindly, making the overload worse.

We solve this by returning a set of widely used response headers. Three of them go on every response: X-RateLimit-Limit tells the client the maximum requests allowed per window (e.g., 5000), X-RateLimit-Remaining shows how many requests are left in the current window, and X-RateLimit-Reset is a Unix timestamp (in seconds) indicating when the window resets. These X-RateLimit names are a de facto convention rather than a formal standard. When a client is throttled, the 429 response also carries the standard Retry-After header, which tells it how long to wait, typically as a number of seconds.
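As a minimal sketch of the server side, assuming a fixed-window counter that already tracks how many requests a client has used, the headers could be built like this. The function name `rate_limit_headers` and its parameters are hypothetical, chosen only for illustration.

```python
def rate_limit_headers(limit: int, used: int, window_reset: int, now: int) -> tuple[int, dict[str, str]]:
    """Return (status_code, headers) for the current request.

    Assumptions (hypothetical, for illustration):
    - `used` counts requests already served in this window, excluding this one.
    - `window_reset` and `now` are Unix timestamps in seconds.
    """
    allowed = used < limit
    # If this request is allowed it consumes one unit of the budget.
    remaining = max(limit - used - 1, 0) if allowed else 0
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(window_reset),
    }
    if not allowed:
        # Only throttled responses carry Retry-After (delta-seconds form).
        headers["Retry-After"] = str(max(window_reset - now, 1))
    return (200 if allowed else 429), headers
```

The three X-RateLimit headers are attached whether or not the request succeeds, so the client sees its budget shrink long before it reaches zero.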

“The constraint: server-side enforcement alone creates a one-sided protocol where clients must guess.”

GitHub returns the X-RateLimit headers on every API response and adds Retry-After when a client is throttled, so well-behaved clients can pace themselves and rarely hit 429 at all. Stripe goes further by returning a machine-readable error body with a rate limit object containing the same data plus the specific limit name.
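One way a well-behaved client might pace itself is to spread its remaining budget evenly over the time left in the window. The sketch below assumes the headers arrive as a plain dict with the exact names used above (real HTTP libraries usually treat header names case-insensitively), and `pacing_delay` is a hypothetical helper, not part of any client library.

```python
def pacing_delay(headers: dict[str, str], now: float) -> float:
    """Seconds to wait before the next request so the budget lasts until reset.

    Assumption: `now` is a Unix timestamp in seconds, on the same clock
    the server used for X-RateLimit-Reset (clock skew is ignored here).
    """
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    seconds_left = max(reset_at - now, 0.0)
    if remaining <= 0:
        # Budget exhausted: wait for the window to reset.
        return seconds_left
    # Spread the remaining requests evenly across the rest of the window.
    return seconds_left / remaining
```

For example, with 100 requests remaining and 600 seconds until reset, the client would wait about 6 seconds between requests instead of spending the budget in a burst and then stalling.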

Why do these headers exist in our design? Because without them, client developers resort to exponential backoff guessing games, and poorly written clients retry immediately in tight loops, creating a retry storm that amplifies the original overload by 3x to 5x.
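The difference between guessing and being told can be sketched as a single client-side function: honor Retry-After when the server sends it, and fall back to capped exponential backoff with jitter only when it does not. The name `retry_delay`, the defaults, and the injectable `rng` parameter are assumptions for illustration; this sketch handles only the delta-seconds form of Retry-After, not the HTTP-date form.

```python
import random
from typing import Callable

def retry_delay(
    headers: dict[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retrying a throttled request.

    `attempt` is the zero-based retry count. `rng` returns a float in
    [0, 1) and is injectable so the jitter can be made deterministic.
    """
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        # The server told us how long to wait: no guessing needed.
        return float(retry_after)
    # No usable header: capped exponential backoff with full jitter, so
    # many clients throttled together do not all retry at the same moment.
    return rng() * min(cap, base * 2 ** attempt)
```

A tight-loop client is effectively the case where this delay is always zero, which is what turns a single overload into a retry storm.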

Implication: rate limiting without headers is an enforcement mechanism; rate limiting with headers becomes a cooperative protocol that reduces total traffic.

Why it matters in interviews

Rate limiting is not a server-side-only concern. The headers turn it into a cooperative protocol. Interviewers want to hear us mention 429, Retry-After, and the retry storm failure mode that happens without proper client feedback.
