STANDARDwalkthrough

Read Replica

8 of 8
2 related
Our primary MySQL handles both reads and writes. At 289K read Requests Per Second (RPS) plus 29K write RPS, it collapses because a single MySQL node tops out around 100K simple key lookups per second.
96K
requests/sec peak throughput
3
replicas for redundancy
We chose read replicas (not sharding the read path) because replicas are operationally simpler. Sharding distributes data across nodes, requiring a routing layer and cross-shard query logic.
The constraint is clear: we need to offload reads without complicating the write path.
Replicas keep full copies of the data, so any replica can serve any read. With 3 read replicas, each handles roughly 96K RPS, keeping every node within its throughput ceiling.
The primary now handles only 29K write RPS, well within its capacity. Implication: 3 replicas give us headroom up to 300K total read RPS.
Beyond that, we add more replicas linearly, each absorbing another 96K RPS. The trade-off is replication lag.
Asynchronous replication (the default in MySQL and PostgreSQL) introduces 10 to 100ms of delay between a write hitting the primary and appearing on the replica. Here is the failure scenario: a user creates a short URL and immediately clicks it.
The read request hits a replica that has not received the new row yet. The user sees a 404 for 100 to 200ms until the replica catches up.
We mitigate this two ways. First, route read-after-write queries to the primary for a short window (5 seconds after creation).
Second, warm the Redis cache on creation so the redirect serves from cache before the replica ever needs to answer. Twitter uses read replicas extensively for their URL shortener (t.co), routing billions of daily redirect lookups across replica pools while the primary handles only new link creation.
What if the interviewer asks: why not use synchronous replication to eliminate lag? Because synchronous replication blocks every write until all replicas confirm, adding 10-50ms to every write and reducing write throughput by 3-5x.
Our system is write-tolerant of brief read staleness but not tolerant of slow writes.

Formula & tradeoffs

Formula
289K RPS3 replicas96K RPS per replica\frac{289\text{K RPS}}{3 \text{ replicas}} \approx 96\text{K RPS per replica}
Why it matters in interviews
Read replicas are the most common database scaling pattern, but the replication lag trade-off trips up many candidates. Explaining the 404-on-immediate-click scenario and how to mask it with cache warming shows production awareness that interviewers value.
Related concepts