CDN (Content Delivery Network)


A viewer in Tokyo requests a video chunk hosted in our Virginia origin data center. Without a CDN, the round trip crosses the Pacific Ocean, adding 150 to 200 ms of latency per chunk. At 2-second chunks, that delay compounds into visible buffering.

The constraint: we cannot move our origin closer to every viewer, and we cannot replicate the full video catalog everywhere. We chose a Content Delivery Network that caches chunks at edge Points of Presence (POPs) worldwide. Serving from the nearest POP cuts latency from as much as 200 ms to under 20 ms.
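To put the latency figures in proportion, the sketch below computes how much of a chunk's playback window a single round trip consumes. It assumes one sequential request per chunk and ignores transfer time. The function name `rtt_share_of_chunk` is hypothetical and used only for illustration.

```python
def rtt_share_of_chunk(rtt_ms, chunk_seconds=2.0):
    """Fraction of one chunk's playback time consumed by a single round trip.

    Assumption: one sequential request per chunk, transfer time ignored.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return (rtt_ms / 1000.0) / chunk_seconds


# Compare the worst-case origin round trip with the edge round trip.
for label, rtt_ms in [("origin (Virginia)", 200), ("edge POP", 20)]:
    print(f"{label}: {rtt_share_of_chunk(rtt_ms):.0%} of each chunk window")
```

Under these assumptions, a 200 ms round trip takes 10% of every 2-second window, while a 20 ms round trip takes 1%. That gap is the headroom the player has for absorbing throughput dips before it stalls.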

When a viewer requests a chunk, DNS routes them to the closest POP. On a cache hit, the chunk is served directly from the edge.

On a miss, the POP performs an origin pull, fetches the chunk, serves it, and caches it for subsequent viewers. YouTube operates over 1,000 POPs globally and achieves a cache hit ratio above 95% for popular content.
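The hit and miss paths above can be sketched as a small pull-through cache. This is a minimal illustration, not how any particular CDN is implemented: the class name `EdgePop`, the `fetch_from_origin` callback, and the LRU eviction policy are all assumptions introduced here.

```python
from collections import OrderedDict


class EdgePop:
    """Toy pull-through cache for one POP (hypothetical, LRU eviction assumed)."""

    def __init__(self, fetch_from_origin, capacity):
        self._fetch = fetch_from_origin   # callable: chunk_id -> bytes
        self._capacity = capacity         # max number of chunks held at the edge
        self._cache = OrderedDict()       # chunk_id -> bytes, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, chunk_id):
        if chunk_id in self._cache:
            # Cache hit: serve from the edge and mark as recently used.
            self._cache.move_to_end(chunk_id)
            self.hits += 1
            return self._cache[chunk_id]

        # Cache miss: origin pull, then store for subsequent viewers.
        self.misses += 1
        data = self._fetch(chunk_id)
        self._cache[chunk_id] = data
        if len(self._cache) > self._capacity:
            # Evict the least recently used chunk to stay within capacity.
            self._cache.popitem(last=False)
        return data

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A usage sketch: construct `EdgePop(fetch_from_origin=some_http_fetch, capacity=10_000)`, call `get(chunk_id)` for each viewer request, and read `hit_ratio()` to see how much traffic the origin is spared. The callback `some_http_fetch` stands in for whatever client fetches from the origin.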

Implication: a 95% cache hit ratio means only 5% of requests reach our origin, which cuts origin bandwidth requirements by 20x.

Netflix takes a different approach with Open Connect: instead of origin pull, it pre-positions content on ISP-hosted appliances during off-peak hours, achieving near-100% cache hit rates. We chose the pull model rather than Netflix's push model because push requires negotiating hardware placement with every ISP, which is only feasible at Netflix's scale with dedicated partnerships. Trade-off: we accept occasional origin-pull latency on cache misses in exchange for operational simplicity and no ISP dependencies.
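The 20x figure follows from the hit ratio: if a fraction h of requests is served at the edge, the origin sees 1 - h of the traffic, so the reduction factor is 1 / (1 - h). The sketch below works that out, assuming every request carries roughly the same number of bytes. The function names and the 1,000 Gbps viewer load are illustrative assumptions.

```python
def origin_offload_factor(hit_ratio):
    """How many times smaller origin traffic is than total viewer traffic."""
    if not 0.0 <= hit_ratio < 1.0:
        raise ValueError("hit_ratio must be in [0, 1)")
    return 1.0 / (1.0 - hit_ratio)


def origin_bandwidth_gbps(viewer_bandwidth_gbps, hit_ratio):
    """Bandwidth the origin must serve, assuming uniform request sizes."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be in [0, 1]")
    return viewer_bandwidth_gbps * (1.0 - hit_ratio)


# Hypothetical load: 1,000 Gbps of viewer traffic at a 95% hit ratio.
print(round(origin_offload_factor(0.95), 2))        # reduction factor
print(round(origin_bandwidth_gbps(1000, 0.95), 2))  # Gbps reaching the origin
```

The relationship is steep near the top. Moving from 95% to 99% raises the factor from about 20x to about 100x, which is why pre-positioned content with near-100% hit rates offloads the origin so heavily.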

Long-tail content with fewer than 10 views per day is often not worth caching because the storage cost at the edge exceeds the bandwidth savings.

What if the interviewer asks about live streaming, where the content does not exist yet to cache? For live streams, we use edge ingest: the streamer pushes to the nearest POP, and the CDN distributes chunks to other POPs within 1 to 2 seconds over its internal backbone network.
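The long-tail cutoff mentioned earlier can be framed as a break-even check: cache a video at a POP only if the daily bandwidth it saves is worth more than the daily cost of holding it there. The sketch below is a deliberately simple model. The per-GB prices are hypothetical placeholders, not real CDN pricing, chosen so that the break-even lands at the 10 views per day used in the text.

```python
def break_even_views_per_day(egress_cost_per_gb, edge_storage_cost_per_gb_day):
    """Daily views at which bandwidth savings equal edge storage cost.

    Assumes every view served from the edge saves one full origin transfer,
    so the video's size cancels out of the comparison.
    """
    if egress_cost_per_gb <= 0:
        raise ValueError("egress_cost_per_gb must be positive")
    return edge_storage_cost_per_gb_day / egress_cost_per_gb


def worth_caching(views_per_day, egress_cost_per_gb, edge_storage_cost_per_gb_day):
    """True if the daily bandwidth savings exceed the daily edge storage cost."""
    savings_per_gb = views_per_day * egress_cost_per_gb
    return savings_per_gb > edge_storage_cost_per_gb_day


# Hypothetical prices: 0.02 per GB of origin egress, 0.20 per GB-day of
# edge storage (treated as the opportunity cost of scarce edge disk).
EGRESS = 0.02
STORAGE = 0.20

print(worth_caching(50, EGRESS, STORAGE))  # popular video
print(worth_caching(3, EGRESS, STORAGE))   # long-tail video
```

With different prices the threshold moves, which is the point to make in an interview: the cutoff is an economic parameter, not a fixed property of the content.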

Why it matters in interviews

Interviewers want to hear us distinguish between origin pull and push models and justify which one we chose. Mentioning the long-tail eviction trade-off shows we think about cost, not only performance.
