
CDN Edge Caching with Track Prefetch

A listener in Mumbai presses play on a track hosted at our US-East origin. Without a CDN, every byte-range request travels halfway around the world and back, adding roughly 250 ms of round-trip latency per request.
We chose CDN edge caching with pull-based origin fetch (not Netflix-style push pre-positioning) because music catalog access follows an extreme power-law distribution: the top 1% of tracks (about 1M songs) account for over 80% of all plays. A track that charts on the Billboard Hot 100 gets played millions of times.
The constraint: we need a sub-200 ms time to first audio byte for 300M daily users across 180 countries.
Heavy replay of the most popular tracks drives a 99%+ cache hit ratio, far higher than video streaming's 95%, because a 3-minute song is replayed thousands of times while a 2-hour movie is typically watched once. Each edge Point of Presence (POP) caches the hot catalog: $1\text{M tracks} \times 10\text{ MB avg} = 10\text{ TB per POP}$.
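The sizing and hit-ratio arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the function names are made up for illustration, and decimal units (1 TB = 1,000,000 MB) are assumed so the result matches the 10 TB figure.

```python
# Back-of-the-envelope sizing for one POP. Constants come from the text;
# function names and the decimal-unit convention are illustrative assumptions.
HOT_TRACKS = 1_000_000   # top 1% of the catalog
AVG_TRACK_MB = 10        # average track size in MB
MB_PER_TB = 1_000_000    # decimal terabytes


def hot_set_tb(tracks: int = HOT_TRACKS, avg_mb: float = AVG_TRACK_MB) -> float:
    """Storage needed to hold the hot catalog at one POP, in TB."""
    return tracks * avg_mb / MB_PER_TB


def origin_requests(total_requests: int, hit_ratio: float) -> float:
    """Requests that miss the edge and fall through to the origin."""
    return total_requests * (1.0 - hit_ratio)


if __name__ == "__main__":
    print(f"hot set per POP: {hot_set_tb():.0f} TB")
    for ratio in (0.99, 0.95):
        misses = origin_requests(1_000_000, ratio)
        print(f"origin fetches per 1M plays at {ratio:.0%}: {misses:,.0f}")
```

The gap between 99% and 95% looks small, but it means the origin absorbs about five times as many misses per million plays at the lower ratio.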
The player also performs track prefetch: when a user is 30 seconds from the end of the current track, the client requests the first 256 KB of the next 2 to 3 tracks in the queue from the nearest POP. Because the opening bytes of the next track are already on the device when it starts, network latency is effectively hidden for sequential listening.
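The prefetch rule can be sketched as a pure planning function on the client. This is a minimal illustration, not the actual player code: `plan_prefetch`, `PrefetchRequest`, and the `already_fetched` set are hypothetical names, and the cap of 3 tracks is one choice within the "2 to 3" range. The byte range uses the standard HTTP `Range` header form.

```python
from dataclasses import dataclass
from typing import AbstractSet, List, Sequence

PREFETCH_WINDOW_S = 30        # start prefetching this close to the end (from the text)
PREFETCH_BYTES = 256 * 1024   # first 256 KB of each upcoming track (from the text)
MAX_PREFETCH_TRACKS = 3       # text says 2 to 3; 3 is an assumption here


@dataclass(frozen=True)
class PrefetchRequest:
    track_id: str
    range_header: str  # value for an HTTP Range header


def plan_prefetch(
    position_s: float,
    duration_s: float,
    queue: Sequence[str],
    already_fetched: AbstractSet[str] = frozenset(),
) -> List[PrefetchRequest]:
    """Return the range requests to issue now, or an empty list if it is too early."""
    if duration_s - position_s > PREFETCH_WINDOW_S:
        return []
    # Look only at the next few queued tracks and skip ones already prefetched.
    upcoming = [t for t in queue[:MAX_PREFETCH_TRACKS] if t not in already_fetched]
    return [
        PrefetchRequest(t, f"bytes=0-{PREFETCH_BYTES - 1}")
        for t in upcoming
    ]
```

Keeping the decision separate from the network call makes the rule easy to test: the player would call `plan_prefetch` on each progress tick and send whatever requests it returns to the nearest POP.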
We chose pull-based caching (not push) because push requires predicting which tracks will trend at each POP, and music trends vary dramatically by region. Trade-off: we accept occasional cold-start latency on the first play of a new release in exchange for zero wasted edge storage on tracks nobody at that POP listens to.
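A pull-based edge cache can be sketched as a pull-through store: nothing is placed at the POP until a listener asks for it. The class below is a simplified illustration under stated assumptions. `PullThroughCache` and `fetch_origin` are hypothetical names, and LRU eviction is an assumption, since the text does not name an eviction policy.

```python
from collections import OrderedDict
from typing import Callable


class PullThroughCache:
    """Edge cache that fetches from the origin only on a miss (LRU eviction assumed)."""

    def __init__(self, fetch_origin: Callable[[str], bytes], capacity_bytes: int) -> None:
        self._fetch_origin = fetch_origin
        self._capacity = capacity_bytes
        self._store: "OrderedDict[str, bytes]" = OrderedDict()
        self._size = 0
        self.hits = 0
        self.misses = 0

    def get(self, track_id: str) -> bytes:
        if track_id in self._store:
            # Hit: mark as most recently used and serve from the edge.
            self._store.move_to_end(track_id)
            self.hits += 1
            return self._store[track_id]
        # Miss: this is the cold-start cost paid on the first play at this POP.
        self.misses += 1
        data = self._fetch_origin(track_id)
        self._store[track_id] = data
        self._size += len(data)
        # Evict least recently used tracks until the POP is back under capacity.
        while self._size > self._capacity and self._store:
            _, evicted = self._store.popitem(last=False)
            self._size -= len(evicted)
        return data
```

Only tracks that someone at this POP actually requested ever occupy space, which is the storage side of the trade-off; the first `get` for each track is the latency side.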
What if the interviewer asks: what happens when a major artist drops an album and every POP gets hammered simultaneously? We use request coalescing: if 10,000 listeners at the same POP request the same new track within a 1-second window, the POP sends one origin fetch and serves all 10,000 from that single pull.
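Request coalescing is sometimes called single-flight: the first request for a key becomes the leader and fetches from the origin, while concurrent requests for the same key wait for its result. The sketch below is illustrative only. `CoalescingFetcher` and `fetch_origin` are hypothetical names, and it coalesces for as long as the origin fetch is in flight rather than over a fixed 1-second window; in a real POP the fetched track would then land in the cache and serve later requests.

```python
import threading
from typing import Callable, Dict, Optional


class _Flight:
    """State shared by the leader and all followers of one in-flight fetch."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.data: Optional[bytes] = None
        self.error: Optional[BaseException] = None


class CoalescingFetcher:
    def __init__(self, fetch_origin: Callable[[str], bytes]) -> None:
        self._fetch_origin = fetch_origin
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Flight] = {}

    def get(self, track_id: str) -> bytes:
        with self._lock:
            flight = self._inflight.get(track_id)
            leader = flight is None
            if leader:
                flight = _Flight()
                self._inflight[track_id] = flight
        if leader:
            try:
                # Only the leader touches the origin.
                flight.data = self._fetch_origin(track_id)
            except Exception as exc:
                flight.error = exc
            finally:
                with self._lock:
                    del self._inflight[track_id]
                flight.done.set()  # wake every follower
        else:
            flight.done.wait()
        if flight.error is not None:
            raise flight.error
        return flight.data
```

If the origin fetch fails, every waiting request sees the same error, so a retry policy would sit above this layer.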
Why it matters in interviews
Interviewers probe why music achieves a 99% CDN cache hit ratio versus video's 95%. Explaining the difference in replay frequency and the 10 TB per POP sizing shows we understand the content delivery economics unique to audio.