Music Streaming Platform
COMMON
Music streaming comes up in Spotify, Apple, and Amazon interviews because it tests adaptive bitrate delivery, CDN economics, and financial-grade event pipelines. It is the kind of system that lets Spotify deliver 100 million tracks to 300 million daily users. You will design byte-range audio streaming (not HLS segments, because audio files are more than 1,000x smaller than video files), a dual-buffer gapless playback engine, and a play count pipeline that handles 12 billion events per day, where every play of 30 seconds or more triggers a royalty payment that must be counted exactly once.
- Design byte-range audio streaming with 5-tier adaptive bitrate and 99% CDN cache hit ratio
- Build a dual-buffer gapless playback engine with sub-50ms crossfade
- Architect an exactly-once play counting pipeline for 12B daily royalty events
Start simple. Build to staff-level.
“I would design a music streaming platform for 300M DAU that handles about 139K play events per second (12 billion per day) from a catalog of 100M tracks. Each track is encoded at 5 quality tiers (24-320 kbps OGG Vorbis), totaling 1.75 PB of catalog storage. CDN edge caching achieves a 99%+ hit ratio because music, unlike video, is replayed thousands of times. The player uses a dual-buffer architecture for gapless playback and prefetches the next 2-3 tracks. Play events flow through Kafka with exactly-once semantics because every 30-second play triggers a royalty payment. Recommendations combine ALS collaborative filtering with an audio-feature CNN for cold-start tracks, with about 480 GB of results cached in Redis.”
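The storage and event-rate figures in the pitch can be checked with a few lines of arithmetic. This is a minimal sketch that assumes every tier of every track averages 3.5 MB (the per-file size quoted later on this page); real tiers differ in size, so treat the result as an order-of-magnitude check. The function and constant names are illustrative.

```python
# Back-of-envelope capacity check using only figures quoted on this page.
TRACKS = 100_000_000            # catalog size
TIERS = 5                       # quality tiers per track
AVG_FILE_MB = 3.5               # assumption: average encoded size per tier
PLAYS_PER_DAY = 12_000_000_000  # daily play events
SECONDS_PER_DAY = 86_400


def catalog_storage_pb(tracks=TRACKS, tiers=TIERS, avg_mb=AVG_FILE_MB):
    """Total catalog storage in petabytes (decimal units, 1 PB = 1e9 MB)."""
    return tracks * tiers * avg_mb / 1e9


def play_events_per_second(plays=PLAYS_PER_DAY):
    """Average play-event rate implied by the daily total."""
    return plays / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"catalog storage: {catalog_storage_pb():.2f} PB")
    print(f"play events: {play_events_per_second():,.0f} per second")
```

The 139K figure is an average arrival rate; peak-hour traffic would be higher, so a real estimate would apply a peak factor on top.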
Audio Codec Selection and Adaptive Bitrate
STANDARD
Why OGG Vorbis and not AAC? Because a royalty-free codec saves millions in licensing fees at 100M+ tracks. The 5-tier ABR switches quality between tracks, not mid-track, because audio artifacts are more noticeable than video quality drops.
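Switching quality only at track boundaries means the tier decision runs once per track, just before the next track is fetched. A minimal sketch of that decision follows. The intermediate tier values and the 0.7 headroom factor are assumptions for illustration; only the 24 and 320 kbps endpoints come from the text.

```python
# Hypothetical 5-tier ladder; only the 24 and 320 kbps endpoints are from the text.
TIERS_KBPS = (24, 64, 96, 160, 320)


def pick_tier(measured_kbps: float, headroom: float = 0.7) -> int:
    """Pick the highest tier that fits within a fraction of measured throughput.

    Intended to be called once per track, before the next track starts
    downloading, so quality never changes in the middle of a song.
    """
    budget = measured_kbps * headroom
    eligible = [tier for tier in TIERS_KBPS if tier <= budget]
    # Fall back to the lowest tier when the connection cannot sustain any tier.
    return max(eligible) if eligible else TIERS_KBPS[0]
```

For example, `pick_tier(200)` leaves a 140 kbps budget and selects the 96 kbps tier, while a very slow link falls back to 24 kbps.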
Core Streaming Design
CDN Edge Caching with Track Prefetch
STANDARD
Why a 99% cache hit ratio and not 95% as with video? Because songs are replayed thousands of times. The top 1% of the catalog (1M tracks) fits in 10 TB per POP. The player prefetches the next 2-3 tracks 30 seconds before the current track ends.
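The prefetch rule can be sketched as a small client-side check, assuming "30 seconds early" means 30 seconds before the current track ends. The function name and parameters are hypothetical.

```python
def tracks_to_prefetch(queue, current_index, remaining_s, lead_s=30.0, depth=3):
    """Return the upcoming track IDs to prefetch, or [] if it is too early.

    queue: ordered list of track IDs in the play queue
    current_index: position of the track now playing
    remaining_s: seconds left in the current track
    lead_s: how long before the track ends prefetching should begin
    depth: how many upcoming tracks to fetch
    """
    if remaining_s > lead_s:
        return []
    start = current_index + 1
    return queue[start:start + depth]
```

Slicing handles the end of the queue naturally: near the last track the function returns fewer than `depth` entries.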
High Level System Design
Gapless Playback and Crossfade Engine
TRICKY
Why dual-buffer and not single-buffer? Because decoder initialization takes 200-500 ms. Buffer A plays while Buffer B decodes the next track 10 seconds early, and the handoff crossfades in under 50 ms.
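The handoff between the two buffers can be illustrated with a short mixing routine. This sketch uses an equal-power (sine/cosine) gain curve, which is one common choice and an assumption here, since the page does not say which curve the engine uses. It also assumes mono float samples at 44.1 kHz, and it covers only the mix at the boundary, not decoder scheduling.

```python
import math


def crossfade_samples(sample_rate_hz=44_100, fade_ms=50):
    """Number of samples covered by a fade of the given length."""
    return sample_rate_hz * fade_ms // 1000


def crossfade(tail, head):
    """Mix the end of buffer A (tail) into the start of buffer B (head).

    Uses equal-power gains so perceived loudness stays roughly constant:
    gain_a**2 + gain_b**2 == 1 at every sample.
    """
    n = min(len(tail), len(head))
    mixed = []
    for i in range(n):
        t = (i + 0.5) / n                   # progress through the fade, 0..1
        gain_a = math.cos(t * math.pi / 2)  # outgoing track fades out
        gain_b = math.sin(t * math.pi / 2)  # incoming track fades in
        mixed.append(tail[i] * gain_a + head[i] * gain_b)
    return mixed
```

Because Buffer B is decoded ahead of time, the `head` samples are already available when the fade starts, which is what keeps decoder startup latency out of the audible path.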
Core Streaming Design
Play Count Pipeline for Royalty Accounting
TRICKY
Why exactly-once Kafka and not at-least-once? Because at 139K events/sec, every duplicated play event becomes an incorrect royalty payment. A play counts once it reaches 30 seconds, the industry-standard threshold.
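One way to see why duplicates matter is to model the counting step as an idempotent consumer. The sketch below is an in-memory illustration of the idea (deduplicate on a unique event ID, then apply the 30-second threshold). It is not Kafka's transactional API, and the `event_id` field is an assumption about the event schema.

```python
from collections import Counter

ROYALTY_THRESHOLD_S = 30  # a play counts once it reaches 30 seconds


class PlayCounter:
    """Counts royalty-eligible plays so that each event has an effect at most once."""

    def __init__(self):
        self._seen = set()       # event IDs already processed
        self.counts = Counter()  # royalty-eligible plays per track

    def apply(self, event_id, track_id, played_s):
        """Process one play event; return True only if it was counted."""
        if event_id in self._seen:
            return False         # redelivered event: ignore
        self._seen.add(event_id)
        if played_s < ROYALTY_THRESHOLD_S:
            return False         # too short to trigger a royalty
        self.counts[track_id] += 1
        return True
```

In a production pipeline the seen-ID set and the counts would need to be updated atomically in durable storage; the in-memory set here only demonstrates the logic.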
Monitoring and Complete System
Personalized Recommendation Engine
TRICKY
Why ALS plus an audio CNN and not pure collaborative filtering? Because the 40K tracks added each day have zero play history. The CNN extracts acoustic embeddings so those tracks can be recommended from a cold start.
High Level System Design
Audio File Chunking and Byte-Range Seeking
STANDARD
Why byte-range requests and not HLS segments? Because a 3.5 MB audio file is more than 1,000x smaller than a 6 GB video. Each track is delivered as a single file, with a seek table for random access.
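A seek then becomes a table lookup followed by an HTTP Range request. The sketch below assumes a seek table of `(time_seconds, byte_offset)` pairs sorted by time. The table layout and function name are illustrative; the `Range: bytes=N-` header syntax is standard HTTP.

```python
import bisect


def range_header_for_seek(seek_table, target_s):
    """Build the HTTP Range header for seeking to target_s.

    seek_table: list of (time_seconds, byte_offset) pairs, sorted by time.
    Picks the last entry at or before the target so decoding starts on a
    known boundary, then requests from that byte to the end of the file.
    """
    times = [t for t, _ in seek_table]
    i = bisect.bisect_right(times, target_s) - 1
    offset = seek_table[max(i, 0)][1]  # clamp seeks before the first entry
    return {"Range": f"bytes={offset}-"}
```

For a table `[(0.0, 0), (10.0, 200_000), (20.0, 400_000)]`, seeking to 15 seconds yields `{"Range": "bytes=200000-"}`, and the player discards the decoded audio between 10 and 15 seconds.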
Core Streaming Design
Offline Sync and Download Management
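STANDARD
Why AES-128-CTR and not AES-CBC? Because CTR derives each block's keystream from the nonce and block counter alone, so the player can decrypt from any seek offset without reading neighboring blocks or handling padding. 30-day DRM license windows balance convenience against rights protection.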
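The random-access property comes down to computing the right counter value for a byte offset. The sketch below shows only that computation and performs no encryption. It assumes a counter block made of an 8-byte nonce followed by a 64-bit big-endian block counter, which is one common layout; the actual layout depends on the DRM scheme.

```python
AES_BLOCK_BYTES = 16  # AES block size is 16 bytes regardless of key length


def ctr_seek_state(byte_offset, nonce, initial_counter=0):
    """Return (counter_block, skip) needed to start decrypting at byte_offset.

    counter_block: the 16-byte input to AES for the block containing the offset
    skip: number of keystream bytes to discard inside that block

    Assumes an 8-byte nonce followed by a 64-bit big-endian block counter.
    """
    if len(nonce) != 8:
        raise ValueError("this sketch expects an 8-byte nonce")
    block_index, skip = divmod(byte_offset, AES_BLOCK_BYTES)
    counter = (initial_counter + block_index) % (1 << 64)
    return nonce + counter.to_bytes(8, "big"), skip
```

Seeking to byte 35, for instance, lands in block 2 with 3 keystream bytes to skip, and nothing before that block has to be read or decrypted.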
Replication and Fault Tolerance
Rights Management and Licensing Metadata
TRICKY
Why denormalize rights by (track_id, country_code)? Because the hot path is a single-key authorization check at 139K/sec. A normalized schema would require 3 joins per play request.
Database Schema
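The denormalized rights check can be pictured as one lookup on a composite key. The sketch below uses an in-memory dictionary as a stand-in for the key-value store. The row fields (`streamable`, `valid_until`) and the sample data are hypothetical, chosen only to show the shape of the hot path.

```python
from datetime import date

# Stand-in for a key-value table keyed by (track_id, country_code).
# Field names and sample rows are hypothetical.
RIGHTS = {
    ("trk_1", "US"): {"streamable": True, "valid_until": date(2030, 1, 1)},
    ("trk_1", "DE"): {"streamable": False, "valid_until": date(2030, 1, 1)},
}


def can_stream(track_id, country_code, today, rights=RIGHTS):
    """Authorize a play with a single key lookup and no joins."""
    row = rights.get((track_id, country_code))
    if row is None:
        return False  # no rights row means not licensed in this country
    return row["streamable"] and today <= row["valid_until"]
```

The trade-off is write amplification: a licensing change for one track fans out to one row per country, which is acceptable because rights change far less often than they are read.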