
YouTube / Video Streaming

VERY COMMON

Video streaming comes up at every FAANG company. It is how YouTube delivers 1 billion hours of video per day to 2 billion monthly users with sub-2-second playback start times. You will design a DAG-based transcoding pipeline that parallelizes FFmpeg workers across 8 resolutions, a CDN layer with a 95%+ cache hit ratio serving 46K views per second, and adaptive bitrate streaming that reduces buffering by 90%.

  • Design a DAG-based transcoding pipeline with parallel chunk encoding
  • Architect CDN delivery for 46K views/sec with 95%+ cache hit ratio
  • Implement adaptive bitrate streaming with HLS/DASH protocols
Google · Netflix · Amazon · Meta · Microsoft · Apple
8 Concepts (deep dives) · 10 Cheat Items (quick ref)
Elevator Pitch (3-minute interview summary)

Video streaming for 800M DAU at 46K views/sec. Upload: resumable chunked upload to S3, then DAG-based transcoding splits into 2-second chunks fanning out to parallel FFmpeg workers across 8 resolutions. A 10-min 1080p video transcodes in under 5 minutes. Viewing: HLS adaptive bitrate from CDN edge POPs with 95%+ cache hit ratio. Metadata in MySQL sharded by video_id. Thumbnails in Bigtable. View counts via Kafka, batch-updated every 30 seconds. Storage grows at 25 GB/sec.

Concepts Unlocked (8 concepts in this topic)

Resumable Upload

EASY

We chose chunked uploads (not single-POST) because large files need fault tolerance. Split into 5-10 MB chunks, track last successful byte offset on the server. On interruption, resume from where we left off. Maximum wasted transfer per drop: 10 MB instead of the full file.

Core Feature Design
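The offset-tracking resume described above can be sketched as follows. This is a minimal in-memory simulation, assuming a hypothetical upload service that records the last committed byte offset per upload session; a real system would persist offsets server-side and upload to S3.

```python
# Sketch of resumable chunked upload. UploadSession is a hypothetical
# in-memory stand-in for the server-side offset tracker.
CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB chunks (design uses 5-10 MB)

class UploadSession:
    """Server-side state: the bytes committed so far for one upload."""
    def __init__(self):
        self.committed = bytearray()

    def offset(self) -> int:
        # Client asks: "where did we leave off?"
        return len(self.committed)

    def put_chunk(self, offset: int, data: bytes) -> None:
        assert offset == len(self.committed), "out-of-order chunk"
        self.committed.extend(data)

def upload(session: UploadSession, blob: bytes) -> None:
    # Resume from the server's last committed offset, not from byte 0,
    # so a dropped connection wastes at most one chunk of transfer.
    pos = session.offset()
    while pos < len(blob):
        chunk = blob[pos:pos + CHUNK_SIZE]
        session.put_chunk(pos, chunk)
        pos += len(chunk)
```

After an interruption the client simply calls `upload` again; it queries the offset and continues from there.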

Thumbnail Generation

EASY

We chose Bigtable (not S3) for thumbnails because we need sub-10ms random reads at scale. Extract 5 candidate frames at 20/40/50/60/80% of video duration using FFmpeg. Store as 5 KB JPEGs. 1 billion videos x 5 thumbnails = 25 TB total.

Core Feature Design
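The candidate-frame extraction can be sketched as below: compute the five timestamps from the video duration, then build one FFmpeg invocation per frame. The file paths are hypothetical; the percentages come from the design above.

```python
# Candidate-frame positions per the design: 20/40/50/60/80% of duration.
CANDIDATE_POSITIONS = [0.20, 0.40, 0.50, 0.60, 0.80]

def thumbnail_timestamps(duration_sec: float) -> list[float]:
    """Timestamps (seconds) at which to grab candidate thumbnails."""
    return [round(duration_sec * p, 2) for p in CANDIDATE_POSITIONS]

def ffmpeg_cmd(src: str, ts: float, out: str) -> list[str]:
    # -ss before -i makes FFmpeg seek before decoding (fast);
    # -frames:v 1 grabs a single frame at that position.
    return ["ffmpeg", "-ss", str(ts), "-i", src, "-frames:v", "1", out]
```

For a 10-minute (600 s) video this yields frames at 120, 240, 300, 360, and 480 seconds, each saved as a small JPEG keyed by video_id in Bigtable.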

Adaptive Bitrate Streaming

STANDARD

We chose ABR with HLS/DASH (not fixed-quality delivery) so the player adapts to each viewer's bandwidth. The player measures bandwidth every few seconds and picks the highest rendition that fits. Quality switches happen at segment boundaries (every 2-10 seconds) without rebuffering.

Core Feature Design
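The player-side rendition pick can be sketched as below. The bitrate ladder is hypothetical, and real players (hls.js, ExoPlayer) layer buffer-based rules and smoothing on top; this shows only the core throughput rule.

```python
# Hypothetical 8-rung bitrate ladder (kbps), lowest to highest quality.
LADDER_KBPS = [235, 375, 750, 1750, 3000, 4300, 5800, 8000]

def pick_rendition(measured_kbps: float, safety: float = 0.8) -> int:
    """Highest rendition whose bitrate fits within a safety margin of
    the measured bandwidth; fall back to the lowest rung otherwise."""
    budget = measured_kbps * safety
    fitting = [b for b in LADDER_KBPS if b <= budget]
    return max(fitting) if fitting else LADDER_KBPS[0]
```

The player re-measures every few seconds and applies the switch only at the next segment boundary, which is what keeps quality changes rebuffer-free.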

CDN Edge Caching

STANDARD

We chose a tiered CDN strategy (not uniform caching) because video popularity follows a power law. Cache segments at 100+ edge POPs. Achieve 95%+ cache hit ratio by combining origin pull for long-tail content with push pre-warming for viral videos. Reduces origin bandwidth from 115 TB/sec to under 6 TB/sec.

High Level Design
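The origin-bandwidth claim above is a one-line calculation worth checking in an interview: only cache misses reach origin, so origin load is total egress times the miss ratio.

```python
def origin_bandwidth_tb_s(total_egress_tb_s: float, hit_ratio: float) -> float:
    """Bandwidth that falls through the edge to origin: misses only."""
    return total_egress_tb_s * (1.0 - hit_ratio)
```

With 115 TB/sec of total egress and a 95% edge hit ratio, origin sees about 5.75 TB/sec, matching the "under 6 TB/sec" figure above.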

Video Chunking and GOP Alignment

STANDARD

We chose GOP-aligned segments (not arbitrary time splits) because each segment must start with an I-frame to be independently decodable. Split video into 2-10 second segments aligned to Group of Pictures boundaries. This enables both ABR quality switching and parallel transcoding.

Core Feature Design
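The boundary-selection step can be sketched as below, assuming the encoder has already produced a list of I-frame (keyframe) timestamps: greedily cut at the first I-frame that is at least one target segment length past the previous cut, so every segment starts on an independently decodable frame.

```python
def gop_aligned_boundaries(iframe_ts: list[float],
                           target_len: float = 2.0) -> list[float]:
    """Segment start times chosen only from I-frame timestamps.

    Greedy rule: cut at the first I-frame at least target_len seconds
    after the previous cut. Segments may run slightly long, never short.
    """
    boundaries = [iframe_ts[0]]
    for ts in iframe_ts[1:]:
        if ts - boundaries[-1] >= target_len:
            boundaries.append(ts)
    return boundaries
```

Because every transcoding worker and every ABR rendition cuts at the same I-frame timestamps, chunks encode independently in parallel and the player can switch quality at any boundary.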

Video Metadata Store

STANDARD

We chose MySQL sharded by video_id (not user_id) because the hot path is metadata lookup by video_id. Store 10 KB of metadata per video. Cache hot videos in Redis with 1-hour TTL. At 46K reads/sec, a 95% cache hit ratio keeps MySQL at 2,300 QPS.

Database Design
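The hot-path read can be sketched as a cache-aside lookup: Redis first, and on a miss, the MySQL shard chosen by a stable hash of video_id. The shard count and the cache/shard client objects here are hypothetical stand-ins.

```python
import zlib

NUM_SHARDS = 16  # hypothetical shard count

def shard_for(video_id: str) -> int:
    # Stable hash of video_id (not user_id): the hot path is
    # lookup-by-video, and this spreads one channel's videos evenly.
    return zlib.crc32(video_id.encode()) % NUM_SHARDS

def get_metadata(video_id, cache, shards):
    """Cache-aside read: Redis first, MySQL shard only on a miss."""
    meta = cache.get(video_id)
    if meta is None:  # ~5% of reads at a 95% cache hit ratio
        meta = shards[shard_for(video_id)].query(video_id)
        cache.set(video_id, meta, ttl=3600)  # 1-hour TTL per the design
    return meta
```

At 46K reads/sec, the `meta is None` branch fires roughly 2,300 times per second, which is the MySQL QPS figure quoted above.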

Video Transcoding Pipeline

TRICKY

We chose chunk-based parallel encoding (not single-file sequential) because a 4-hour single-machine job becomes a 5-minute distributed job. A DAG-based pipeline splits video into GOP-aligned chunks, fans them to parallel FFmpeg workers via a message queue, and encodes each chunk into 8 resolutions.

Core Feature Design
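The fan-out stage of the DAG can be sketched as below: one encode task per (chunk, rendition) pair, all independent, all pushed to a queue for workers to pull in parallel. The task shape and rendition names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical 8-rendition ladder matching "8 resolutions" above.
RENDITIONS = ["144p", "240p", "360p", "480p", "720p",
              "1080p", "1440p", "2160p"]

@dataclass(frozen=True)
class EncodeTask:
    """One unit of work for an FFmpeg worker: encode one GOP-aligned
    chunk of one video into one target rendition."""
    video_id: str
    chunk_index: int
    rendition: str

def fan_out(video_id: str, num_chunks: int) -> list[EncodeTask]:
    # A 10-minute video in 2-second chunks => 300 chunks x 8 renditions
    # = 2,400 independent tasks; wall-clock time is bounded by the
    # slowest chunk, not the full video length.
    return [EncodeTask(video_id, i, r)
            for i in range(num_chunks)
            for r in RENDITIONS]
```

A downstream join node in the DAG waits for all tasks of a video, then stitches per-rendition chunks into HLS/DASH playlists.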

Video Deduplication

TRICKY

We chose perceptual hashing (not file-hash dedup) because file hashes miss re-encoded copies. Extract keyframes, compute a visual fingerprint, and compare against a fingerprint database. Perceptual hashing catches videos with different codecs, resolutions, or minor edits that are visually identical.

Fault Tolerance
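A minimal perceptual-hash sketch is shown below, using average hash (aHash) as a stand-in for the fingerprinting step. A real pipeline would first decode a keyframe and downscale it to an 8x8 grayscale grid; here that grid of pixel intensities is taken as input.

```python
def ahash(grid: list[list[int]]) -> int:
    """64-bit fingerprint of an 8x8 grayscale grid: bit is 1 where the
    pixel is brighter than the grid's mean intensity."""
    flat = [p for row in grid for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

def visually_same(a: int, b: int, threshold: int = 5) -> bool:
    # Re-encodes and minor edits perturb only a few bits, so a small
    # Hamming distance flags a likely duplicate; the threshold is a
    # tunable assumption.
    return hamming(a, b) <= threshold
```

Unlike a file hash, this survives codec and resolution changes: the downscaled brightness pattern, and hence the fingerprint, stays nearly identical across visually identical copies.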