🎬 Design Video Streaming — System Design Interview Guide

Hard · Media & CDN

Design a video streaming platform like YouTube or Netflix where users can upload videos, which are transcoded and served at scale to global audiences with adaptive bitrate streaming.

Functional requirements

Non-functional requirements & scale

Capacity estimation

Uploads: 500 hours of video per minute = 30,000 minutes of video arriving every wall-clock minute. Each raw video → 5 transcoded resolutions. Raw storage: 500 hours/min × 60 × 24 × 365 ≈ 263M hours/year; at an average of 1 GB/hour that is ≈ 263 PB/year, before the transcoded variants are counted. Transcoding is CPU-intensive and must be parallelized.
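The arithmetic above can be checked with a short script. This is a back-of-envelope sketch: the constant and function names are made up for illustration, and it assumes decimal units (1 PB = 1,000,000 GB) and counts raw uploads only.

```python
# Back-of-envelope capacity estimate using the figures quoted in the text.
UPLOAD_HOURS_PER_MIN = 500   # hours of video uploaded per wall-clock minute
AVG_GB_PER_HOUR = 1          # average size of one hour of raw video (from the text)
MINUTES_PER_YEAR = 60 * 24 * 365


def video_minutes_per_minute(upload_hours_per_min=UPLOAD_HOURS_PER_MIN):
    """Minutes of video arriving per wall-clock minute."""
    return upload_hours_per_min * 60


def hours_uploaded_per_year(upload_hours_per_min=UPLOAD_HOURS_PER_MIN):
    """Total hours of video uploaded in one year."""
    return upload_hours_per_min * MINUTES_PER_YEAR


def raw_storage_pb_per_year(hours, gb_per_hour=AVG_GB_PER_HOUR):
    """Raw (pre-transcode) storage in PB, assuming 1 PB = 1,000,000 GB."""
    return hours * gb_per_hour / 1_000_000


hours = hours_uploaded_per_year()
print(video_minutes_per_minute())      # minutes of video per minute
print(hours)                           # hours of video per year
print(raw_storage_pb_per_year(hours))  # PB per year, raw uploads only
```

The exact product is 262.8M hours/year, or roughly 260 PB/year of raw video. The five transcoded variants add to this; how much depends on the bitrate ladder, which the text does not specify.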

Core entities

API design

High-level design

Upload: client → S3 (pre-signed URL) → S3 event → SQS → Transcoding Workers → output variants to S3 → update metadata DB. Playback: client fetches HLS manifest from CDN → streams video segments from CDN edge nodes.
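As a rough illustration of the "S3 event → SQS" step, the sketch below turns an S3 event notification into transcoding job messages. The event fields (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`) follow the S3 notification format. The job message layout and the function name are hypothetical, and the bucket name in the usage is a placeholder.

```python
import json
from urllib.parse import unquote_plus


def jobs_from_s3_event(event):
    """Build one transcoding job message per newly created object.

    The output dict layout is an assumption made for this sketch.
    """
    jobs = []
    for record in event.get("Records", []):
        # Only react to uploads, not deletes or other event types.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append({"bucket": bucket, "key": key, "status": "PENDING"})
    return jobs


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "raw-uploads"},
                "object": {"key": "user42/my+holiday+video.mp4"},
            },
        }
    ]
}

for job in jobs_from_s3_event(sample_event):
    # In a real system this body would be sent to the queue.
    print(json.dumps(job))
```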

Deep dives

🎞️ Video Transcoding Pipeline

Split the source video into 10-second chunks → encode the chunks in parallel across a worker fleet using FFmpeg → assemble the outputs into HLS/DASH segments. For a 1-hour video: 360 chunks × 5 resolutions = 1,800 jobs, distributed via SQS. Workers run on EC2 spot instances to cut cost. Estimated total transcode time: ~15 min for a 1-hour video with 100 workers.
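The fan-out described above can be sketched as a job planner. The resolution ladder and all names here are illustrative assumptions. The 50 seconds per job used in the time estimate is not from the source; it is simply the value that makes 1,800 jobs on 100 workers come out at about 15 minutes.

```python
import math
from dataclasses import dataclass

# Assumed 5-step resolution ladder; the text only says "5 resolutions".
RESOLUTIONS = ("240p", "360p", "480p", "720p", "1080p")


@dataclass(frozen=True)
class TranscodeJob:
    video_id: str
    chunk_index: int
    start_sec: int
    duration_sec: int
    resolution: str


def plan_jobs(video_id, duration_sec, chunk_sec=10, resolutions=RESOLUTIONS):
    """Create one job per (chunk, resolution) pair."""
    n_chunks = math.ceil(duration_sec / chunk_sec)
    jobs = []
    for i in range(n_chunks):
        start = i * chunk_sec
        # The last chunk may be shorter than chunk_sec.
        length = min(chunk_sec, duration_sec - start)
        for res in resolutions:
            jobs.append(TranscodeJob(video_id, i, start, length, res))
    return jobs


def estimate_minutes(n_jobs, workers, sec_per_job):
    """Wall-clock estimate assuming equal job cost and no queueing overhead."""
    rounds = math.ceil(n_jobs / workers)
    return rounds * sec_per_job / 60


jobs = plan_jobs("video-123", duration_sec=3600)
print(len(jobs))
print(estimate_minutes(len(jobs), workers=100, sec_per_job=50))
```

Each job would then be sent as one queue message. Because every job names its own chunk and resolution, a worker can be retried or replaced without affecting the others.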

📡 Adaptive Bitrate Streaming (HLS)

The HLS manifest (.m3u8) lists all quality variants. The video player monitors bandwidth: if a segment's download time exceeds a threshold, it switches to a lower quality. Segments are 6-10 s long and are cached by the CDN. Key point: every segment of a video lives at a fixed CDN path, so the URL is the same regardless of viewer location.
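A minimal sketch of both halves of this idea: building a master playlist that lists the variants, and a simple switching rule driven by segment download time. `#EXTM3U` and `#EXT-X-STREAM-INF` are standard HLS tags. The bitrate ladder, the playlist paths, and the switching thresholds are assumptions chosen for illustration; real players use more elaborate heuristics.

```python
# Assumed rendition ladder: (bandwidth in bits/s, resolution, variant playlist path).
LADDER = [
    (400_000, "426x240", "240p/index.m3u8"),
    (800_000, "640x360", "360p/index.m3u8"),
    (1_400_000, "854x480", "480p/index.m3u8"),
    (2_800_000, "1280x720", "720p/index.m3u8"),
    (5_000_000, "1920x1080", "1080p/index.m3u8"),
]


def master_playlist(ladder=LADDER):
    """Render an HLS master playlist listing every quality variant."""
    lines = ["#EXTM3U"]
    for bandwidth, resolution, uri in ladder:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}"
        )
        lines.append(uri)
    return "\n".join(lines) + "\n"


def next_rendition(current_idx, segment_sec, download_sec, ladder_len,
                   down_ratio=0.8, up_ratio=0.4):
    """Pick the ladder index for the next segment.

    Step down when downloading took most of the segment's playback time,
    step up when there was plenty of headroom. Thresholds are illustrative.
    """
    ratio = download_sec / segment_sec
    if ratio > down_ratio and current_idx > 0:
        return current_idx - 1
    if ratio < up_ratio and current_idx < ladder_len - 1:
        return current_idx + 1
    return current_idx


print(master_playlist())
# A 6 s segment that took 5.5 s to download triggers a step down from 720p.
print(next_rendition(3, segment_sec=6, download_sec=5.5, ladder_len=len(LADDER)))
```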

🌍 CDN Strategy

Popular videos: cached at the CDN edge. Video segments are immutable after transcoding, so they are served with Cache-Control: max-age=31536000 (1 year). Long-tail videos (rarely watched): served from the S3 origin. The CDN handles 95%+ of traffic. Netflix uses Open Connect Appliances: dedicated CDN hardware placed inside ISP networks, so video is served from within the ISP instead of crossing the public internet.
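One way the origin might assign the caching header is sketched below. The one-year `max-age` comes from the text. The `public` and `immutable` directives, the segment file extensions, and the `no-cache` fallback for everything else are assumptions added for this example.

```python
ONE_YEAR_SEC = 365 * 24 * 60 * 60  # 31,536,000 seconds

# Assumed segment extensions: MPEG-TS and fragmented MP4 segments.
SEGMENT_SUFFIXES = (".ts", ".m4s")


def cache_headers(path):
    """Return response headers for an object served from the origin."""
    if path.endswith(SEGMENT_SUFFIXES):
        # Segments never change after transcoding, so cache them for a year.
        value = f"public, max-age={ONE_YEAR_SEC}, immutable"
    else:
        # Assumption: anything else is revalidated on each request.
        value = "no-cache"
    return {"Cache-Control": value}


print(cache_headers("/videos/video-123/720p/seg_00042.ts"))
print(cache_headers("/videos/video-123/thumbnail.jpg"))
```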

📊 View Count at Scale

Naive: incrementing a database counter on every view creates a write hotspot. Solution: accept approximate counts. Use Redis INCR per videoId (fast, in-memory) and flush the totals to the DB periodically (every 30s). Publish view events to Kafka and aggregate them in real time with Flink. Dedup views: the same user watching the same video within a 30 s window counts as 1 view.
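The dedup-and-flush logic can be sketched with an in-memory stand-in for Redis. The class and method names are hypothetical. In a Redis-backed version, the dedup check would typically map to `SET key value NX EX 30` and the counter to `INCR`. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict


class ViewCounter:
    """In-memory stand-in for the Redis counters described above."""

    def __init__(self, dedup_window_sec=30):
        self.dedup_window_sec = dedup_window_sec
        self._last_counted = {}           # (user_id, video_id) -> timestamp
        self._pending = defaultdict(int)  # video_id -> views not yet flushed

    def record_view(self, user_id, video_id, ts):
        """Count the view unless this user viewed this video within the window."""
        key = (user_id, video_id)
        last = self._last_counted.get(key)
        if last is not None and ts - last < self.dedup_window_sec:
            return False  # duplicate inside the window, ignored
        self._last_counted[key] = ts
        self._pending[video_id] += 1
        return True

    def flush(self):
        """Return accumulated deltas and reset them.

        A scheduler would call this every 30 s and apply the deltas to the DB.
        """
        deltas = dict(self._pending)
        self._pending.clear()
        return deltas


counter = ViewCounter()
counter.record_view("u1", "v1", ts=0.0)    # counted
counter.record_view("u1", "v1", ts=10.0)   # duplicate within 30 s, ignored
counter.record_view("u2", "v1", ts=10.0)   # different user, counted
counter.record_view("u1", "v1", ts=31.0)   # window elapsed, counted
print(counter.flush())
```

Flushing deltas rather than absolute totals means the database sees one write per video per interval, which is what removes the hotspot.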

Scaling considerations

What interviewers expect by level
