🎬 Design Video Streaming — System Design Interview Guide

Hard · Media & CDN

Design a video streaming platform like YouTube or Netflix where users can upload videos, which are transcoded and served at scale to global audiences with adaptive bitrate streaming.


Functional requirements

Users can upload videos (up to 10GB)
Videos are transcoded into multiple resolutions (such as 360p, 720p, 1080p, and 4K)
Videos are streamed with adaptive bitrate (ABR) based on network conditions
Users can search, browse, and watch videos
Support comments, likes, view counts
Recommendation feed on homepage

Non-functional requirements & scale

2B users, 500M hours of video watched per day
500 hours of video uploaded every minute (YouTube scale)
Video start time < 2 seconds (P95)
Buffering ratio < 0.5%
CDN cache hit ratio > 95% for popular content
Uploads must be resumable on network failure

Capacity estimation

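Uploads: 500 hours/min = 30,000 minutes of new video arriving every minute (this is a volume of footage, not a file count). Each raw video → 5 transcoded renditions. Storage: 500 hours/min × 60 × 24 × 365 ≈ 263M hours/year; at an assumed average of 1 GB per hour of video that is ≈ 263 PB/year, rounded to 260 PB/year in the rest of this guide. Transcoding is CPU-intensive and must be parallelized.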
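The arithmetic above can be checked with a few lines of Python. This is only a sketch of the estimate: the upload rate and the 1 GB/hour average come from the text, decimal units (1 PB = 1,000,000 GB) are assumed, and the function names are made up for illustration.

```python
UPLOAD_HOURS_PER_MIN = 500          # from the non-functional requirements
AVG_GB_PER_HOUR = 1.0               # average size assumed in the estimate
MINUTES_PER_YEAR = 60 * 24 * 365    # 525,600


def yearly_upload_hours(hours_per_min: float = UPLOAD_HOURS_PER_MIN) -> float:
    """Hours of video uploaded per year at a constant upload rate."""
    return hours_per_min * MINUTES_PER_YEAR


def yearly_storage_pb(hours: float, gb_per_hour: float = AVG_GB_PER_HOUR) -> float:
    """Storage in petabytes, using decimal units (1 PB = 1,000,000 GB)."""
    return hours * gb_per_hour / 1_000_000


hours = yearly_upload_hours()
print(f"{hours / 1e6:.1f}M hours/year")             # 262.8M hours/year
print(f"{yearly_storage_pb(hours):.1f} PB/year")    # 262.8 PB/year
```

Changing AVG_GB_PER_HOUR is the quickest way to see how sensitive the yearly total is to the bitrate assumption.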

Core entities

Video — videoId, uploaderId, title, description, status (processing|ready), duration, viewCount, createdAt
VideoVariant — variantId, videoId, resolution, bitrate, s3Key, status
User — userId, channelName, subscriberCount, watchHistory[]
Comment — commentId, videoId, userId, content, createdAt, likeCount

API design

POST /api/v1/videos/upload-url — Get pre-signed S3 URL for direct upload. Returns { uploadUrl, videoId }.
PUT /upload/s3 (pre-signed) — Client uploads directly to S3. Triggers Lambda → SQS → Transcoder.
GET /api/v1/videos/:id/manifest.m3u8 — Returns HLS adaptive bitrate manifest with all quality levels.
GET /api/v1/search?q=&cursor= — Full-text video search via Elasticsearch.
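Resumable uploads of large files usually rely on S3 multipart upload, because a single S3 PUT is limited to 5 GB. The sketch below only plans the byte ranges and works out which parts still need sending after a failure; it does not call S3. The limits (5 MiB minimum part size except for the last part, 5 GiB maximum, 10,000 parts) are S3's documented ones, while the 64 MiB default and the function names are assumptions for illustration.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
S3_MIN_PART = 5 * MIB      # minimum size of every part except the last
S3_MAX_PART = 5 * GIB      # maximum size of a single part
S3_MAX_PARTS = 10_000      # maximum number of parts per upload


def plan_parts(file_size: int, part_size: int = 64 * MIB) -> list[tuple[int, int, int]]:
    """Return (part_number, start_byte, end_byte_exclusive) for each part."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if not S3_MIN_PART <= part_size <= S3_MAX_PART:
        raise ValueError("part_size outside S3 limits")
    count = math.ceil(file_size / part_size)
    if count > S3_MAX_PARTS:
        raise ValueError("too many parts; increase part_size")
    return [
        (i + 1, i * part_size, min((i + 1) * part_size, file_size))
        for i in range(count)
    ]


def remaining_parts(plan, uploaded: set[int]):
    """Parts still to send after a network failure (resume point)."""
    return [part for part in plan if part[0] not in uploaded]


plan = plan_parts(10 * GIB)
print(len(plan))                                # 160
print(len(remaining_parts(plan, {1, 2, 3})))    # 157
```

In a real client each planned part would be sent to its own pre-signed URL, and the set of uploaded part numbers would be persisted so the upload can resume after a restart.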

High-level design

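Upload: client → S3 (pre-signed URL; S3 multipart upload for files above the 5 GB single-PUT limit, which also lets a failed upload resume from the last completed part) → S3 event → SQS → transcoding workers → output variants to S3 → update metadata DB. Playback: client fetches the HLS manifest from the CDN → streams video segments from CDN edge nodes.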

Deep dives

🎞️ Video Transcoding Pipeline

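Split the video into 10-second chunks → encode the chunks in parallel across a worker fleet → assemble the outputs into HLS/DASH segments. Use FFmpeg for encoding. For a 1-hour video: 3,600 s ÷ 10 s = 360 chunks × 5 renditions = 1,800 jobs, distributed via SQS. Workers run on EC2 spot instances for cost savings; because spot instances can be reclaimed at short notice, jobs should be idempotent so an interrupted chunk is simply retried. Estimated total transcode time: ~15 min for a 1-hour video at 100 workers, which works out to 18 jobs per worker at roughly 50 s per job.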
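The fan-out into per-chunk, per-rendition jobs can be sketched as below. The job fields and the five-rung ladder are hypothetical (the guide says five renditions without listing all of them); in practice each dictionary would be serialized as one SQS message.

```python
import math
from itertools import product

CHUNK_SECONDS = 10
# Hypothetical ladder of five renditions, used only for illustration.
RENDITIONS = ["360p", "480p", "720p", "1080p", "2160p"]


def build_jobs(video_id: str, duration_s: float) -> list[dict]:
    """One job per (chunk, rendition) pair; the last chunk may be shorter."""
    chunks = math.ceil(duration_s / CHUNK_SECONDS)
    return [
        {
            "videoId": video_id,
            "chunk": chunk,
            "startSeconds": chunk * CHUNK_SECONDS,
            "rendition": rendition,
        }
        for chunk, rendition in product(range(chunks), RENDITIONS)
    ]


jobs = build_jobs("vid123", 3600)
print(len(jobs))    # 1800
```

Keeping the chunk index and rendition in every job makes each job idempotent: a retried job overwrites the same output object instead of creating a duplicate.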

📡 Adaptive Bitrate Streaming (HLS)

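The HLS master manifest (.m3u8) lists every quality variant, each pointing to its own media playlist. The video player monitors bandwidth: if a segment takes too long to download relative to its duration, the player switches to a lower quality, and it switches back up when throughput recovers. Segments are 6-10 s long and are cached by the CDN. Key point: every segment of a video lives at a fixed CDN path, so the URL is the same regardless of viewer location.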
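To make the manifest and the switching rule concrete, here is a minimal sketch that builds an HLS master playlist and picks a rendition from measured throughput. The bitrates, the relative playlist paths, and the 0.8 safety factor are illustrative assumptions, and real players use more elaborate heuristics (buffer level, throughput history) than this single-threshold rule.

```python
# (name, width, height, bits per second); illustrative values, not from the guide
LADDER = [
    ("360p", 640, 360, 800_000),
    ("720p", 1280, 720, 2_800_000),
    ("1080p", 1920, 1080, 5_000_000),
    ("4k", 3840, 2160, 16_000_000),
]


def master_playlist(ladder=LADDER) -> str:
    """Build an HLS master playlist with one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for name, width, height, bps in ladder:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bps},RESOLUTION={width}x{height}")
        lines.append(f"{name}/playlist.m3u8")   # hypothetical relative path
    return "\n".join(lines) + "\n"


def pick_rendition(measured_bps: float, ladder=LADDER, safety: float = 0.8) -> str:
    """Highest rendition whose bitrate fits within a fraction of throughput."""
    budget = measured_bps * safety
    fitting = [r for r in ladder if r[3] <= budget]
    chosen = max(fitting, key=lambda r: r[3]) if fitting else min(ladder, key=lambda r: r[3])
    return chosen[0]


print(master_playlist())
print(pick_rendition(4_000_000))    # 720p
print(pick_rendition(500_000))      # 360p
```

The safety factor leaves headroom so that a small dip in bandwidth does not immediately cause a stall.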

🌍 CDN Strategy

Popular videos are cached at the CDN edge. Video segments are immutable after transcode, so serve them with Cache-Control: max-age=31536000 (1 year). Long-tail videos (rarely watched) mostly miss the edge cache and are fetched from the S3 origin. The CDN serves 95%+ of traffic. Netflix uses Open Connect Appliances: dedicated CDN hardware placed inside ISP networks, so most traffic is delivered without leaving the ISP's network.
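One way to express the caching policy is a small function that picks response headers by object type. Only the one-year max-age for segments comes from the text; the file extensions, the short manifest TTL, and the no-store default are assumptions for illustration.

```python
ONE_YEAR_S = 31_536_000     # max-age for immutable segments, from the text


def cache_headers(path: str) -> dict[str, str]:
    """Pick Cache-Control by object type (sketch; TTLs other than segments are assumed)."""
    if path.endswith((".ts", ".m4s")):
        # Segments never change after transcode, so cache them for a year.
        return {"Cache-Control": f"public, max-age={ONE_YEAR_S}, immutable"}
    if path.endswith(".m3u8"):
        # Assumed short TTL so a re-transcoded ladder propagates quickly.
        return {"Cache-Control": "public, max-age=60"}
    # Assumed default: do not cache anything else.
    return {"Cache-Control": "no-store"}


print(cache_headers("videos/vid123/720p/seg_00001.ts"))
# {'Cache-Control': 'public, max-age=31536000, immutable'}
```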

📊 View Count at Scale

Naive approach: increment a database counter on every view, which creates a write hotspot on popular videos. Solution: accept approximate counts. Use Redis INCR per videoId (fast, in-memory) and flush the totals to the DB periodically (every 30 s). For real-time aggregation, publish view events to Kafka and process them with Flink. Deduplicate views: the same user watching the same video within a 30 s window counts as 1 view.
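The buffering and dedup logic can be sketched with an in-memory stand-in, which keeps the example runnable without a Redis server. The class and method names are hypothetical. In a Redis-backed version, the pending counter would map to INCR on a per-video key and the dedup check to SET with the NX and EX 30 options on a per-(user, video) key.

```python
from collections import defaultdict

DEDUP_WINDOW_S = 30


class ViewCounter:
    """In-memory stand-in for the Redis counter plus periodic DB flush."""

    def __init__(self) -> None:
        self._last_counted = {}             # (user_id, video_id) -> time of last counted view
        self._pending = defaultdict(int)    # video_id -> views not yet flushed
        self.db = defaultdict(int)          # stand-in for the durable store

    def record(self, user_id: str, video_id: str, ts: float) -> bool:
        """Count the view unless this user was counted within the dedup window."""
        key = (user_id, video_id)
        last = self._last_counted.get(key)
        if last is not None and ts - last < DEDUP_WINDOW_S:
            return False
        self._last_counted[key] = ts
        self._pending[video_id] += 1
        return True

    def flush(self) -> None:
        """Move buffered counts into the durable store (run every ~30 s)."""
        for video_id, count in self._pending.items():
            self.db[video_id] += count
        self._pending.clear()


counter = ViewCounter()
counter.record("u1", "v1", ts=0)     # counted
counter.record("u1", "v1", ts=10)    # duplicate within 30 s, ignored
counter.record("u2", "v1", ts=12)    # different user, counted
counter.flush()
print(counter.db["v1"])              # 2
```

The database then sees one write per video per flush interval instead of one write per view, which is what removes the hotspot.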

Scaling considerations

Video storage in S3 with intelligent tiering (hot/warm/cold by view frequency)
CDN with anycast routing, so each user automatically reaches the nearest POP
Transcoding auto-scales with SQS queue depth using EC2 spot fleet
MySQL sharded by videoId for metadata; read replicas for serving
Elasticsearch for search; update the index asynchronously after transcoding completes
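Scaling the transcoding fleet on SQS queue depth comes down to a simple sizing rule: run enough workers to drain the backlog within a target time. The sketch below assumes a hypothetical average job time, drain target, and fleet limits; none of these values come from an AWS API.

```python
import math


def desired_workers(
    queue_depth: int,
    avg_job_s: float,
    target_drain_s: float,
    min_workers: int = 1,
    max_workers: int = 1000,
) -> int:
    """Workers needed to drain the backlog within target_drain_s, clamped to fleet limits."""
    needed = math.ceil(queue_depth * avg_job_s / target_drain_s)
    return max(min_workers, min(max_workers, needed))


# 1,800 queued jobs at an assumed 50 s each, drained within 15 minutes:
print(desired_workers(1800, avg_job_s=50, target_drain_s=900))    # 100
```

A scaling policy would evaluate this periodically against the queue's visible message count and adjust the spot fleet's target capacity accordingly.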

What interviewers expect by level

Junior: Describe upload-to-S3, transcode flow, CDN for serving. Understand why pre-signed URLs are used.
Mid: Design HLS adaptive streaming, SQS-driven transcode pipeline, chunked parallel encoding, CDN caching strategy.
Senior: Full pipeline with resumable uploads (S3 multipart), cost-optimized spot transcoding, CDN cache invalidation, view count at scale.
Staff: Multi-CDN failover strategy, P2P streaming for live events, cost modeling for 260PB/year storage, GDPR video deletion.
