🎬 Design Video Streaming — System Design Interview Guide
Hard · Media & CDN
Design a video streaming platform like YouTube or Netflix where users can upload videos, which are transcoded and served at scale to global audiences with adaptive bitrate streaming.
Functional requirements
- Users can upload videos (up to 10GB)
- Videos are transcoded into multiple resolutions (360p, 720p, 1080p, 4K)
- Videos are streamed with adaptive bitrate (ABR) based on network conditions
- Users can search, browse, and watch videos
- Support comments, likes, view counts
- Recommendation feed on homepage
Non-functional requirements & scale
- 2B users, 500M hours of video watched per day
- 500 hours of video uploaded every minute (YouTube scale)
- Video start time < 2 seconds (P95)
- Buffering ratio < 0.5%
- CDN cache hit ratio > 95% for popular content
- Uploads must be resumable on network failure
Capacity estimation
500 hours/min of uploads = 30,000 minutes of raw video arriving every minute. Each raw video → 5 transcoded renditions. Storage: 500 hours/min × 60 × 24 × 365 ≈ 263M hours/year; at an average of 1 GB per hour of video, that is roughly 260 PB/year. Transcoding is CPU-intensive and must be parallelized.
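The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes decimal units (1 PB = 1,000,000 GB) and takes the 1 GB/hour average at face value; the constant and function names are illustrative only.

```python
# Back-of-envelope capacity math using the figures quoted in this section.
UPLOAD_HOURS_PER_MIN = 500        # from the scale requirements
AVG_GB_PER_VIDEO_HOUR = 1.0       # average assumed in the estimate
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600


def yearly_upload_hours(hours_per_min: float = UPLOAD_HOURS_PER_MIN) -> float:
    """Hours of new video uploaded per year."""
    return hours_per_min * MINUTES_PER_YEAR


def yearly_storage_pb(hours: float, gb_per_hour: float = AVG_GB_PER_VIDEO_HOUR) -> float:
    """Storage in petabytes, using decimal units (1 PB = 1,000,000 GB)."""
    return hours * gb_per_hour / 1_000_000


hours = yearly_upload_hours()
print(f"{hours / 1e6:.1f}M hours/year")           # 262.8M hours/year
print(f"{yearly_storage_pb(hours):.1f} PB/year")  # 262.8 PB/year
```

Rounding 262.8 down to 260 PB/year is fine for an interview; the order of magnitude is what drives the storage-tiering discussion.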
Core entities
- Video — videoId, uploaderId, title, description, status (processing|ready), duration, viewCount, createdAt
- VideoVariant — variantId, videoId, resolution, bitrate, s3Key, status
- User — userId, channelName, subscriberCount, watchHistory[]
- Comment — commentId, videoId, userId, content, createdAt, likeCount
API design
- POST /api/v1/videos/upload-url: Get pre-signed S3 URL for direct upload. Returns { uploadUrl, videoId }.
- PUT /upload/s3 (pre-signed): Client uploads directly to S3. Triggers Lambda → SQS → Transcoder.
- GET /api/v1/videos/:id/manifest.m3u8: Returns HLS adaptive bitrate manifest with all quality levels.
- GET /api/v1/search?q=&cursor=: Full-text video search via Elasticsearch.
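To make the manifest endpoint concrete, here is a minimal sketch of building an HLS master playlist. The `Variant` type, the bitrates, and the `<name>/index.m3u8` path layout are assumptions made for illustration; only the `#EXTM3U` and `#EXT-X-STREAM-INF` tags come from the HLS format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str           # e.g. "720p"
    width: int
    height: int
    bandwidth_bps: int  # peak bitrate in bits per second (illustrative values)


def build_master_playlist(variants: list[Variant]) -> str:
    """Return an HLS master playlist with one media playlist entry per variant."""
    lines = ["#EXTM3U"]
    # List variants from lowest to highest bitrate.
    for v in sorted(variants, key=lambda v: v.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={v.bandwidth_bps},"
            f"RESOLUTION={v.width}x{v.height}"
        )
        lines.append(f"{v.name}/index.m3u8")  # hypothetical relative path
    return "\n".join(lines) + "\n"


ladder = [
    Variant("360p", 640, 360, 800_000),
    Variant("720p", 1280, 720, 2_800_000),
    Variant("1080p", 1920, 1080, 5_000_000),
]
print(build_master_playlist(ladder))
```

Because the output depends only on the set of ready variants, a response like this can be generated once after transcoding and cached at the CDN.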
High-level design
Upload: client → S3 (pre-signed URL) → S3 event → SQS → Transcoding Workers → output variants to S3 → update metadata DB. Playback: client fetches HLS manifest from CDN → streams video segments from CDN edge nodes.
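The upload path has to survive network failures, which is usually handled with S3 multipart upload. The sketch below only plans the parts and works out what is left to send after an interruption; it makes no S3 calls. The 64 MiB default part size and the function names are assumptions, while the 5 MiB minimum part size and the 10,000-part limit are S3's documented constraints.

```python
import math

MIN_PART = 5 * 1024**2  # S3 minimum part size (applies to all parts except the last)
MAX_PARTS = 10_000      # S3 maximum number of parts per multipart upload


def plan_parts(file_size: int, part_size: int = 64 * 1024**2) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples that cover the whole file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Grow the part size if needed so the upload stays within S3's limits.
    part_size = max(part_size, MIN_PART, math.ceil(file_size / MAX_PARTS))
    parts = []
    offset, number = 0, 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def remaining_parts(plan: list[tuple[int, int, int]], completed: set[int]) -> list[tuple[int, int, int]]:
    """Parts still to send, given the part numbers already acknowledged."""
    return [p for p in plan if p[0] not in completed]


plan = plan_parts(10 * 1024**3)          # a 10 GiB upload
todo = remaining_parts(plan, {1, 2, 3})  # resume after three parts succeeded
print(len(plan), len(todo))              # 160 157
```

In practice the client would persist the upload ID and the set of acknowledged part numbers, so a restart only re-sends what `remaining_parts` returns.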
Deep dives
🎞️ Video Transcoding Pipeline
Split video into 10-second chunks → parallelize encoding across worker fleet → reassemble HLS/DASH segments. Use FFmpeg. For a 1-hour video: 360 chunks × 5 resolutions = 1800 jobs. Distribute via SQS. Workers are EC2 spot instances (cost savings). Total transcode time: ~15 min for 1hr video at 100 workers.
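The fan-out above (chunks × resolutions) can be sketched as a small job planner. The job dictionary shape is hypothetical, and the rendition list includes 480p purely as an assumption to reach the five renditions used in the estimate.

```python
import math
from itertools import product

CHUNK_SECONDS = 10
# Five renditions, matching the estimate; "480p" is an assumed fifth entry.
RESOLUTIONS = ["360p", "480p", "720p", "1080p", "4K"]


def plan_jobs(duration_s: int, chunk_s: int = CHUNK_SECONDS,
              resolutions: list[str] = RESOLUTIONS) -> list[dict]:
    """One transcode job per (chunk, resolution) pair."""
    n_chunks = math.ceil(duration_s / chunk_s)
    return [
        {"chunk": i, "start_s": i * chunk_s, "resolution": r}
        for i, r in product(range(n_chunks), resolutions)
    ]


def jobs_per_worker(n_jobs: int, workers: int) -> int:
    """Worst-case number of jobs a single worker handles."""
    return math.ceil(n_jobs / workers)


jobs = plan_jobs(3600)                   # a 1-hour video
print(len(jobs))                         # 1800
print(jobs_per_worker(len(jobs), 100))   # 18
```

With 18 jobs per worker, the ~15 minute figure corresponds to roughly 50 seconds per job, which is a useful sanity check to state out loud in an interview.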
📡 Adaptive Bitrate Streaming (HLS)
HLS manifest (.m3u8) lists all quality variants. Video player monitors bandwidth: if segment download time > threshold, switch to lower quality. Segments are 6-10s. CDN caches segments. Key: all segments for a video are at fixed CDN paths — same URL regardless of viewer location.
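One simple way a player could make the switching decision is a throughput-based rule: estimate bandwidth from the last segment download and pick the highest rendition that fits under a safety margin. This is an illustrative sketch rather than how any particular player works; the bitrate ladder and the 0.8 safety factor are assumptions.

```python
LADDER_BPS = [800_000, 2_800_000, 5_000_000]  # illustrative bitrate ladder


def throughput_bps(segment_bytes: int, download_seconds: float) -> float:
    """Measured throughput for the last segment, in bits per second."""
    if download_seconds <= 0:
        raise ValueError("download_seconds must be positive")
    return segment_bytes * 8 / download_seconds


def pick_variant(ladder_bps: list[int], measured_bps: float, safety: float = 0.8) -> int:
    """Highest bitrate within `safety` x measured throughput, else the lowest rung."""
    budget = measured_bps * safety
    fitting = [b for b in sorted(ladder_bps) if b <= budget]
    return fitting[-1] if fitting else min(ladder_bps)


fast = throughput_bps(3_500_000, 4)    # 7 Mbps
slow = throughput_bps(3_500_000, 10)   # 2.8 Mbps
print(pick_variant(LADDER_BPS, fast))  # 5000000
print(pick_variant(LADDER_BPS, slow))  # 800000
```

Real players typically also consider buffer level and smooth the estimate over several segments to avoid oscillating between qualities.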
🌍 CDN Strategy
Popular videos: cached at CDN edge. Cache-Control: max-age=31536000 (1 year) for video segments (immutable after transcode). Long-tail videos (rarely watched): served from S3 origin. CDN handles 95%+ of traffic. Netflix uses Open Connect Appliances: dedicated CDN hardware deployed inside ISP networks, so video can be served from within the ISP's own network.
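The caching policy could be expressed as a small header-selection function at the origin. In this sketch only the one-year `max-age` for segments comes from the text above; the file extensions, the short manifest TTL, and the fallback value are assumptions.

```python
def cache_control_for(path: str) -> str:
    """Choose a Cache-Control header value based on the object type."""
    if path.endswith((".ts", ".m4s")):
        # Segments never change after transcoding, so cache them for a year.
        return "public, max-age=31536000, immutable"
    if path.endswith(".m3u8"):
        # Assumption: keep manifests on a short TTL so variant changes propagate.
        return "public, max-age=60"
    # Assumption: anything else must be revalidated with the origin.
    return "no-cache"


print(cache_control_for("videos/abc123/720p/seg_00042.ts"))
print(cache_control_for("videos/abc123/manifest.m3u8"))
```

Keeping segment URLs immutable is what makes the long TTL safe: a re-transcode would publish new paths instead of invalidating cached ones.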
📊 View Count at Scale
Naive: increment a DB counter on every view → write hotspot on popular videos. Solution: accept approximate counts. Use Redis INCR per videoId (fast, in-memory) and flush the accumulated counts to the DB periodically (every 30s). For real-time aggregation, publish view events to Kafka and aggregate them with Flink. Dedup views: same user, same video, within a 30s window = 1 view.
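The dedup-then-batch idea can be sketched in memory, with plain dictionaries standing in for Redis. The `ViewCounter` class and its methods are hypothetical; a real deployment would use Redis keys with a TTL for the dedup window and would write each flushed batch to the database.

```python
import time
from collections import defaultdict


class ViewCounter:
    """In-memory stand-in for Redis: dedup views, buffer counts, flush in batches."""

    def __init__(self, dedup_window_s: float = 30.0, clock=time.monotonic):
        self._window = dedup_window_s
        self._clock = clock
        self._last_counted = {}           # (user_id, video_id) -> time of last counted view
        self._pending = defaultdict(int)  # video_id -> views not yet flushed

    def record(self, user_id: str, video_id: str) -> bool:
        """Count the view unless this user viewed the same video within the window."""
        now = self._clock()
        key = (user_id, video_id)
        last = self._last_counted.get(key)
        if last is not None and now - last < self._window:
            return False  # duplicate inside the dedup window
        self._last_counted[key] = now
        self._pending[video_id] += 1
        return True

    def flush(self) -> dict:
        """Return buffered counts and reset; the caller applies them to the DB."""
        batch = dict(self._pending)
        self._pending.clear()
        return batch


now = [0.0]                       # fake clock so the example is deterministic
counter = ViewCounter(clock=lambda: now[0])
counter.record("u1", "v1")        # counted
now[0] = 10.0
counter.record("u1", "v1")        # duplicate, ignored
counter.record("u2", "v1")        # different user, counted
now[0] = 31.0
counter.record("u1", "v1")        # window elapsed, counted again
print(counter.flush())            # {'v1': 3}
```

Flushing every 30 seconds turns many small increments into one DB write per video per interval, which is what removes the write hotspot.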
Scaling considerations
- Video storage in S3 with intelligent tiering (hot/warm/cold by view frequency)
- CDN with anycast routing — user automatically hits nearest POP
- Transcoding auto-scales with SQS queue depth using EC2 spot fleet
- MySQL sharded by videoId for metadata; read replicas for serving
- Elasticsearch for search; update index asynchronously after transcode complete
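Scaling the transcoder fleet from queue depth comes down to a small calculation. The sketch below assumes an average job duration and a target time to drain the queue; the function name, parameters, and bounds are hypothetical, and it does not call any autoscaling API.

```python
import math


def desired_workers(queue_depth: int, seconds_per_job: int, target_drain_s: int,
                    min_workers: int = 1, max_workers: int = 1000) -> int:
    """Workers needed to drain the queue within the target time, clamped to bounds."""
    if seconds_per_job <= 0 or target_drain_s <= 0:
        raise ValueError("seconds_per_job and target_drain_s must be positive")
    needed = math.ceil(queue_depth * seconds_per_job / target_drain_s)
    return max(min_workers, min(max_workers, needed))


# 1800 queued jobs at ~50 s each, drained within 15 minutes:
print(desired_workers(1800, 50, 900))  # 100
```

The upper bound matters with spot instances: capping the fleet keeps a sudden upload burst from turning into an unbounded bill.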
What interviewers expect by level
- Junior: Describe upload-to-S3, transcode flow, CDN for serving. Understand why pre-signed URLs are used.
- Mid: Design HLS adaptive streaming, SQS-driven transcode pipeline, chunked parallel encoding, CDN caching strategy.
- Senior: Full pipeline with resumable uploads (S3 multipart), cost-optimized spot transcoding, CDN cache invalidation, view count at scale.
- Staff: Multi-CDN failover strategy, P2P streaming for live events, cost modeling for 260PB/year storage, GDPR video deletion.