🌐 Design Simple Web App — System Design Interview Guide

Easy · Fundamentals

Design the foundational architecture for a scalable web application that can serve millions of users with high availability, starting from a single server and scaling out.


Functional requirements

Non-functional requirements & scale

Capacity estimation

Start with a single server. When it hits CPU or memory limits, scale horizontally by adding web servers behind a load balancer. When the database becomes the bottleneck, add read replicas first, then shard for writes. Add a cache layer when database reads are hot.

Core entities

API design

High-level design

Clients → CDN (static assets) → Load Balancer → Web Server cluster → DB Primary + Read Replicas. A Redis cache sits in front of the database for hot data. Sessions are also stored in Redis, so the web servers stay stateless and any server can handle any request.

Deep dives

📈 Scaling Progression

Phase 1: Single server (< 1K users).
Phase 2: Separate DB server (< 10K users).
Phase 3: Load balancer + 2 web servers (< 100K users).
Phase 4: Add Redis cache + CDN (< 1M users).
Phase 5: DB read replicas (< 10M users).
Phase 6: DB sharding + microservices (10M+ users).

Each phase addresses the specific bottleneck that appears at that scale.
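As a rough illustration, the phase thresholds above can be written as a lookup table. This is only a sketch: the names `PHASE_LIMITS`, `PHASES`, and `phase_for` are made up for this example, and real systems move between phases based on measured bottlenecks rather than raw user counts.

```python
from bisect import bisect_right

# Exclusive upper bounds on user count for phases 1-5, copied from the list above.
PHASE_LIMITS = [1_000, 10_000, 100_000, 1_000_000, 10_000_000]

PHASES = [
    "Phase 1: single server",
    "Phase 2: separate DB server",
    "Phase 3: load balancer + 2 web servers",
    "Phase 4: Redis cache + CDN",
    "Phase 5: DB read replicas",
    "Phase 6: DB sharding + microservices",
]


def phase_for(users: int) -> str:
    """Return the phase whose user range contains `users`."""
    if users < 0:
        raise ValueError("users must be non-negative")
    # bisect_right counts how many limits are <= users, which is the phase index.
    return PHASES[bisect_right(PHASE_LIMITS, users)]
```

For example, `phase_for(250_000)` falls in the "< 1M users" band, so it returns the Phase 4 entry.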

🔐 Session Management

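Stateful sessions keep session data in server memory, which breaks once there are multiple servers: a user who gets load-balanced to a different server loses the session. Solutions: (1) Sticky sessions: the same user always hits the same server, but the session is lost if that server fails, and load can become uneven. (2) Centralized session store (Redis): whichever server receives the request looks up the session on every request. (3) Stateless JWT: all session data lives in a signed token, so servers hold no session state. The trade-off is that a token is hard to revoke before it expires.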
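To make option (3) concrete, here is a minimal sketch of a stateless signed token using only the Python standard library. It is a simplified, JWT-like format, not a real JWT implementation: the `SECRET` value, the function names, and the `sub`/`exp` claim layout are assumptions for illustration. Production code would normally use a maintained JWT library and load the key from secure configuration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-secret-change-me"  # hypothetical key, for illustration only


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(payload: str) -> str:
    mac = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256)
    return _b64(mac.digest())


def issue_token(user_id: str, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Pack the session claims into the token itself and sign them."""
    now = time.time() if now is None else now
    claims = {"sub": user_id, "exp": now + ttl_seconds}
    payload = _b64(json.dumps(claims).encode("utf-8"))
    return payload + "." + _sign(payload)


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and unexpired, else None."""
    now = time.time() if now is None else now
    payload, sep, signature = token.partition(".")
    expected = _sign(payload)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not sep or not hmac.compare_digest(signature.encode("utf-8"),
                                          expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] <= now:
        return None
    return claims
```

Any web server that knows the key can verify the token without a session lookup, which is what keeps the servers stateless. Note that nothing here supports revoking a token early; that would need server-side state such as a deny list.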

🗄️ Database Scaling

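Read replicas: route SELECT queries to replicas (assuming reads make up roughly 75% of DB queries), and let the primary handle writes. Replication lag: replicas may be 10-100ms behind, so a read issued right after a write can return stale data. Cache: put a Redis cache in front of the DB using the cache-aside pattern, where the application checks the cache first, falls back to the DB on a miss, and then populates the cache. Sharding (last resort): partition data by a hash of userId. It increases complexity significantly, so avoid it until necessary.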
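The cache-aside pattern and hash-based shard routing described above can be sketched as follows. This is a toy, under stated assumptions: a plain dict stands in for Redis, `load_from_db` is a hypothetical callback standing in for a real query, and the class and function names are invented for this example.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Check the cache first; on a miss, read the DB and populate the cache."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]],
                 ttl_seconds: float = 60.0) -> None:
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        # key -> (expires_at, value); an in-memory stand-in for Redis.
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit
        value = self._load_from_db(key)  # cache miss or expired entry
        if value is not None:
            self._cache[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call after writing to the DB so the next read refetches."""
        self._cache.pop(key, None)


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index with a hash that is stable across processes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

`shard_for` uses SHA-256 rather than Python's built-in `hash()` because string hashes are randomized per process by default, which would route the same user to different shards on different servers. Plain modulo routing also remaps most keys when `num_shards` changes, which is one reason sharding is treated as a last resort.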

🌐 CDN for Static Assets

A CDN caches static files at edge POPs worldwide. Benefits: lower latency (the user hits the nearest edge server), reduced origin load, and some built-in DDoS absorption at the edge. Cache-Control headers: give HTML a short TTL (5 min) and CSS/JS a long TTL (1 year, with a content hash in the filename). When a JS file changes, its filename changes too, so clients fetch the new version instead of a stale cached copy. This is cache busting.
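The TTL policy and filename hashing above might look like this in code. It is a sketch only: `hashed_filename` and `cache_control_for` are hypothetical helpers, the 8-character digest length is arbitrary, and applying the short TTL to everything that is not CSS/JS is an assumption the text does not spell out. In practice a build tool usually generates the hashed filenames.

```python
import hashlib
from pathlib import PurePosixPath

ONE_YEAR_SECONDS = 365 * 24 * 3600  # 31,536,000
FIVE_MINUTES_SECONDS = 5 * 60


def hashed_filename(name: str, content: bytes, digest_len: int = 8) -> str:
    """Return e.g. 'app.<hash>.js' so the name changes when the content does.

    Only the final path component is returned; any directory part is dropped.
    """
    path = PurePosixPath(name)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{path.stem}.{digest}{path.suffix}"


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header value based on the file extension."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in {".css", ".js"}:
        # Safe to cache for a year only because the filename carries a content hash.
        return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    # Assumption: HTML and anything else gets the short TTL.
    return f"public, max-age={FIVE_MINUTES_SECONDS}"
```

Because HTML is cached for only five minutes, a page that references a new hashed filename reaches users quickly, while the old and new JS files can coexist in caches without conflict.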

Scaling considerations

What interviewers expect by level

