🌐 Design Simple Web App — System Design Interview Guide

Easy · Fundamentals

Design the foundational architecture for a scalable web application that can serve millions of users with high availability, starting from a single server and scaling out.

Functional requirements

Serve static assets (HTML/CSS/JS/images)
Handle dynamic API requests
User authentication and session management
CRUD operations on application data
Support multiple concurrent users

Non-functional requirements & scale

Scale from 1K to 10M users without re-architecture
API response time < 200ms (P95)
99.9% availability (about 8.76 hours of downtime per year)
Horizontal scalability: add servers to handle more traffic
Data durability: no data loss on server failure
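
The downtime figure behind an availability target is simple arithmetic. As a minimal sketch (the function name `downtime_budget_hours` is made up for illustration, and a 365-day year is assumed):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_budget_hours(availability_pct: float) -> float:
    """Maximum yearly downtime, in hours, allowed by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        print(f"{target}% -> {downtime_budget_hours(target):.2f} hours/year")
```

Each extra "nine" cuts the yearly budget by a factor of ten, which is why higher targets usually need redundancy at every tier rather than faster manual recovery.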

Capacity estimation

Start with a single server. When it hits CPU or memory limits, scale horizontally by adding web servers behind a load balancer. The database then becomes the bottleneck: add read replicas first, then shard for writes. Add a cache layer when DB reads are hot.
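
A back-of-envelope estimate helps decide when the single server stops being enough. The sketch below is illustrative only: the daily-active ratio, requests per user, peak factor, and per-server throughput are all assumed placeholder values, not figures from this guide, and should be replaced with measurements from the real system.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate(total_users: int,
             dau_ratio: float = 0.10,             # assumption: 10% of users active daily
             requests_per_user_per_day: int = 20,  # assumption
             peak_factor: float = 3.0,             # assumption: peak is 3x the average
             rps_per_server: int = 500) -> dict:   # assumption: one server handles 500 RPS
    """Rough request rates and web server count for a given user base."""
    daily_active = total_users * dau_ratio
    avg_rps = daily_active * requests_per_user_per_day / SECONDS_PER_DAY
    peak_rps = avg_rps * peak_factor
    servers = max(1, math.ceil(peak_rps / rps_per_server))
    return {"avg_rps": avg_rps, "peak_rps": peak_rps, "servers": servers}


if __name__ == "__main__":
    for users in (1_000, 100_000, 10_000_000):
        print(users, estimate(users))
```

In an interview, stating the assumptions out loud matters more than the exact numbers; the point is to show which tier saturates first.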

Core entities

User — userId, email, passwordHash, createdAt, role
Session — sessionId (JWT), userId, expiresAt, deviceInfo
AppData — Varies by application — core domain entities

API design

POST /api/auth/login — Authenticate user. Returns JWT access token + refresh token.
GET /api/users/me — Get current user profile. Requires Authorization header.
CRUD /api/resources — Standard CRUD operations on application resources.

High-level design

Clients → CDN (static assets) → Load Balancer → Web Server cluster → DB Primary + Read Replicas. Cache layer (Redis) in front of DB for hot data. Sessions in Redis (stateless servers).

Deep dives

📈 Scaling Progression

Phase 1: Single server (< 1K users).
Phase 2: Separate DB server (< 10K users).
Phase 3: Load balancer + 2 web servers (< 100K users).
Phase 4: Add Redis cache + CDN (< 1M users).
Phase 5: DB read replicas (< 10M users).
Phase 6: DB sharding + microservices (10M+ users).

Each phase addresses the specific bottleneck of the phase before it.

🔐 Session Management

Stateful sessions: the session is stored in one server's memory, which breaks with multiple servers (a user load-balanced to a different server loses the session). Solutions:
(1) Sticky sessions: the same user always hits the same server. If that server fails, its sessions are lost, and load can become uneven.
(2) Centralized session store (Redis): any server can look up the session on every request.
(3) Stateless JWT: all session data is carried in the signed token, so no server-side state is needed, but a token is hard to revoke before it expires.

🗄️ Database Scaling

Read replicas: route SELECT queries to replicas (reads are typically around 80% of DB queries). The primary handles writes, plus any reads that must see the latest data. Replication lag: replicas may be 10-100ms behind, so a read issued right after a write can return stale data. Cache: Redis in front of the DB, using the cache-aside pattern. Sharding (last resort): partition data by userId hash. It increases complexity significantly, so avoid it until necessary.
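
The cache-aside pattern and hash-based shard selection can be sketched as follows. Plain dictionaries stand in for Redis and the database so the example runs on its own; the class name `CacheAsideStore` and the helper `shard_for` are hypothetical, and a real cache would also set a TTL on each entry.

```python
import hashlib


class CacheAsideStore:
    """Cache-aside: read the cache first, fall back to the DB, then populate the cache."""

    def __init__(self, db: dict):
        self.db = db        # stand-in for the primary database
        self.cache = {}     # stand-in for Redis
        self.db_reads = 0   # counts cache misses that reached the DB

    def get(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit: no DB round trip
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value     # populate on miss
        return value

    def put(self, key, value):
        self.db[key] = value            # write to the DB first
        self.cache.pop(key, None)       # then invalidate; the next read repopulates


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would not route consistently.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    store = CacheAsideStore({"user:1": "Ada"})
    store.get("user:1")
    store.get("user:1")
    print("DB reads:", store.db_reads)
    print("shard:", shard_for("user-42", 4))
```

Invalidating on write, rather than updating the cache in place, keeps the write path simple and avoids caching values that are never read again. Note that plain modulo sharding remaps most keys when `shard_count` changes, which is one reason sharding is left as a last resort.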

🌐 CDN for Static Assets

CDN caches static files at edge POPs worldwide. Benefits: lower latency (the user hits the nearest edge server), reduced origin load, and some built-in DDoS absorption at the edge. Cache-Control headers: HTML gets a short TTL (5 min); CSS/JS get a long TTL (1 year, with a content hash in the filename). When a JS file changes, its filename changes too, which busts the cache.
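
The TTL policy and content-hashed filenames can be illustrated with a short sketch. The helper names, the 8-character hash prefix, and the exact header strings are assumptions chosen for the example; build tools usually generate hashed filenames automatically.

```python
import hashlib
from pathlib import PurePosixPath

FIVE_MINUTES = 5 * 60
ONE_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def hashed_name(filename: str, content: bytes) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js.

    Expects a bare filename; any directory part is dropped.
    """
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path.stem}.{digest}{path.suffix}"


def cache_control_for(filename: str) -> str:
    """Short TTL for HTML, long TTL for hash-named assets."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in {".html", ""}:
        return f"public, max-age={FIVE_MINUTES}"
    # Safe to cache for a year because new content gets a new filename.
    return f"public, max-age={ONE_YEAR}, immutable"


if __name__ == "__main__":
    name = hashed_name("app.js", b"console.log('v1');")
    print(name, "->", cache_control_for(name))
    print("index.html ->", cache_control_for("index.html"))
```

The HTML stays on a short TTL because it is the file that references the hashed asset names; once the HTML refreshes, clients request the new filenames and the old ones simply age out.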

Scaling considerations

Stateless web servers — session in Redis enables horizontal scaling
CDN offloads most static asset traffic from origin servers
Read replicas handle read-heavy workloads (typical apps: 80% reads)
Connection pooling (PgBouncer) prevents DB connection exhaustion
Health checks on load balancer — remove unhealthy instances automatically
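
Health-check-aware routing can be sketched as a round-robin balancer that skips unhealthy instances. This is a simplified model with hypothetical names: the `is_healthy` callback stands in for a real probe, and production load balancers typically run probes in the background on a timer rather than on every request.

```python
from typing import Callable, Iterable


class RoundRobinBalancer:
    """Rotate through servers, skipping any that fail the health check."""

    def __init__(self, servers: Iterable[str], is_healthy: Callable[[str], bool]):
        self.servers = list(servers)
        self.is_healthy = is_healthy
        self._next = 0  # index of the next server to try

    def pick(self) -> str:
        # Try each server at most once per call so a full outage cannot loop forever.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if self.is_healthy(server):
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    down = {"web-2"}  # pretend web-2 is failing its health check
    lb = RoundRobinBalancer(["web-1", "web-2", "web-3"], lambda s: s not in down)
    print([lb.pick() for _ in range(4)])
```

Because the web servers are stateless, removing an instance from rotation costs nothing beyond capacity: the next request for the same user can land on any remaining server.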

What interviewers expect by level

Junior: Draw the 3-tier architecture (client → server → DB). Know what a load balancer does. Understand stateless servers.
Mid: Redis for sessions and caching, DB read replicas, CDN strategy, horizontal vs vertical scaling.
Senior: Capacity planning, scaling progression from 1K to 10M users, cache invalidation strategies, DB sharding threshold.
Staff: Multi-region architecture, cost optimization at scale, observability stack (metrics/logging/tracing), SLO/SLA design.
