💬 Design Chat System — System Design Interview Guide

Medium · Messaging & Real-Time

Design a real-time messaging system like WhatsApp or Slack that supports 1-to-1 messaging, group chats, message delivery guarantees, and online presence.


Functional requirements

Non-functional requirements & scale

Capacity estimation

Assume 500M DAU, each sending ~40 messages/day: that is 20B messages/day, or about 231K messages/sec on average. At ~1KB per message, storage grows by 20B × 1KB = 20TB/day. Real-time delivery needs persistent connections (WebSocket), because a plain HTTP request/response cycle gives the server no way to push a message to a client that has not asked for it.
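The arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch: the constant names are illustrative, 1KB is treated as 1,000 bytes to keep the numbers round, and the per-second figure is a daily average, not a peak.

```python
# Back-of-envelope capacity estimate using the figures stated above.
DAU = 500_000_000             # daily active users
MSGS_PER_USER_PER_DAY = 40    # ~40 messages per user per day
MESSAGE_SIZE_BYTES = 1_000    # ~1KB, decimal units assumed for round numbers
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = DAU * MSGS_PER_USER_PER_DAY
messages_per_sec = messages_per_day / SECONDS_PER_DAY
storage_per_day_tb = messages_per_day * MESSAGE_SIZE_BYTES / 1e12

print(f"{messages_per_day:,} messages/day")                 # 20,000,000,000 messages/day
print(f"{messages_per_sec:,.0f} messages/sec (average)")    # 231,481 messages/sec (average)
print(f"{storage_per_day_tb:.0f} TB/day")                   # 20 TB/day
```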

Core entities

API design

High-level design

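Each client holds a WebSocket connection to a Chat Server. When a message is sent, the Chat Server stores it in Cassandra, then routes it via Redis Pub/Sub to the Chat Server that holds the recipient's WebSocket. That server delivers the message over the WebSocket; if the recipient is offline, a push notification is sent instead.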

Deep dives

🔌 WebSocket at Scale

Each Chat Server holds N persistent WS connections. Problem: if User A is connected to Server 1 and User B to Server 2, how does a message from A reach B? Solution: Redis Pub/Sub. When User B connects, Server 2 subscribes to the channel "user:B". Server 1 publishes A's message to "user:B", and Server 2 receives it and pushes it down User B's connection. A presence service tracks which server each user is connected to.
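The routing pattern above can be sketched without a running Redis instance. In the sketch below, `InMemoryPubSub` is a hypothetical stand-in for Redis Pub/Sub and `ChatServer` is an illustrative class, not a real API; a list per user stands in for the actual WebSocket write.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryPubSub:
    """Hypothetical stand-in for Redis Pub/Sub: channel name -> subscriber callbacks."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subs[channel].append(handler)

    def publish(self, channel: str, payload: str) -> int:
        # Returns how many subscribers received the payload.
        handlers = self._subs.get(channel, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


class ChatServer:
    """Illustrative chat server; `delivered` stands in for pushing over a WebSocket."""

    def __init__(self, name: str, bus: InMemoryPubSub) -> None:
        self.name = name
        self.bus = bus
        self.delivered: Dict[str, List[str]] = defaultdict(list)

    def connect(self, user_id: str) -> None:
        # When a user's socket opens, this server subscribes to that user's channel.
        self.bus.subscribe(
            f"user:{user_id}",
            lambda payload: self.delivered[user_id].append(payload),
        )

    def send(self, to_user: str, text: str) -> int:
        # The sender's server only needs the channel name, not the recipient's server.
        return self.bus.publish(f"user:{to_user}", text)


bus = InMemoryPubSub()
server1, server2 = ChatServer("server-1", bus), ChatServer("server-2", bus)
server1.connect("A")
server2.connect("B")

receivers = server1.send("B", "hi from A")
print(receivers, server2.delivered["B"])  # 1 ['hi from A']
```

A publish that reaches zero subscribers (for example `server1.send("C", "hello")` here) is one possible signal that the recipient is offline and the offline path should take over.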

✉️ Message Delivery Guarantee

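At-least-once delivery: the server acks the sender only after the message is written to Cassandra. A sender that gets no ack retries, so the same message can arrive more than once; the client deduplicates by message ID. Delivery receipts: the recipient sends an ack back over WS → the server updates the message status in the DB → the server notifies the sender. Read receipts: the client sends a read event for a chatId covering everything up to a given messageId.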
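Client-side deduplication can be as small as a set of seen message IDs. The sketch below is illustrative only: `ClientInbox` and the shape of the ack payload are assumptions, not part of any specified protocol.

```python
from typing import Dict, List, Set


class ClientInbox:
    """Illustrative client-side inbox that tolerates redelivered messages."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.messages: List[str] = []

    def receive(self, message_id: str, text: str) -> Dict[str, str]:
        # Show each message ID at most once, even if the server redelivers it.
        if message_id not in self._seen:
            self._seen.add(message_id)
            self.messages.append(text)
        # Ack duplicates too, so the server can stop retrying.
        # The payload shape here is a made-up example.
        return {"type": "delivered", "messageId": message_id}


inbox = ClientInbox()
inbox.receive("m1", "hello")
inbox.receive("m1", "hello")   # redelivery after a lost ack
ack = inbox.receive("m2", "bye")

print(inbox.messages)  # ['hello', 'bye']
```

In a real client the seen set would need to be bounded or persisted; that is left out here to keep the idea visible.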

👥 Group Chat Scaling

For small groups (<100 members): fan out to each member's WS server via Redis Pub/Sub, one publish per member. For large groups (100-500 members): store group membership in the DB; on each message, publish once to a group topic that every member's server subscribes to. Cassandra stores messages with chatId as the partition key, so all messages in a chat live in one partition and can be read with efficient range queries.
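The two fan-out strategies can be expressed as a single function that decides which channels to publish to. This is a sketch under assumptions: the `group:` channel prefix, the function name, and skipping the sender in the small-group case are illustrative choices, while the threshold of 100 comes from the text above.

```python
from typing import List

SMALL_GROUP_LIMIT = 100  # threshold from the text: groups below this use per-member fan-out


def channels_to_publish(chat_id: str, member_ids: List[str], sender_id: str) -> List[str]:
    """Return the Pub/Sub channels one group message should be published to."""
    if len(member_ids) < SMALL_GROUP_LIMIT:
        # Small group: one publish per recipient (sender excluded by assumption).
        return [f"user:{m}" for m in member_ids if m != sender_id]
    # Large group: a single publish; each member's server subscribes to this topic.
    return [f"group:{chat_id}"]


small = channels_to_publish("c1", ["a", "b", "c"], sender_id="a")
large = channels_to_publish("c2", [f"u{i}" for i in range(250)], sender_id="u0")

print(small)  # ['user:b', 'user:c']
print(large)  # ['group:c2']
```

The trade-off is publish count versus subscription count: per-member fan-out costs one publish per recipient, while a group topic costs one publish but requires every involved server to hold a subscription for that group.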

📱 Offline Delivery

If the recipient is reported offline (no active WS connection): send a push notification via APNs (iOS) or FCM (Android). The message is stored in Cassandra regardless. On reconnect, the client sends its lastSeenMessageId and the server returns all undelivered messages since then. The same path covers brief network interruptions transparently.
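The reconnect step amounts to a "give me everything after this ID" query. The sketch below uses an in-memory list in place of Cassandra and assumes message IDs increase monotonically within a chat (for example, time-sortable IDs); `Message` and `messages_since` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Message(NamedTuple):
    message_id: int   # assumed to increase monotonically within a chat
    text: str


def messages_since(chat_log: List[Message], last_seen_id: Optional[int]) -> List[Message]:
    """Return the messages the client has not seen yet, oldest first."""
    if last_seen_id is None:
        # First sync on this device: nothing has been seen.
        return list(chat_log)
    return [m for m in chat_log if m.message_id > last_seen_id]


log = [Message(1, "a"), Message(2, "b"), Message(3, "c")]
missed = messages_since(log, last_seen_id=1)

print([m.text for m in missed])  # ['b', 'c']
```

Because the client deduplicates by message ID anyway, the server can err on the side of resending a little too much after a reconnect.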

Scaling considerations

What interviewers expect by level

