⚖️ Design Online Judge (LeetCode) — System Design Interview Guide

Medium · Sandboxing & Execution

Design an online code judge like LeetCode or HackerRank where users submit code solutions, which are compiled and executed in a sandboxed environment against test cases, returning results within seconds.


Functional requirements

Non-functional requirements & scale

Capacity estimation

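Core challenge: run arbitrary user code safely. This needs OS-level isolation (Docker/seccomp/cgroups). Flow: queue submissions → worker picks up → run in sandbox → return result. Peak during contests: 1K submissions/sec, each taking 1-3s of worker time → 1,000-3,000 submissions in flight at once (arrival rate × run time), so the pool needs 1,000-3,000 workers at one submission per worker.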
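The worker estimate above is arrival rate multiplied by time in the system (Little's law). A minimal sketch of that arithmetic follows; the function name and the `headroom` parameter are illustrative assumptions, not part of the original estimate.

```python
import math


def workers_needed(arrival_rate_per_s: float, avg_run_s: float, headroom: float = 1.0) -> int:
    """Concurrent submissions in flight = arrival rate x average run time.

    `headroom` is a hypothetical safety multiplier (1.0 means none).
    """
    return math.ceil(arrival_rate_per_s * avg_run_s * headroom)


# Contest peak from the estimate: 1K submissions/sec, 1-3s each.
low = workers_needed(1000, 1)
high = workers_needed(1000, 3)
print(low, high)
```

With one submission per worker, the two results bound the pool size at peak.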

Core entities

API design

High-level design

Submission → Queue (Kafka/SQS) → Execution Worker (isolated container) runs all test cases → writes results to DB → notifies user via WebSocket.
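The flow above can be sketched in-process to make the hand-offs concrete. This is only an illustration under stated assumptions: `queue.Queue` stands in for Kafka/SQS, a plain dict stands in for the results table, the `run` callable stands in for sandboxed execution, and the WebSocket notification is omitted. All names are hypothetical.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Submission:
    submission_id: str
    # Stand-in for "execute the user's program in the sandbox on this stdin".
    run: Callable[[str], str]


def judge(sub: Submission, tests: List[Tuple[str, str]]) -> dict:
    """Run every test case and count how many produce the expected output."""
    passed = 0
    for stdin, expected in tests:
        try:
            if sub.run(stdin).strip() == expected.strip():
                passed += 1
        except Exception:
            pass  # a crash counts as a failed test
    verdict = "Accepted" if passed == len(tests) else "Failed"
    return {"verdict": verdict, "passed": passed, "total": len(tests)}


def run_worker(jobs: queue.Queue, tests: List[Tuple[str, str]], results: Dict[str, dict]) -> None:
    """Drain the queue, judging one submission at a time."""
    while True:
        try:
            sub = jobs.get_nowait()
        except queue.Empty:
            return
        results[sub.submission_id] = judge(sub, tests)  # "write results to DB"


# Usage: two submissions for an "add two integers" problem.
jobs: queue.Queue = queue.Queue()
jobs.put(Submission("s1", lambda s: str(sum(map(int, s.split())))))
jobs.put(Submission("s2", lambda s: "3"))  # hard-coded answer
results: Dict[str, dict] = {}
run_worker(jobs, [("1 2", "3"), ("5 5", "10")], results)
print(results)
```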

Deep dives

🔒 Sandboxing Code Execution

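Multi-layer isolation: (1) Container (Docker): filesystem and process isolation, with a read-only root filesystem so that /tmp is the only writable path. (2) seccomp: allowlist only the syscalls the runtime needs (e.g., read, write, execve) and deny the rest, such as socket, ptrace, and mount. seccomp cannot filter by file path, so write restrictions come from the filesystem layer, not from seccomp. (3) cgroups: limit CPU (1 core), memory (256MB), and process count (50); the process limit is what contains fork bombs. (4) Network namespace: no network access. (5) Time limit: kill container after timeLimit + 1s. (6) User: run as nobody (uid=65534), no sudo.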
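One way to express these layers is as flags on `docker run`. The sketch below only builds the argument list; it does not start a container. The image name, container name, and seccomp profile path are hypothetical, and `--cap-drop ALL` and `no-new-privileges` are extra hardening assumed here rather than taken from the list above. The time limit is left to the caller, which would run `docker kill <name>` once timeLimit + 1s has elapsed.

```python
from typing import List


def sandbox_command(
    image: str,
    name: str,
    run_cmd: List[str],
    seccomp_profile: str = "judge-seccomp.json",  # hypothetical profile path
) -> List[str]:
    """Build a `docker run` argument list that applies the isolation layers."""
    return [
        "docker", "run", "--rm", "--name", name,
        "--network", "none",            # no network access
        "--cpus", "1",                  # cgroup CPU limit
        "--memory", "256m",             # cgroup memory limit
        "--pids-limit", "50",           # cgroup process limit (fork bombs)
        "--user", "65534:65534",        # run as nobody
        "--read-only",                  # read-only root filesystem
        "--tmpfs", "/tmp",              # the only writable path
        "--cap-drop", "ALL",            # assumption: drop all capabilities
        "--security-opt", "no-new-privileges",
        "--security-opt", f"seccomp={seccomp_profile}",
        image, *run_cmd,
    ]


def kill_command(name: str) -> List[str]:
    """Command the worker issues when the time limit expires."""
    return ["docker", "kill", name]


cmd = sandbox_command("judge-python:latest", "sub-123", ["python3", "/tmp/main.py"])
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, timeout=...)` and issuing `kill_command` on timeout is one possible way to wire in the time limit.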

⚡ Worker Pool Scaling

Each worker handles one submission at a time (one container per submission). Peak: 1K submissions/sec × 3s each = 3K concurrent workers. EC2 Spot instances: cheap, and a good fit because jobs are short and an interrupted submission can simply be re-queued. Auto-scaling: SQS queue depth triggers EC2 fleet scale-out. Worker starts a container, runs all N test cases sequentially, and sends results back. The container is destroyed after each submission, so every run starts from a clean state.
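A queue-depth scaling rule can be sketched as "how many workers are needed to drain the current backlog within a target time". The target drain time and the min/max bounds below are illustrative assumptions, and the rule ignores submissions that arrive while the backlog drains.

```python
import math


def desired_workers(
    queue_depth: int,
    avg_run_s: float,
    target_drain_s: float,
    min_workers: int = 100,    # assumed floor kept warm between contests
    max_workers: int = 5000,   # assumed ceiling on fleet size
) -> int:
    """Workers needed to clear `queue_depth` jobs within `target_drain_s`.

    Each worker completes target_drain_s / avg_run_s jobs in the window,
    so the backlog needs queue_depth * avg_run_s / target_drain_s workers.
    """
    needed = math.ceil(queue_depth * avg_run_s / target_drain_s)
    return max(min_workers, min(max_workers, needed))


# 6,000 queued submissions at 3s each, drained within 10s.
target = desired_workers(6000, 3, 10)
print(target)
```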

🗃️ Test Case Security

Test cases stored in S3, encrypted with KMS. Workers fetch test cases at runtime via signed URL (valid 60s). Test case content never returned to user (only pass/fail + runtime). Code stored encrypted in DB (user can view their own). Output checking: exact diff for deterministic output; tolerance-based comparison (absolute or relative error) for floating-point answers, where an exact diff would reject correct solutions over rounding. Special judge: custom checker code for problems with multiple valid outputs.
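A minimal sketch of an output checker along these lines: tokens are compared exactly by default, and a per-problem tolerance enables floating-point comparison. The function name and the `float_tol` parameter are assumptions for illustration. Making the tolerance opt-in matters because a relative tolerance applied to integer answers could accept a wrong result such as 1000001 for 1000000.

```python
import math
from typing import Optional


def outputs_match(expected: str, actual: str, float_tol: Optional[float] = None) -> bool:
    """Compare outputs token by token, ignoring whitespace differences.

    With float_tol=None every token must match exactly. Otherwise tokens that
    differ as text are parsed as floats and compared within the tolerance.
    """
    exp_tokens, act_tokens = expected.split(), actual.split()
    if len(exp_tokens) != len(act_tokens):
        return False
    for exp, act in zip(exp_tokens, act_tokens):
        if exp == act:
            continue
        if float_tol is None:
            return False
        try:
            close = math.isclose(float(exp), float(act), rel_tol=float_tol, abs_tol=float_tol)
        except ValueError:
            return False  # a non-numeric token that differs is a mismatch
        if not close:
            return False
    return True


print(outputs_match("1 2 3", "1 2 3\n"))
print(outputs_match("0.3333333333", "0.3333333334", float_tol=1e-6))
print(outputs_match("1000000", "1000001"))
```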

📊 Leaderboard

Contest leaderboard: rank by problems solved (primary), total penalty time (secondary). MySQL for small contests. For large contests (100K participants): pre-compute rankings in a Redis sorted set. A sorted set holds one numeric score per member, so both criteria must be packed into a single score. Rank = ZREVRANK key userId, plus 1 because ZREVRANK is zero-based. Update on every accepted submission. Snapshots every 5 min during the contest allow showing the leaderboard as of a specific time for fairness analysis.
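One possible way to pack both ranking criteria into a single sorted-set score is sketched below. A plain dict stands in for the Redis sorted set so the example runs without a server; in Redis the score would be written with ZADD and the rank read with ZREVRANK. `PENALTY_CAP` and the helper names are assumptions for illustration.

```python
from typing import Dict

# Assumed upper bound on total penalty seconds. It keeps scores well below
# 2**53, so they stay exact when Redis stores them as doubles.
PENALTY_CAP = 10**7


def composite_score(solved: int, penalty_s: int) -> int:
    """Higher is better: more problems solved first, then lower penalty."""
    if not 0 <= penalty_s < PENALTY_CAP:
        raise ValueError("penalty out of range")
    return solved * PENALTY_CAP - penalty_s


def zrevrank(scores: Dict[str, int], member: str) -> int:
    """Zero-based rank by descending score, mimicking ZREVRANK."""
    ordered = sorted(scores, key=lambda m: (scores[m], m), reverse=True)
    return ordered.index(member)


scores = {
    "alice": composite_score(3, 5400),
    "bob": composite_score(3, 4800),    # same solved count, lower penalty
    "carol": composite_score(4, 9000),  # more solved beats any penalty
}
for user in ("carol", "bob", "alice"):
    print(user, zrevrank(scores, user) + 1)
```

Because the penalty is always smaller than `PENALTY_CAP`, one extra solved problem outweighs any penalty difference.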

Scaling considerations

What interviewers expect by level

