💳 Design UPI Payment Gateway — System Design Interview Guide
Hard · Payments & Fintech
Design a UPI-based payment system like Google Pay or PhonePe that supports instant bank-to-bank transfers, processes 1B+ transactions per day, and guarantees zero double spends.
Functional requirements
- Register and link bank accounts via UPI VPA (Virtual Payment Address)
- Initiate P2P payments using UPI ID or QR code
- Collect payments for merchants
- View transaction history
- Handle refunds and disputes
Non-functional requirements & scale
- 1B transactions/day (~11,600 TPS); peak: 50,000 TPS on festivals
- Payment completion < 3 seconds end-to-end
- Zero double debits — exactly-once semantics
- Regulatory compliance: RBI regulations, PCI-DSS, and NPCI guidelines
- 99.999% uptime (5 nines) for payment path
- Full audit trail for every transaction state change
Capacity estimation
UPI uses NPCI (National Payments Corporation of India) as the central switch. Each app (PSP, payment service provider) sends a payment instruction to NPCI, which has the payer's bank debited and the payee's bank credited. Idempotency is critical: a network retry must never cause a double debit.
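The throughput figures in the requirements can be checked with a few lines of arithmetic. This is a rough sketch: the daily volume and peak TPS come from the requirements above, while the per-row size is an assumption made only to illustrate a storage estimate.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_txns = 1_000_000_000   # from the non-functional requirements
peak_tps = 50_000            # from the non-functional requirements

avg_tps = daily_txns / SECONDS_PER_DAY
peak_to_avg = peak_tps / avg_tps

# Assumption (not from the guide): about 1 KB per Transaction row, indexes included.
assumed_row_bytes = 1_000
daily_storage_gb = daily_txns * assumed_row_bytes / 1e9

print(f"average TPS: {avg_tps:,.0f}")                       # average TPS: 11,574
print(f"peak/average ratio: {peak_to_avg:.1f}x")            # peak/average ratio: 4.3x
print(f"storage per day: {daily_storage_gb:,.0f} GB")       # storage per day: 1,000 GB
```

Under that row-size assumption the transaction table grows by roughly 1 TB per day, which is one reason the design later partitions the transaction DB.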
Core entities
- Account — accountId, userId, vpa, bankAccountNumber (tokenized), ifsc, isLinked
- Transaction — txnId (UUID), payerVpa, payeeVpa, amount, status, idempotencyKey, createdAt, updatedAt
- TransactionLog — logId, txnId, state, message, timestamp (append-only audit)
- Merchant — merchantId, mcc, vpa, settlementAccount, dailyLimit
API design
- POST /api/v1/payments: Initiate payment. Body: { payerVpa, payeeVpa, amount, note, idempotencyKey }.
- GET /api/v1/payments/:txnId: Poll transaction status.
- POST /api/v1/payments/:txnId/refund: Initiate refund. Creates a reverse transaction.
- GET /api/v1/transactions?cursor=&limit=20: Paginated transaction history.
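As an illustration of the POST /api/v1/payments body, the sketch below validates the fields before any payment work starts. The `PaymentRequest` class, the VPA pattern, and the specific validation rules are assumptions for this example; the guide only lists the field names.

```python
import re
import uuid
from dataclasses import dataclass
from decimal import Decimal

# Assumed VPA shape "local@handle"; real validation rules may differ.
VPA_PATTERN = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9]+$")


@dataclass(frozen=True)
class PaymentRequest:
    """Hypothetical in-memory form of the POST /api/v1/payments body."""
    payer_vpa: str
    payee_vpa: str
    amount: Decimal          # Decimal avoids float rounding on money
    idempotency_key: str
    note: str = ""

    def __post_init__(self):
        for vpa in (self.payer_vpa, self.payee_vpa):
            if not VPA_PATTERN.match(vpa):
                raise ValueError(f"malformed VPA: {vpa!r}")
        if self.payer_vpa == self.payee_vpa:
            raise ValueError("payer and payee must differ")
        if self.amount <= 0:
            raise ValueError("amount must be positive")
        # uuid.UUID raises ValueError when the key is not a well-formed UUID.
        uuid.UUID(self.idempotency_key)


request = PaymentRequest(
    payer_vpa="alice@okbank",
    payee_vpa="bob@okbank",
    amount=Decimal("250.00"),
    idempotency_key=str(uuid.uuid4()),
    note="dinner",
)
```

Rejecting malformed requests at the edge keeps invalid keys out of the idempotency store and invalid rows out of the transaction table.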
High-level design
Payment request → idempotency check (Redis) → write PENDING to DB → send to NPCI switch → async response → update DB to SUCCESS/FAILED → notify user via WebSocket/push.
Deep dives
🔄 Idempotency & Exactly-Once
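The client generates a UUID idempotencyKey. The Payment Service runs Redis SET <idempotencyKey> <txnId> NX EX 300. If SET does not write because the key already exists, the request is a duplicate and the service returns the existing txnId, so network retries are safe. Redis is the fast path only: the DB INSERT carries a UNIQUE constraint on idempotencyKey as the durable guard, which also catches retries that arrive after the 300-second TTL has expired. NPCI also assigns a unique transaction reference; never retry against NPCI without reusing the same reference number.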
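The check-and-set step can be sketched as follows. To keep the example runnable without a server, `InMemoryKV` is a hypothetical stand-in that mimics `SET key value NX EX ttl`; with the redis-py client the equivalent call would be `client.set(key, value, nx=True, ex=300)`. The `idem:` key prefix and the `initiate_payment` function are likewise assumptions.

```python
import time
import uuid


class InMemoryKV:
    """Hypothetical stand-in for Redis, single-process only."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry on the monotonic clock)

    def set_nx_ex(self, key, value, ttl_seconds):
        """Mimics SET key value NX EX ttl: True only if the key was absent."""
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False
        self._data[key] = (value, now + ttl_seconds)
        return True

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.monotonic():
            return None
        return entry[0]


def initiate_payment(kv, idempotency_key, ttl_seconds=300):
    """Returns (txn_id, is_duplicate) for the given idempotency key."""
    redis_key = f"idem:{idempotency_key}"
    candidate_txn_id = str(uuid.uuid4())
    if kv.set_nx_ex(redis_key, candidate_txn_id, ttl_seconds):
        return candidate_txn_id, False       # first time this key is seen
    # Duplicate: hand back the txnId stored by the first request.
    return kv.get(redis_key), True


kv = InMemoryKV()
key = str(uuid.uuid4())
first_txn, first_dup = initiate_payment(kv, key)
retry_txn, retry_dup = initiate_payment(kv, key)
```

Here the retry gets the same txnId back with `is_duplicate` set. A production version would still rely on the UNIQUE constraint in the DB, since the key could expire between the failed SET and the GET.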
📊 Transaction State Machine
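States: INITIATED → PENDING → SENT_TO_NPCI → DEBIT_SUCCESS → CREDIT_SUCCESS (the COMPLETED state), with FAILED and REVERSED as the terminal states for unsuccessful payments. Every state transition is appended to TransactionLog, which serves as the immutable audit trail. Guard concurrent state updates with a version check on the transaction row, so a writer holding a stale version is rejected instead of overwriting a newer state. Use the saga pattern for the multi-step flow: debit payer bank → credit payee bank → confirm. If the credit fails after a successful debit, a compensating step reverses the debit and the transaction ends in REVERSED.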
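A minimal sketch of the state machine with a version check is shown below. The state names come from the list above; which states may move to FAILED or REVERSED is an assumption, as are the `Transaction` class, the `transition` function, and the exception names. An in-memory list stands in for the TransactionLog table.

```python
from dataclasses import dataclass, field

# Assumed transition table; the guide names the states but not every edge.
ALLOWED = {
    "INITIATED": {"PENDING", "FAILED"},
    "PENDING": {"SENT_TO_NPCI", "FAILED"},
    "SENT_TO_NPCI": {"DEBIT_SUCCESS", "FAILED"},
    "DEBIT_SUCCESS": {"CREDIT_SUCCESS", "REVERSED"},
    "CREDIT_SUCCESS": set(),   # terminal (COMPLETED)
    "FAILED": set(),           # terminal
    "REVERSED": set(),         # terminal
}


class StaleVersionError(Exception):
    """The caller read the row before another writer updated it."""


class IllegalTransitionError(Exception):
    """The requested state is not reachable from the current state."""


@dataclass
class Transaction:
    txn_id: str
    state: str = "INITIATED"
    version: int = 0
    log: list = field(default_factory=list)  # stand-in for TransactionLog


def transition(txn, expected_version, new_state, message=""):
    """Applies one transition if the version matches; returns the new version."""
    if txn.version != expected_version:
        raise StaleVersionError(f"expected v{expected_version}, row is v{txn.version}")
    if new_state not in ALLOWED[txn.state]:
        raise IllegalTransitionError(f"{txn.state} -> {new_state}")
    txn.log.append((txn.state, new_state, message))  # append-only audit entry
    txn.state = new_state
    txn.version += 1
    return txn.version


txn = Transaction("txn-1")
v = transition(txn, 0, "PENDING")
v = transition(txn, v, "SENT_TO_NPCI")
v = transition(txn, v, "DEBIT_SUCCESS")
v = transition(txn, v, "REVERSED", "credit failed, debit reversed")
```

In a relational database the same check is usually a conditional update (`... WHERE txn_id = ? AND version = ?`) where zero affected rows means the writer was stale.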
⚡ 50K TPS on Festival Days
Scale the stateless Payment Service horizontally. Use a Redis cluster for idempotency checks (sub-millisecond). The DB write path is the bottleneck: use Postgres connection pooling (PgBouncer) and batch commits so that several transactions share one write-ahead log flush. NPCI enforces rate limits: buffer requests that exceed them in Kafka and drain the queue with backpressure. Put a circuit breaker in front of NPCI; when it trips, leave the transaction in PENDING and retry later.
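The circuit breaker with a fall back to "pending" might look like the sketch below. The thresholds, the `CircuitBreaker` class, and the `send_payment` and `call_npci` names are assumptions for illustration; the half-open handling is simplified and lets every request through once the cool-down has passed.

```python
import time


class CircuitBreaker:
    """Opens after consecutive failures, allows probes after a cool-down."""

    def __init__(self, failure_threshold=5, reset_after_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None   # None means the circuit is closed

    def allow_request(self):
        if self._opened_at is None:
            return True
        # Half-open: after the cool-down, let requests probe the dependency.
        return self._clock() - self._opened_at >= self.reset_after_seconds

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def send_payment(breaker, call_npci, txn):
    """Returns the NPCI result, or "PENDING" when the call is skipped or fails."""
    if not breaker.allow_request():
        return "PENDING"         # circuit open: do not touch NPCI, retry later
    try:
        result = call_npci(txn)
    except Exception:
        breaker.record_failure()
        return "PENDING"
    breaker.record_success()
    return result
```

A transaction left in PENDING must be retried with the same NPCI reference number, so the retry path reuses the idempotency guarantees described earlier.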
🔐 Security
The UPI PIN never leaves the device in plaintext: it is encrypted with a device key and the server public key before transmission. MPIN validation happens inside an HSM (Hardware Security Module). All transport uses TLS 1.3. Bank account numbers are tokenized, so the system stores a token rather than the actual account number. All transactions are signed with a customer certificate. A fraud detection ML model scores each transaction in under 50 ms.
Scaling considerations
- Partition transaction DB by user_id hash for write distribution
- Read replicas for transaction history queries
- Redis pipeline for batch idempotency key lookups during burst
- Kafka with replication factor 3 — no message loss for audit trail
- Multi-AZ deployment; active-active across 2 data centers (RBI mandate)
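Partitioning by user_id hash, the first item above, can be sketched as a small routing function. The `shard_for` name and the shard count are assumptions; SHA-256 is used here only because it is stable across processes, unlike Python's built-in `hash()` for strings.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Maps a user_id to a shard index in [0, num_shards)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # The first 8 bytes give a large, evenly distributed integer.
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 64  # assumed value for illustration
shard = shard_for("user-12345", NUM_SHARDS)
```

Plain modulo hashing moves most keys when `num_shards` changes, so a design that expects to add shards often would use consistent hashing or a lookup table instead.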
What interviewers expect by level
- Junior: Describe payment flow: initiate → bank debit → bank credit → confirm. Understand why idempotency matters.
- Mid: Design idempotency with Redis, transaction state machine, retry-safe API contract.
- Senior: NPCI integration, saga for distributed debit/credit, 50K TPS architecture, fraud detection pipeline.
- Staff: Active-active multi-DC for 5-nines uptime, HSM integration, regulatory reporting, global money movement.