🔎 Design FB Post Search — System Design Interview Guide
Medium · Search & Indexing
Design a search system for a social platform (like Facebook) that allows users to search posts, people, pages, and groups with results filtered by their social graph and privacy settings.
Functional requirements
- Full-text search across posts, people, pages, and groups
- Results filtered by user's privacy settings and relationships
- Ranked results: friends' content prioritized over strangers
- Real-time indexing: new posts searchable within seconds
- Faceted search: filter by type, date, author
- Search suggestions as user types
Non-functional requirements & scale
- 3B users; 100B searchable documents
- Search latency < 500ms P95
- Privacy filters must be applied for every result
- Index updates: new post searchable in < 30 seconds
- 99.9% search availability
- Support for 100+ languages with stemming and stopwords
Capacity estimation
Privacy is the hardest part. A post from User A that is visible only to friends must not appear in search results for User C, who is not A's friend. Search results are therefore personalized: the same query returns different results for different users, and the system must intersect the text matches with the set of documents the searcher is allowed to see.
Core entities
- Post — postId, authorId, content, mediaType, createdAt, privacy (public/friends/only-me)
- SearchIndex — docId, type, text, authorId, privacyTerms[], createdAt (in Elasticsearch)
- SearchResult — docId, type, snippet, score, author, createdAt
API design
- GET /api/v1/search?q=birthday&type=post&from=-7d: Search with query, type filter, and date filter. Returns personalized results.
- GET /api/v1/search/suggest?q=john: Auto-complete suggestions for people/pages.
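The relative date filter in the search endpoint (from=-7d) has to be turned into an absolute timestamp before it can become a range filter. The sketch below is one possible way to parse it. Only the "-7d" form appears in the API above; the "h" and "w" units and the function name parse_relative_from are assumptions added for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; only "d" appears in the example request.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_relative_from(value: str, now: datetime) -> datetime:
    """Convert a relative offset such as "-7d" into an absolute lower bound."""
    match = re.fullmatch(r"-(\d+)([hdw])", value)
    if match is None:
        raise ValueError(f"unsupported 'from' value: {value!r}")
    amount = int(match.group(1))
    unit = _UNITS[match.group(2)]
    return now - timedelta(**{unit: amount})


if __name__ == "__main__":
    now = datetime(2024, 1, 8, tzinfo=timezone.utc)
    print(parse_relative_from("-7d", now).isoformat())
```

The resulting timestamp would typically feed a range filter on createdAt; rejecting unknown formats early keeps malformed input out of the search backend.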
High-level design
Post created → Kafka → Indexer writes to Elasticsearch with privacy metadata. Search query → Query Service expands query + fetches user's friend list → Elasticsearch query with privacy filter → re-rank by social graph distance → return results.
Deep dives
🔐 Privacy-Aware Search
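Each document in Elasticsearch has a privacyTerms field: ["public"] for public posts, ["friends:<authorId>"] for friends-only posts, or [] for only-me posts. On search, the user-specific privacy filter is the term list ["public", "friends:<friendId>", ...], with one friends: term per friend of the searcher. Elasticsearch query: must-match on text AND a terms filter on privacyTerms. Problem: if documents listed individual viewers, every friend-list change would force updates to old posts. Approach: store only the author's friend group in the index; the Social Graph Service provides the searcher's real-time friend list for the filter, so friendship changes take effect at query time without reindexing.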
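A minimal sketch of the privacy-filtered query, written as a plain Elasticsearch query DSL body. It assumes one reading of the scheme above: friends-only documents carry a single "friends:<authorId>" term, and the searcher's friend list is expanded into matching terms at query time. The function name build_privacy_query and the field name text are illustrative, and the sketch does not handle the searcher's own only-me posts.

```python
from typing import Any, Dict, List


def build_privacy_query(text: str, friend_ids: List[str]) -> Dict[str, Any]:
    """Build a bool query: full-text match AND a terms filter on privacyTerms."""
    # A document passes the filter if it is public or was written by
    # one of the searcher's friends (assumed "friends:<authorId>" tagging).
    allowed_terms = ["public"] + [f"friends:{fid}" for fid in friend_ids]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": text}}],
                # Filter context: no scoring, and eligible for caching.
                "filter": [{"terms": {"privacyTerms": allowed_terms}}],
            }
        }
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_privacy_query("birthday", ["u2", "u3"]), indent=2))
```

One thing to watch with this design is the size of the terms list: Elasticsearch caps a terms query at 65,536 terms by default (index.max_terms_count), so very large friend lists would need a different mechanism such as a terms lookup.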
📊 Ranking with Social Signals
Base score: BM25 text relevance. Boost factors: (1) Friend-authored post → 2× boost. (2) Post from a page the user follows → 1.5×. (3) Recent (< 7 days) → 1.2×. (4) High engagement (likes/comments) → 1.1×. Re-ranking: first fetch 100 candidates from Elasticsearch → apply social graph scoring → return top 10. Personalization: a per-user ML model is computationally expensive, so it is applied only for top users.
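The boost-and-re-rank step can be sketched as follows. The boost values come from the text above; combining them multiplicatively, the engagement threshold of 100, and all class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Set


@dataclass
class Candidate:
    doc_id: str
    bm25: float          # base text-relevance score from Elasticsearch
    author_id: str
    created_at: datetime
    engagement: int      # likes + comments


def social_score(c: Candidate, friend_ids: Set[str], followed_page_ids: Set[str],
                 now: datetime, engagement_threshold: int = 100) -> float:
    """Apply the four boosts multiplicatively (an assumed combination rule)."""
    score = c.bm25
    if c.author_id in friend_ids:
        score *= 2.0
    if c.author_id in followed_page_ids:
        score *= 1.5
    if now - c.created_at < timedelta(days=7):
        score *= 1.2
    if c.engagement >= engagement_threshold:  # threshold is hypothetical
        score *= 1.1
    return score


def rerank(candidates: List[Candidate], friend_ids: Set[str],
           followed_page_ids: Set[str], now: datetime, top_k: int = 10) -> List[Candidate]:
    """Re-rank the fetched candidates and keep the top_k."""
    return sorted(
        candidates,
        key=lambda c: social_score(c, friend_ids, followed_page_ids, now),
        reverse=True,
    )[:top_k]


if __name__ == "__main__":
    now = datetime(2024, 1, 8, tzinfo=timezone.utc)
    stranger = Candidate("d1", 10.0, "u9", now - timedelta(days=30), 5)
    friend = Candidate("d2", 6.0, "u2", now - timedelta(days=1), 5)
    print([c.doc_id for c in rerank([stranger, friend], {"u2"}, set(), now)])
```

In this example the friend's recent post scores about 14.4 (6 × 2 × 1.2) and overtakes the stranger's post at 10, even though its text relevance is lower.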
⚡ Real-Time Indexing
Post created → Kafka event → Indexer worker fetches post content + privacy settings → Elasticsearch index API. Elasticsearch search is near-real-time: with the default 1s refresh interval, a document becomes searchable about 1s after it is indexed, which leaves most of the 30-second budget for Kafka and indexer lag. For trending (high-volume) topics: prioritize indexing. For deleted posts: soft delete (set a deleted field to true) and exclude deleted documents with a search filter.
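The indexer's transformation step might look like the sketch below, which turns a post record into an index document and builds the bodies used for soft deletion. Kafka consumption and the Elasticsearch client calls are left out. The "friends:<authorId>" tagging, the deleted field name, and all function names are assumptions for illustration.

```python
from typing import Any, Dict, List


def privacy_terms_for(author_id: str, privacy: str) -> List[str]:
    """Map a post's privacy setting to the terms stored in the index."""
    if privacy == "public":
        return ["public"]
    if privacy == "friends":
        return [f"friends:{author_id}"]  # assumed tagging scheme
    if privacy == "only-me":
        return []  # no term, so no privacy filter can match it
    raise ValueError(f"unknown privacy setting: {privacy!r}")


def build_index_document(post: Dict[str, Any]) -> Dict[str, Any]:
    """Build the document the indexer would send to the index API."""
    return {
        "docId": post["postId"],
        "type": "post",
        "text": post["content"],
        "authorId": post["authorId"],
        "privacyTerms": privacy_terms_for(post["authorId"], post["privacy"]),
        "createdAt": post["createdAt"],
        "deleted": False,
    }


def soft_delete_body() -> Dict[str, Any]:
    """Partial-document body for the update API that marks a post deleted."""
    return {"doc": {"deleted": True}}


# Clause to add to every search's filter list so deleted posts never surface.
NOT_DELETED_FILTER: Dict[str, Any] = {
    "bool": {"must_not": [{"term": {"deleted": True}}]}
}


if __name__ == "__main__":
    post = {"postId": "p1", "authorId": "u1", "content": "Happy birthday!",
            "privacy": "friends", "createdAt": "2024-01-08T12:00:00Z"}
    print(build_index_document(post))
```

Soft deletion keeps the write cheap (a small partial update), at the cost of requiring the not-deleted filter on every query until the document is physically removed.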
🌍 Multi-Language
Elasticsearch: one index per language with a language-specific analyzer (stemming, stopwords). Post language is detected at indexing time (FastText language detection). Query: detect the query language and route to the matching index. Fuzzy matching for typos: Levenshtein distance 1-2 depending on term length (Elasticsearch's AUTO fuzziness allows 1 edit for terms of 3-5 characters and 2 edits for longer terms). Accent folding and Unicode normalization: "café" and "cafe" match.
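A rough sketch of the per-language routing, analyzer mapping, fuzzy clause, and accent folding. Language detection itself is out of scope here (the code takes a language code as input). The index naming scheme posts-<lang>, the fallback index, and the function names are assumptions; the analyzer names are Elasticsearch's built-in language analyzers.

```python
import unicodedata
from typing import Any, Dict

# Subset of languages mapped to built-in Elasticsearch analyzers.
ANALYZERS = {"en": "english", "fr": "french", "de": "german", "es": "spanish"}
DEFAULT_INDEX = "posts-default"  # hypothetical fallback for other languages


def index_for(lang: str) -> str:
    """Route a detected language code to its index (assumed naming scheme)."""
    return f"posts-{lang}" if lang in ANALYZERS else DEFAULT_INDEX


def index_body(lang: str) -> Dict[str, Any]:
    """Create-index body that applies the language analyzer to the text field."""
    analyzer = ANALYZERS.get(lang, "standard")
    return {"mappings": {"properties": {"text": {"type": "text", "analyzer": analyzer}}}}


def fuzzy_match(term: str) -> Dict[str, Any]:
    """Match clause that tolerates typos using length-based AUTO fuzziness."""
    return {"match": {"text": {"query": term, "fuzziness": "AUTO"}}}


def fold_accents(text: str) -> str:
    """Decompose characters and drop combining marks, so "café" becomes "cafe"."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(index_for("fr"), index_body("fr"))
    print(fold_accents("café"))
```

In a real deployment the folding would more likely run inside the analyzer chain (for example with an ASCII-folding token filter) so that indexing and querying normalize text the same way; the Python function only illustrates the effect.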
Scaling considerations
- Elasticsearch sharded by docId hash; replicas for read scaling
- Privacy filter evaluated at Elasticsearch level (not post-fetch) for efficiency
- Social Graph Service caches friend list in Redis (TTL 5 min) for search
- Indexer consumers auto-scale with Kafka lag
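- Search result cache in Redis for popular queries (TTL 30s); because results are personalized, cache only public-only results or include the user in the cache key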
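The TTL-based caching in the list above can be illustrated with a small read-through cache. This is an in-memory stand-in for Redis, not a Redis client; the class name, key format, and loader callback are assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory read-through cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call loader and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    # Friend lists: 5 minute TTL, keyed per user (hypothetical key format).
    friend_cache = TTLCache(ttl_seconds=300)
    friends = friend_cache.get_or_load("friends:u1", lambda: ["u2", "u3"])
    print(friends)
```

The trade-off is staleness: with a 5-minute TTL, a newly removed friend could still see friends-only posts in search for up to 5 minutes unless the cache entry is invalidated when the friendship changes.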
What interviewers expect by level
- Junior: Describe search index, Elasticsearch basics, post indexing pipeline.
- Mid: Privacy-aware query construction, social graph for ranking, real-time indexing pipeline.
- Senior: Privacy term design, multi-language support, re-ranking with ML signals, cache strategy.
- Staff: Privacy enforcement at 3B user scale, cross-language semantic search (embeddings), cost optimization.