🔎 Design FB Post Search — System Design Interview Guide

Medium · Search & Indexing

Design a search system for a social platform (like Facebook) that allows users to search posts, people, pages, and groups with results filtered by their social graph and privacy settings.


Functional requirements

Non-functional requirements & scale

Capacity estimation

Privacy is the hardest part. A post from User A that is visible only to friends must not appear in search results for User C, who is not A's friend. Search results are therefore personalized: the same query returns different results for different users, and every result set must be intersected with the set of posts the searcher is allowed to see.
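To make the intersection concrete, here is a minimal in-memory sketch. The `Post` type, the visibility values ("public", "friends", "only_me"), and the `friends_of` map are illustrative assumptions, not part of any real API; a production system would push this check into the index rather than filter in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    author_id: str
    text: str
    visibility: str  # assumed values: "public", "friends", "only_me"


def can_view(post: Post, viewer_id: str, friends_of: dict[str, set[str]]) -> bool:
    """Return True if the viewer is allowed to see the post."""
    if post.author_id == viewer_id:
        return True
    if post.visibility == "public":
        return True
    if post.visibility == "friends":
        return viewer_id in friends_of.get(post.author_id, set())
    return False


def search(posts: list[Post], query: str, viewer_id: str,
           friends_of: dict[str, set[str]]) -> list[str]:
    """Text match intersected with the viewer's privacy-allowed set."""
    q = query.lower()
    return [p.post_id for p in posts
            if q in p.text.lower() and can_view(p, viewer_id, friends_of)]


posts = [
    Post("p1", "A", "Hiking trip photos", "friends"),
    Post("p2", "D", "Hiking gear review", "public"),
]
friends_of = {"A": {"B"}, "B": {"A"}}

results_for_b = search(posts, "hiking", "B", friends_of)  # B is A's friend
results_for_c = search(posts, "hiking", "C", friends_of)  # C is not
```

The same query yields both posts for B but only the public post for C, which is the personalization the privacy requirement forces.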

Core entities

API design

High-level design

Write path: post created → Kafka → Indexer writes the post to Elasticsearch together with its privacy metadata. Read path: search query → Query Service expands the query and fetches the searcher's friend list → Elasticsearch query with a privacy filter → re-rank by social graph distance → return results.

Deep dives

🔐 Privacy-Aware Search

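Each document in Elasticsearch has a privacyTerms field: ["public"] for public posts, ["friends:<authorId>"] for friends-only posts, or [] for posts that are not searchable by others. On search, the Query Service builds a user-specific privacy filter: "public", OR "friends:myUserId" (the searcher's own friends-only posts), OR "friends:<friendId>" for each of the searcher's friends. The Elasticsearch query is then: must match the text AND filter on terms (privacyTerms). Problem: if documents enumerated every allowed viewer, each friend-list change would force updates to the privacy terms of old posts. Approach: store only the author's friend group in the index, and have the Social Graph Service supply the searcher's real-time friend list when the filter is built, so friendship changes take effect without reindexing.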
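The query described above can be sketched as a function that builds the Elasticsearch request body. This is a hedged illustration: it assumes friends-only posts are tagged "friends:<authorId>", that the post text lives in a field named `content`, and that the friend list has already been fetched from the Social Graph Service. The function names are hypothetical; the `bool`, `match`, and `terms` clauses are standard Elasticsearch query DSL.

```python
from typing import Any


def privacy_filter_terms(searcher_id: str, friend_ids: list[str]) -> list[str]:
    """Values of privacyTerms that this searcher is allowed to match."""
    # Public posts plus the searcher's own friends-only posts.
    terms = ["public", f"friends:{searcher_id}"]
    # One term per friend: matches posts tagged with that friend's group.
    terms.extend(f"friends:{fid}" for fid in sorted(set(friend_ids)))
    return terms


def build_search_body(text: str, searcher_id: str, friend_ids: list[str],
                      size: int = 100) -> dict[str, Any]:
    """Request body: text relevance in `must`, privacy in a non-scoring `filter`."""
    return {
        "size": size,
        "query": {
            "bool": {
                # "content" is an assumed field name for the post text.
                "must": [{"match": {"content": text}}],
                "filter": [
                    {"terms": {"privacyTerms": privacy_filter_terms(searcher_id, friend_ids)}}
                ],
            }
        },
    }


body = build_search_body("hiking", "u1", ["u3", "u2", "u3"])
```

Putting the privacy check in `filter` rather than `must` keeps it out of the relevance score, and a document whose privacyTerms is [] can never match the `terms` clause, so it stays hidden from other users.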

📊 Ranking with Social Signals

Base score: BM25 text relevance. Boost factors: (1) Post authored by a friend → 2× boost. (2) Post from a page the user follows → 1.5×. (3) Recent (< 7 days old) → 1.2×. (4) High engagement (likes/comments) → 1.1×. Re-ranking: fetch the top 100 candidates from Elasticsearch → apply social graph scoring → return the top 10. Personalization: a per-user ML model is computationally expensive, so it is applied only for top users.
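A minimal sketch of the re-ranking step follows. It assumes the boosts stack multiplicatively and that "high engagement" means likes plus comments above a threshold; both are assumptions, as are the `Candidate` type and the threshold of 1000.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Candidate:
    post_id: str
    author_id: str
    bm25: float           # base text-relevance score from Elasticsearch
    created_at: datetime
    engagement: int       # likes + comments


def social_score(c: Candidate, friend_ids: set[str], followed_page_ids: set[str],
                 now: datetime, engagement_threshold: int = 1000) -> float:
    """Apply the boost factors to the BM25 base score (multiplicative stacking assumed)."""
    score = c.bm25
    if c.author_id in friend_ids:
        score *= 2.0
    if c.author_id in followed_page_ids:
        score *= 1.5
    if now - c.created_at < timedelta(days=7):
        score *= 1.2
    if c.engagement >= engagement_threshold:  # threshold is an assumed value
        score *= 1.1
    return score


def rerank(candidates: list[Candidate], friend_ids: set[str],
           followed_page_ids: set[str], now: datetime, top_k: int = 10) -> list[Candidate]:
    """Sort candidates by boosted score, highest first, and keep the top_k."""
    return sorted(
        candidates,
        key=lambda c: social_score(c, friend_ids, followed_page_ids, now),
        reverse=True,
    )[:top_k]


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
candidates = [
    Candidate("p1", "stranger", 10.0, now - timedelta(days=30), 5),
    Candidate("p2", "friend", 6.0, now - timedelta(days=1), 5),
]
ranked = rerank(candidates, {"friend"}, set(), now)
```

Here the friend's recent post (6.0 × 2 × 1.2) overtakes the stranger's older post despite a lower BM25 score, which is the intended effect of social boosting.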

⚡ Real-Time Indexing

Post created → Kafka event → Indexer worker fetches the post content and privacy settings → Elasticsearch index API. Elasticsearch search is near-real-time: with the default 1s refresh interval, new documents become searchable about 1s after they are indexed. For high-volume trending topics: prioritize their indexing. For deleted posts: soft delete (set a deleted field to true) → a search filter excludes deleted documents.
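The indexer's core logic can be sketched as a pure function that translates a Kafka event into an indexing action. The event shape (`type`, `post_id`, `author_id`, `content`, `visibility`) and the action dictionaries are hypothetical; a real worker would pass the result to an Elasticsearch client's index or update call.

```python
from typing import Any


def privacy_terms(author_id: str, visibility: str) -> list[str]:
    """Map a post's visibility setting to its privacyTerms value."""
    if visibility == "public":
        return ["public"]
    if visibility == "friends":
        return [f"friends:{author_id}"]
    return []  # not searchable by other users


def to_index_action(event: dict[str, Any]) -> dict[str, Any]:
    """Translate a post event into an index (create) or partial-update (soft delete) action."""
    if event["type"] == "post_created":
        return {
            "op": "index",
            "id": event["post_id"],
            "doc": {
                "content": event["content"],
                "authorId": event["author_id"],
                "privacyTerms": privacy_terms(event["author_id"], event["visibility"]),
                "deleted": False,
            },
        }
    if event["type"] == "post_deleted":
        # Soft delete: flip the flag instead of removing the document.
        return {"op": "update", "id": event["post_id"], "doc": {"deleted": True}}
    raise ValueError(f"unsupported event type: {event['type']}")


# Filter clause added to every search so soft-deleted posts never surface.
NOT_DELETED_FILTER = {"term": {"deleted": False}}

created = to_index_action({
    "type": "post_created", "post_id": "p1", "author_id": "u1",
    "content": "Hiking trip photos", "visibility": "friends",
})
deleted = to_index_action({"type": "post_deleted", "post_id": "p1"})
```

Keeping the translation free of I/O makes it easy to unit test and to replay events if the index needs rebuilding.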

🌍 Multi-Language

Elasticsearch: one index per language, each with a language-specific analyzer (stemming, stopwords). The post's language is detected at indexing time (FastText language detection). Query: detect the query language and route to the matching index. Fuzzy matching for typos: Levenshtein distance 1-2 for words > 5 chars. Unicode normalization with accent folding, so that "café" and "cafe" match.
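The routing and normalization rules above can be sketched with the standard library alone. Language detection itself is left out: the sketch assumes a language code has already been detected. The index naming scheme (`posts_<lang>`), the supported-language set, and the length cutoff between 1 and 2 edits are all illustrative assumptions.

```python
import unicodedata

SUPPORTED_LANGUAGES = {"en", "fr", "es", "de"}  # hypothetical set of per-language indexes


def fold_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop combining marks, so 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_for(lang: str) -> str:
    """Route a detected language code to its index, with a fallback index."""
    return f"posts_{lang}" if lang in SUPPORTED_LANGUAGES else "posts_default"


def max_edits(word: str) -> int:
    """Allowed Levenshtein distance: none for short words, 1-2 for words > 5 chars."""
    n = len(word)
    if n <= 5:
        return 0
    return 1 if n <= 8 else 2  # the cutoff at 8 chars is an assumed value


folded = fold_accents("café")
```

In Elasticsearch this folding would usually be configured in the analyzer (for example with an ASCII folding token filter) rather than done in application code; the function here only shows the effect.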

Scaling considerations

What interviewers expect by level

