Twitter / X

Social media platform handling 500M+ tweets per day with real-time timeline delivery, trend detection, and search across billions of tweets.

  • Daily tweets: 500M+
  • Timeline reads: ~300K requests/sec
  • Fanout writes: ~5M timeline writes/sec during peak
  • Search index: billions of tweets indexed
  • Cache size: ~150TB of Redis for timelines
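As a back-of-envelope check on the figures above, the short calculation below derives the average ingestion rate from the daily tweet volume and a rough fanout ratio. It is only a sketch: the variable names are illustrative, and the ratio divides a peak write rate by an average tweet rate, so it should be read as an order of magnitude rather than a measured value.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 500_000_000               # "500M+" daily tweets
peak_timeline_writes_per_sec = 5_000_000   # "~5M timeline writes/sec during peak"

# Average tweet ingestion rate implied by the daily volume.
avg_tweets_per_sec = tweets_per_day / SECONDS_PER_DAY

# Rough timeline writes per tweet: peak fanout writes over the average tweet rate.
# Mixing a peak figure with an average one overstates the true ratio somewhat.
implied_fanout_per_tweet = (
    peak_timeline_writes_per_sec * SECONDS_PER_DAY / tweets_per_day
)

print(f"average tweets/sec: {avg_tweets_per_sec:,.0f}")                      # 5,787
print(f"implied timeline writes per tweet: {implied_fanout_per_tweet:,.0f}")  # 864
```

Roughly 5,800 tweets per second turning into millions of timeline writes per second is the write amplification that the fanout design has to absorb.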

Architecture Diagram

Diagram components (type in parentheses): Client (client), API Gateway (gateway), Tweet Service (service), Timeline Service (service), Fanout Service (service), Search Service (service), Trends Service (service), Timeline Cache (cache), Tweet Store (database), Social Graph (database), Media CDN (cdn), Kafka (queue)

Data Flow

1. Client → API Gateway: Post Tweet
   User submits a tweet. Gateway validates auth and routes to Tweet Service.

2. API Gateway → Tweet Service: Create
   Tweet Service validates content, stores tweet, and publishes to Kafka.

3. Tweet Service → Tweet Store: Persist
   Tweet stored durably in Manhattan (distributed key-value store).

4. Tweet Service → Kafka: Publish Event
   Tweet creation event published to Kafka for async processing.

5. Kafka → Fanout Service: Trigger Fanout
   Fanout Service consumes events and distributes to follower timelines.

6. Fanout Service → Social Graph: Get Followers
   Queries FlockDB for the tweeter's follower list.

7. Fanout Service → Timeline Cache: Write Timelines
   Prepends tweet ID to each follower's cached timeline in Redis.

8. Client → Timeline Service: Load Timeline
   User opens app. Timeline Service reads from cache and merges in real-time tweets.

9. Timeline Service → Timeline Cache: Read Cache
   Pre-computed timeline read from Redis. For celebrities, fan-out on read is used instead.

10. Kafka → Search Service: Index
    Tweets indexed in near-real-time for search (Earlybird inverted index).

11. Kafka → Trends Service: Detect Trends
    Streaming algorithms detect emerging topics from tweet velocity.
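The write path (steps 5 to 7) and read path (steps 8 and 9) can be sketched in memory as follows. This is a minimal illustration, not Twitter's implementation: plain dictionaries stand in for the Tweet Store, Social Graph, and Timeline Cache, Kafka is skipped so fanout runs inline, and all names (`post_tweet`, `load_timeline`, `TIMELINE_LIMIT`) are hypothetical. It also assumes tweet IDs increase over time, so sorting by ID approximates chronological order.

```python
from collections import defaultdict, deque
from itertools import count

CELEBRITY_THRESHOLD = 10_000  # follower cutoff mentioned in this document
TIMELINE_LIMIT = 800          # hypothetical cap on cached IDs per user

tweets = {}                        # tweet_id -> (author, text); stands in for the Tweet Store
followers = defaultdict(set)       # author -> follower ids; stands in for the Social Graph
following = defaultdict(set)       # user -> authors they follow
author_tweets = defaultdict(list)  # author -> own tweet ids, oldest first
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))  # stands in for the Timeline Cache
_next_id = count(1)                # assumed time-ordered ID generator


def follow(follower, author):
    followers[author].add(follower)
    following[follower].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def post_tweet(author, text):
    """Persist the tweet, then fan out on write unless the author is a celebrity."""
    tweet_id = next(_next_id)
    tweets[tweet_id] = (author, text)
    author_tweets[author].append(tweet_id)
    if not is_celebrity(author):
        for follower in followers[author]:
            # Prepend; the deque drops the oldest entry once the cap is reached.
            timelines[follower].appendleft(tweet_id)
    return tweet_id


def load_timeline(user, limit=20):
    """Merge the pre-computed timeline with recent tweets pulled from followed celebrities."""
    ids = set(timelines[user])
    for author in following[user]:
        if is_celebrity(author):
            ids.update(author_tweets[author][-limit:])  # fan-out on read
    return sorted(ids, reverse=True)[:limit]
```

For example, after `follow("bob", "alice")` and `post_tweet("alice", "hello")`, the new tweet's ID is already in `timelines["bob"]`, while a tweet from an account with 10,000 or more followers only appears when `load_timeline("bob")` merges it in at read time.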

Key Architectural Decisions

  • Hybrid fan-out: fan-out on write for regular users (<10K followers), fan-out on read for celebrities — balances write amplification vs read latency
  • Redis for timeline cache: list prepends are O(1), and sorted sets (O(log N) inserts) keep entries ordered by score, which suits timeline assembly
  • Manhattan (custom distributed DB) over Cassandra for tweet storage — optimized for Twitter's specific access patterns
  • Earlybird (custom Lucene-based search) for real-time tweet indexing within seconds of posting
  • Separate read and write paths to independently scale timeline reads vs tweet ingestion
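To make the real-time indexing decision concrete, here is a toy in-memory inverted index. It is a sketch of the general technique only, not Earlybird or Lucene: the `TweetIndex` class, its tokenizer, and the newest-first ordering by tweet ID are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: words, with an optional leading # or @ kept on the term.
TOKEN_RE = re.compile(r"[#@]?\w+")


class TweetIndex:
    """Maps each term to the IDs of tweets containing it."""

    def __init__(self):
        self._postings = defaultdict(list)  # term -> tweet ids in insertion order

    def add(self, tweet_id, text):
        # A tweet is searchable as soon as this call returns.
        for term in set(TOKEN_RE.findall(text.lower())):
            self._postings[term].append(tweet_id)

    def search(self, query, limit=10):
        """Return IDs of tweets containing every query term, newest (highest ID) first."""
        terms = TOKEN_RE.findall(query.lower())
        if not terms:
            return []
        posting_sets = [set(self._postings.get(term, ())) for term in terms]
        matches = set.intersection(*posting_sets)
        return sorted(matches, reverse=True)[:limit]
```

A production index would also shard by time or tweet ID and rank results; this version only shows why appending to posting lists makes new tweets searchable almost immediately.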

Tradeoffs

Strengths

  • Hybrid fanout elegantly handles the celebrity follower problem (Lady Gaga has 80M+ followers)
  • Pre-computed timelines in Redis provide sub-100ms timeline loads
  • Real-time search indexing means tweets are searchable within seconds
  • Kafka decouples tweet ingestion from all downstream processing

Weaknesses

  • Fan-out on write creates massive write amplification: a single tweet from an account with millions of followers would generate millions of cache writes, which is why such accounts fall back to fan-out on read
  • Timeline cache requires ~150TB of Redis, a significant operational cost
  • Hybrid approach adds complexity — must determine user's fanout strategy dynamically
  • Delete/edit propagation across all cached timelines is complex and eventually consistent
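One common way to soften the delete-propagation problem is to filter deleted IDs at read time instead of rewriting every cached timeline. The sketch below illustrates that idea under stated assumptions; it is not described in the source as Twitter's approach. The `TombstoneFilter` class and its TTL are hypothetical, and purging a tombstone is only safe if the deleted ID has already aged out of every cached timeline by then.

```python
import time


class TombstoneFilter:
    """Hides deleted tweet IDs at read time rather than editing cached timelines."""

    def __init__(self, ttl_seconds=7 * 24 * 3600, clock=time.time):
        self._deleted_at = {}            # tweet_id -> time the delete was recorded
        self._ttl_seconds = ttl_seconds  # hypothetical retention period for tombstones
        self._clock = clock              # injectable clock, useful for testing

    def mark_deleted(self, tweet_id):
        self._deleted_at[tweet_id] = self._clock()

    def visible(self, cached_ids):
        """Drop tombstoned IDs from a cached timeline, preserving order."""
        return [t for t in cached_ids if t not in self._deleted_at]

    def purge_expired(self):
        """Forget tombstones older than the TTL and return how many were removed."""
        cutoff = self._clock() - self._ttl_seconds
        expired = [t for t, ts in self._deleted_at.items() if ts <= cutoff]
        for tweet_id in expired:
            del self._deleted_at[tweet_id]
        return len(expired)
```

The tradeoff is a small lookup on every timeline read in exchange for avoiding millions of cache rewrites per delete; cached timelines stay eventually consistent while readers see the delete immediately.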

Interview Drilldown Questions

  • How do you decide the follower threshold between fan-out on write and fan-out on read?
  • How would you handle tweet deletion across millions of pre-computed timelines?
  • How does the trending algorithm distinguish genuine trends from spam/bots?
  • How would you design the notification system for @mentions and replies?
  • What's the strategy for timeline ranking vs chronological ordering?
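As a starting point for the trending question, the sketch below scores hashtags by how much the number of distinct authors grew between a baseline window and the current window. It is a batch approximation over two windows, not a production streaming algorithm, and the function name, thresholds, and `(author, hashtag)` event shape are all assumptions. Counting distinct authors rather than raw tweets is one simple guard against a single bot inflating a topic.

```python
from collections import defaultdict


def trending_hashtags(current_window, baseline_window, min_authors=3, min_ratio=2.0):
    """Return hashtags whose distinct-author count grew sharply versus the baseline.

    Each window is an iterable of (author, hashtag) pairs.
    """

    def distinct_authors(events):
        authors_by_tag = defaultdict(set)
        for author, tag in events:
            authors_by_tag[tag.lower()].add(author)
        return {tag: len(authors) for tag, authors in authors_by_tag.items()}

    current = distinct_authors(current_window)
    baseline = distinct_authors(baseline_window)

    scored = []
    for tag, author_count in current.items():
        if author_count < min_authors:
            continue  # too few distinct accounts; likely noise or one spammer
        # +1 smoothing so tags that are new in this window do not divide by zero.
        ratio = author_count / (baseline.get(tag, 0) + 1)
        if ratio >= min_ratio:
            scored.append((ratio, tag))

    # Highest growth first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in scored]
```

A hashtag posted fifty times by one account is filtered out by `min_authors`, while a tag that is steadily popular in both windows is filtered out by `min_ratio`, since velocity rather than volume is what signals an emerging topic.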

Components

  • Client (client): timeline rendering and tweet composition on Web, iOS, and Android
  • API Gateway (gateway): routes requests, handles auth, rate limiting, and API versioning
  • Tweet Service (service): handles tweet creation, validation, and fanout triggering
  • Timeline Service (service): assembles user home timeline from pre-computed and real-time sources
  • Fanout Service (service): distributes tweets to follower timelines, using fan-out on write for most users
  • Search Service (service): real-time tweet indexing and search (Earlybird)
  • Trends Service (service): detects trending topics using streaming algorithms
  • Timeline Cache (cache): Redis clusters storing pre-computed timelines for each user
  • Tweet Store (database): durable storage for all tweets (Manhattan distributed DB)
  • Social Graph (database): follower/following relationships (FlockDB)
  • Media CDN (cdn): images, videos, GIFs served from edge locations
  • Kafka (queue): event stream for fanout, analytics, and real-time processing

Source: editorial — Synthesized from Twitter engineering blog, public architecture talks, and system design references
