Push vs Pull Architecture
Push architectures precompute and deliver data to recipients ahead of time (fan-out-on-write), while pull architectures compute data on demand when it is requested (fan-out-on-read). The choice depends on your read-to-write ratio and on how followers are distributed across accounts.
Push (fan-out-on-write): When a user posts content, the system immediately writes a copy to every follower's feed/inbox. Reads are fast because the feed is pre-materialized, but writes are expensive: a single post from a celebrity with 50M followers triggers 50M write operations.
Pull (fan-out-on-read): When a user opens their feed, the system queries the recent posts of every account they follow and merges them at request time. Writes are cheap (store once), but reads are expensive: opening a feed can mean querying hundreds of sources and sorting the combined results.
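The two models can be sketched side by side. This is a minimal in-memory illustration, not a production design: the class names `PushFeed` and `PullFeed`, the shared counter used as a timestamp, and the tuple layout `(ts, author, text)` are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import count

_clock = count()  # monotonically increasing stand-in for a timestamp (assumption)


class PushFeed:
    """Fan-out-on-write: copy each post into every follower's inbox."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> users following them
        self.inboxes = defaultdict(list)   # user -> [(ts, author, text), ...]

    def follow(self, user, author):
        self.followers[author].add(user)

    def post(self, author, text):
        entry = (next(_clock), author, text)
        for user in self.followers[author]:  # one write per follower
            self.inboxes[user].append(entry)
        return len(self.followers[author])   # number of inbox writes performed

    def read(self, user, limit=10):
        # Inboxes are appended in timestamp order, so reversing gives newest first.
        return list(reversed(self.inboxes[user]))[:limit]


class PullFeed:
    """Fan-out-on-read: store each post once, merge at request time."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> authors they follow
        self.posts = defaultdict(list)     # author -> [(ts, author, text), ...]

    def follow(self, user, author):
        self.following[user].add(author)

    def post(self, author, text):
        self.posts[author].append((next(_clock), author, text))
        return 1  # a single write, regardless of follower count

    def read(self, user, limit=10):
        # Read cost grows with the number of followed accounts.
        merged = [e for a in self.following[user] for e in self.posts[a]]
        merged.sort(reverse=True)
        return merged[:limit]
```

In the push version the loop in `post` is where the cost lands; in the pull version it moves to the merge in `read`.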
Twitter has described a hybrid: push for normal users (pre-materialized timelines) and pull for celebrity accounts (Lady Gaga's tweets are fetched on demand and merged into your timeline at read time). This avoids the write amplification of pushing every celebrity post to 50M feeds.
Tradeoffs
Push (Fan-Out-On-Write) — Strengths
- Ultra-fast reads — data is pre-computed and waiting
- Predictable read latency — no variable query costs
- Real-time delivery — recipients get data immediately
- Read path is trivially simple — just fetch from cache/inbox
Push — Weaknesses
- Write amplification — one write becomes N writes (one per recipient)
- Hot-user problem — celebrity accounts cause write storms
- Wasted work — pre-computing feeds for users who may never read them
- Harder to change ranking/filtering — pre-computed data can't easily incorporate real-time context
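A quick back-of-the-envelope calculation shows why write amplification dominates push designs. The helper names and the throughput figures below are illustrative assumptions; only the 50M follower count comes from the text above.

```python
def fanout_writes_per_second(posts_per_second, avg_followers):
    """Inbox writes generated per second under pure fan-out-on-write."""
    return posts_per_second * avg_followers


def celebrity_fanout_seconds(followers, inbox_writes_per_second):
    """Time to finish delivering one post at a fixed write throughput."""
    return followers / inbox_writes_per_second


# Assumed workload: 5,000 posts/s from accounts averaging 200 followers.
print(fanout_writes_per_second(5_000, 200))            # 1000000 inbox writes/s

# One post to 50M followers at an assumed 1M inbox writes/s.
print(celebrity_fanout_seconds(50_000_000, 1_000_000))  # 50.0 seconds
```

Under these assumed numbers, a single celebrity post would occupy the entire write budget for close to a minute, which is the hot-user problem in concrete terms.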
Pull (Fan-Out-On-Read) — Strengths
- Minimal write cost — store once, compute on demand
- Always fresh — computed at request time with latest data and context
- Natural backpressure — consumers control their consumption rate
- No wasted computation for inactive users
Pull — Weaknesses
- Higher read latency — computing on demand takes time
- Variable read cost — users following 1000 accounts vs 10 have very different query costs
- Peak load sensitivity — many users opening feeds simultaneously creates query storms
- Requires sophisticated caching to maintain acceptable latency
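One simple form of the caching mentioned in the last point is a per-user cache with a time-to-live in front of the pull computation. The sketch below is an assumption-laden illustration: `CachedFeedReader`, `compute_feed`, and the 30-second TTL are hypothetical, and real systems typically use a shared cache and explicit invalidation rather than a local dictionary.

```python
import time


class CachedFeedReader:
    """Wraps an expensive pull-style feed computation with a TTL cache."""

    def __init__(self, compute_feed, ttl_seconds=30.0, clock=time.monotonic):
        self._compute_feed = compute_feed  # callable: user -> feed (hypothetical)
        self._ttl = ttl_seconds
        self._clock = clock                # injectable so tests can control time
        self._cache = {}                   # user -> (expires_at, feed)
        self.misses = 0                    # how many times the feed was recomputed

    def read(self, user):
        now = self._clock()
        hit = self._cache.get(user)
        if hit is not None and hit[0] > now:
            return hit[1]                  # still fresh enough: skip the fan-out query
        self.misses += 1
        feed = self._compute_feed(user)
        self._cache[user] = (now + self._ttl, feed)
        return feed
```

The design choice here is a tradeoff: a longer TTL absorbs more of the query storm but gives up part of the "always fresh" property that made pull attractive in the first place.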
The Hybrid Approach
Most production systems use both: push for bounded-fanout scenarios (normal users, notifications), and pull for unbounded-fanout accounts (celebrities) and ML-ranked content. The cutoff varies by system: Twitter reportedly switches at around 10K followers, while Instagram's tipping point is framed in terms of ML ranking rather than a follower count.
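A hybrid can be sketched by routing each post on the author's follower count. Everything here is illustrative: `HybridFeed`, the `outboxes` and `inboxes` names, and the default `PUSH_THRESHOLD` (set to echo the ~10K figure quoted above) are assumptions, not any company's actual implementation.

```python
from collections import defaultdict
from itertools import count

PUSH_THRESHOLD = 10_000  # assumed cutoff for switching an author from push to pull


class HybridFeed:
    def __init__(self, push_threshold=PUSH_THRESHOLD):
        self.push_threshold = push_threshold
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.inboxes = defaultdict(list)   # user -> pushed (ts, author, text) entries
        self.outboxes = defaultdict(list)  # author -> every post they made
        self._clock = count()              # stand-in for a timestamp

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def _is_celebrity(self, author):
        return len(self.followers[author]) > self.push_threshold

    def post(self, author, text):
        entry = (next(self._clock), author, text)
        self.outboxes[author].append(entry)       # always store once
        if self._is_celebrity(author):
            return 1                              # no fan-out: readers pull this
        for user in self.followers[author]:
            self.inboxes[user].append(entry)      # bounded fan-out
        return 1 + len(self.followers[author])    # total writes performed

    def read(self, user, limit=10):
        # Start from the pushed inbox, then pull celebrity posts at read time.
        # A set removes duplicates if an author crossed the threshold after
        # some posts had already been pushed.
        entries = set(self.inboxes[user])
        for author in self.following[user]:
            if self._is_celebrity(author):
                entries.update(self.outboxes[author])
        return sorted(entries, reverse=True)[:limit]
```

This sketch ignores several things a real system would have to handle, such as authors who drop back below the threshold and backfilling inboxes for new followers.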
Likely Follow-Up Questions
- How did Twitter solve the celebrity fan-out problem?
- When would you choose polling over webhooks despite the higher latency?
- How does moving from a chronological to a ranked feed change the push/pull tradeoff?
- What's write amplification and how does it affect push architectures at scale?
- How would you design a notification system that works for both online and offline users?
- What are the security risks of push-based webhook systems?
Source: editorial — Synthesized from Twitter's engineering blog, Instagram's architecture talks, LinkedIn's feed architecture paper, and production patterns at Slack and Discord