Tags: Systems, fan-out, feed, timeline, webhooks, polling, real-time, write-heavy, read-heavy

Push vs Pull Architecture

Push architectures precompute and deliver data to recipients ahead of time (fan-out-on-write), while pull architectures compute data on demand when requested (fan-out-on-read) — and the choice depends on your read-to-write ratio and follower distribution.

Push (fan-out-on-write): When a user posts content, the system immediately writes a copy to every follower's feed/inbox. Reads are instant (pre-materialized), but writes are expensive — a celebrity with 50M followers means 50M write operations per post.
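As a minimal sketch of the push model, assuming an in-memory store, the write path below copies each post ID into every follower's feed and the read path is a plain lookup. The names `followers`, `feeds`, `FEED_LIMIT`, `follow`, `publish`, and `read_feed` are hypothetical and chosen only for illustration; a production system would typically keep these feeds in a cache or database rather than in process memory.

```python
from collections import defaultdict, deque

FEED_LIMIT = 800  # hypothetical cap on entries kept per feed

followers = defaultdict(set)  # author -> set of follower ids
feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))  # user -> post ids, newest first


def follow(follower, author):
    followers[author].add(follower)


def publish(author, post_id):
    """Fan out on write: copy the post id into every follower's feed."""
    writes = 0
    for user in followers[author]:
        feeds[user].appendleft(post_id)
        writes += 1
    return writes  # number of feed writes caused by this single post


def read_feed(user, limit=20):
    """The read path is a lookup; no merging or sorting is needed."""
    return list(feeds[user])[:limit]
```

The return value of `publish` makes the cost visible: it equals the author's follower count, which is exactly the write amplification described above.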

Pull (fan-out-on-read): When a user opens their feed, the system queries the recent posts of every followed account and merges them at request time. Writes are cheap (store once), but reads are expensive: opening a feed means querying hundreds of sources and sorting the results.
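The pull model can be sketched the same way, again assuming an in-memory store. Each post is written once under its author, and the feed is assembled at read time with a k-way merge over the followed authors' post lists. The names `following`, `posts`, `publish`, and `read_feed` are hypothetical, and the sketch assumes posts arrive in increasing timestamp order so that each per-author list stays sorted newest first.

```python
import heapq
from collections import defaultdict
from itertools import islice

following = defaultdict(set)  # user -> authors they follow
posts = defaultdict(list)     # author -> [(timestamp, post_id)], newest first


def publish(author, timestamp, post_id):
    """Write path: store the post once, under its author."""
    # Assumes timestamps only increase, so inserting at the front keeps order.
    posts[author].insert(0, (timestamp, post_id))


def read_feed(user, limit=20):
    """Fan out on read: merge every followed author's posts, newest first."""
    sources = [posts[author] for author in following[user]]
    merged = heapq.merge(*sources, reverse=True)  # lazy k-way merge
    return [post_id for _, post_id in islice(merged, limit)]
```

The work done by `read_feed` grows with the number of followed accounts, which is why a user following 1000 accounts pays far more per read than one following 10.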

Twitter uses a hybrid: push for normal users (pre-materialized timelines) and pull for celebrity accounts (Lady Gaga's tweets are fetched on demand and merged into your timeline at read time). This avoids the write amplification of pushing each celebrity post to 50M feeds.
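The following is a rough sketch of how such a hybrid could be wired together, not a description of Twitter's actual implementation. Posts from accounts below a follower cutoff are pushed into follower inboxes, while posts from accounts above it stay in the author's outbox and are merged in at read time. `CELEBRITY_THRESHOLD`, `inbox`, `outbox`, and the function names are hypothetical; the cutoff is set to 3 only to keep the example small, and timestamps are assumed to increase across all posts.

```python
import heapq
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, tiny so the example is easy to trace

followers = defaultdict(set)  # author -> follower ids
following = defaultdict(set)  # user -> author ids
inbox = defaultdict(list)     # user -> pushed [(ts, post_id)], newest first
outbox = defaultdict(list)    # author -> own [(ts, post_id)], newest first


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, ts, post_id):
    """Store once; push to inboxes only when the fan-out is bounded."""
    outbox[author].insert(0, (ts, post_id))
    if is_celebrity(author):
        return 0  # no fan-out: followers pull this post at read time
    for user in followers[author]:
        inbox[user].insert(0, (ts, post_id))
    return len(followers[author])


def read_feed(user, limit=20):
    """Merge the pre-materialized inbox with followed celebrities' outboxes."""
    celebrity_sources = [outbox[a] for a in following[user] if is_celebrity(a)]
    merged = heapq.merge(inbox[user], *celebrity_sources, reverse=True)
    seen, feed = set(), []
    for _, post_id in merged:
        if len(feed) >= limit:
            break
        if post_id not in seen:  # an author may cross the cutoff after a push
            seen.add(post_id)
            feed.append(post_id)
    return feed
```

The deduplication step covers one edge case of this particular sketch: an account that crosses the cutoff after some of its posts were already pushed would otherwise appear twice in the merged feed.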

Tradeoffs

Push (Fan-Out-On-Write) — Strengths

Ultra-fast reads — data is pre-computed and waiting
Predictable read latency — no variable query costs
Real-time delivery — recipients get data immediately
Read path is trivially simple — just fetch from cache/inbox

Push — Weaknesses

Write amplification — one write becomes N writes (one per recipient)
Hot-user problem — celebrity accounts cause write storms
Wasted work — pre-computing feeds for users who may never read them
Harder to change ranking/filtering — pre-computed data can't easily incorporate real-time context
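To put rough numbers on write amplification and the hot-user problem, the back-of-envelope calculation below counts feed writes per day under push. The workload is invented for illustration: 1,000 ordinary accounts with 200 followers each and a single account with 50M followers, all posting twice a day. The function name `feed_writes_per_day` is hypothetical.

```python
def feed_writes_per_day(accounts):
    """accounts: iterable of (follower_count, posts_per_day) pairs."""
    return sum(follower_count * posts for follower_count, posts in accounts)


# Hypothetical workload: 1,000 ordinary accounts and one celebrity.
ordinary = [(200, 2)] * 1_000
celebrity = [(50_000_000, 2)]

ordinary_writes = feed_writes_per_day(ordinary)    # 1,000 * 200 * 2
celebrity_writes = feed_writes_per_day(celebrity)  # 50,000,000 * 2
celebrity_share = celebrity_writes / (ordinary_writes + celebrity_writes)

print(ordinary_writes, celebrity_writes, f"{celebrity_share:.1%}")
# prints: 400000 100000000 99.6%
```

Under these assumed numbers a single account generates nearly all of the feed writes, which is the imbalance that motivates treating high-follower accounts differently.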

Pull (Fan-Out-On-Read) — Strengths

Minimal write cost — store once, compute on demand
Always fresh — computed at request time with latest data and context
Natural backpressure — consumers control their consumption rate
No wasted computation for inactive users

Pull — Weaknesses

Higher read latency — computing on demand takes time
Variable read cost — users following 1000 accounts vs 10 have very different query costs
Peak load sensitivity — many users opening feeds simultaneously creates query storms
Requires sophisticated caching to maintain acceptable latency
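One simple form the caching mentioned above could take is a short time-to-live (TTL) cache in front of the pull-side merge. The sketch below is deliberately minimal and far from the sophisticated caching a real system would need (no eviction, no invalidation on new posts). `FeedCache`, `compute_feed`, and `ttl_seconds` are hypothetical names, and the clock is injectable only so the behavior can be exercised without waiting.

```python
import time


class FeedCache:
    """Caches computed feeds for a short time-to-live (TTL)."""

    def __init__(self, compute_feed, ttl_seconds=30.0, clock=time.monotonic):
        self.compute_feed = compute_feed  # the expensive pull-side merge
        self.ttl_seconds = ttl_seconds
        self.clock = clock                # injectable so time can be faked
        self._entries = {}                # user -> (stored_at, feed)

    def get(self, user):
        now = self.clock()
        entry = self._entries.get(user)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]               # hit: skip the fan-out-on-read work
        feed = self.compute_feed(user)    # miss or stale entry: recompute
        self._entries[user] = (now, feed)
        return feed
```

The design choice is a trade: a cached feed can be up to `ttl_seconds` stale, so the cache gives back some of pull's freshness in exchange for lower and more predictable read latency.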

The Hybrid Approach

Most production systems use both: push for bounded-fanout scenarios (normal users, notifications) and pull for unbounded-fanout cases (celebrities) and ML-ranked content. The threshold varies: Twitter switches at roughly 10K followers, while for Instagram the tipping point is ML ranking rather than a follower count.

Likely Follow-Up Questions

How did Twitter solve the celebrity fan-out problem?
When would you choose polling over webhooks despite the higher latency?
How does moving from a chronological to a ranked feed change the push/pull tradeoff?
What's write amplification and how does it affect push architectures at scale?
How would you design a notification system that works for both online and offline users?
What are the security risks of push-based webhook systems?

Related Concepts

Pub/Sub vs Message Queues
Synchronous vs Asynchronous Communication
Caching
Content Delivery Networks
Message Queues
Database Replication

Source: editorial — Synthesized from Twitter's engineering blog, Instagram's architecture talks, LinkedIn's feed architecture paper, and production patterns at Slack and Discord