
Message Queues

Message queues decouple producers and consumers by buffering messages in a durable, ordered store, enabling asynchronous processing, load leveling, and fault-tolerant communication between services.

A message queue sits between a producer (sender) and a consumer (processor), storing messages until the consumer is ready to process them. This decouples services in time and space: the producer doesn't need to know who consumes, and the consumer doesn't need to be available when the message is sent. The two key delivery patterns are point-to-point (each message is delivered to a single consumer) and pub/sub (each subscriber receives its own copy of every message). Major systems include Kafka (distributed log), RabbitMQ (AMQP broker), and SQS (managed AWS queue).
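The difference between the two delivery patterns can be sketched with a small in-memory model. This is only an illustration: the class names `PointToPointQueue` and `PubSubTopic` are hypothetical and do not correspond to the API of Kafka, RabbitMQ, or SQS, and the sketch ignores durability, acknowledgements, and networking.

```python
from collections import deque


class PointToPointQueue:
    """Hypothetical in-memory queue: each message goes to a single consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Return the oldest message, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Hypothetical in-memory topic: every subscriber gets its own copy."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = deque()

    def publish(self, message):
        # Fan out: append the message to every subscriber's inbox.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def receive(self, name):
        inbox = self._inboxes[name]
        return inbox.popleft() if inbox else None


# Point-to-point: two workers compete, each message is handled once.
queue = PointToPointQueue()
queue.send("order-1")
queue.send("order-2")
worker_a = queue.receive()
worker_b = queue.receive()

# Pub/sub: both subscribers see the same message.
topic = PubSubTopic()
topic.subscribe("billing")
topic.subscribe("email")
topic.publish("order-1")
billing_copy = topic.receive("billing")
email_copy = topic.receive("email")
```

In this sketch the producer never references a consumer directly, which is the "decoupling in space" described above; the buffered deque stands in for "decoupling in time".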

Tradeoffs

Strengths

  • Decoupling: Producers and consumers evolve independently; services can be deployed, scaled, and updated separately.
  • Resilience: Messages that were not acknowledged before a consumer crash stay in the queue and are redelivered once a consumer recovers.
  • Load leveling: Absorbs traffic spikes, protecting downstream services from overload.
  • Scalability: Kafka scales horizontally by partitioning topics and can reach millions of messages/sec on large clusters; SQS standard queues scale automatically with nearly unlimited throughput.
  • Replay: Kafka's log retention enables reprocessing historical events and bootstrapping new consumers.
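The replay strength listed above follows from the log model: records are kept after they are read, and each consumer tracks its own offset. A minimal sketch of that idea is shown here, assuming hypothetical `Log` and `Consumer` classes; it is not the Kafka client API and omits partitions, retention limits, and persistence.

```python
class Log:
    """Hypothetical append-only log; records are retained after being read."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=100):
        return self._records[offset:offset + max_records]


class Consumer:
    """Hypothetical consumer that tracks its own position in the log."""

    def __init__(self, log, start_offset=0):
        self.log = log
        self.offset = start_offset

    def poll(self):
        records = self.log.read(self.offset)
        self.offset += len(records)
        return records


log = Log()
for event in ["user-created", "user-updated", "user-deleted"]:
    log.append(event)

# An existing consumer reads everything once.
live = Consumer(log)
first_batch = live.poll()

# A new consumer added later bootstraps by replaying from offset 0.
late_joiner = Consumer(log, start_offset=0)
replayed = late_joiner.poll()

# After that, both consumers only see newly appended records.
log.append("user-restored")
live_next = live.poll()
late_next = late_joiner.poll()
```

Because reading does not delete anything, adding `late_joiner` has no effect on `live`; a traditional queue that removes messages on consumption cannot offer this.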

Weaknesses

  • Added latency: Async processing means responses aren't immediate, so queues are a poor fit for user-facing request paths that need instant results.
  • Complexity: Debugging distributed async flows is harder than tracing a synchronous call chain.
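  • Ordering challenges: A strict global order requires a single partition or consumer, which sacrifices parallelism; partitioned systems such as Kafka guarantee order only within a partition.
  • Exactly-once is hard: Most systems provide at-least-once delivery, so duplicates can occur and consumers must be idempotent.
  • Operational overhead: Self-managed Kafka clusters require careful tuning, monitoring, and capacity planning; managed services offload much of this work.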
  • Data consistency: Eventual consistency between services can create user-visible anomalies if not handled carefully.
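The idempotent-consumer requirement mentioned in the list above can be sketched as deduplication by message ID. This is one possible approach under stated assumptions: every message carries a unique `id` field (an assumption, not something the brokers guarantee for you), and the `IdempotentConsumer` class is hypothetical. In a real system the processed-ID record and the side effect would need to be committed atomically, for example in the same database transaction.

```python
class IdempotentConsumer:
    """Hypothetical consumer that ignores messages it has already applied."""

    def __init__(self):
        self.processed_ids = set()  # stands in for a durable dedup table
        self.balance = 0

    def handle(self, message):
        # Skip duplicates produced by at-least-once redelivery.
        if message["id"] in self.processed_ids:
            return False
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


# The broker redelivers "m1", e.g. because the first ack was lost.
deliveries = [
    {"id": "m1", "amount": 50},
    {"id": "m2", "amount": 25},
    {"id": "m1", "amount": 50},
]

consumer = IdempotentConsumer()
results = [consumer.handle(message) for message in deliveries]
```

Without the ID check the redelivered message would be applied twice and the balance would be 125 instead of 75.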

Likely Follow-Up Questions

  • How would you ensure exactly-once processing in a system that writes to an external database?
  • What is the outbox pattern and when would you use it?
  • How do you handle message ordering when scaling consumers horizontally?
  • When would you choose Kafka over RabbitMQ?
  • How would you design a dead-letter queue strategy?
  • What is consumer lag and how do you monitor and respond to it?

Source: editorial — Synthesized from Apache Kafka documentation, RabbitMQ guides, AWS SQS documentation, and Uber/LinkedIn engineering blogs.
