Tags: API, throttling, token-bucket, sliding-window, api-protection, DDoS, 429, redis

Rate Limiting

Rate limiting controls the number of requests a client can make to a service within a time window, protecting against abuse, ensuring fair usage, and preventing resource exhaustion.

Clients are typically identified by IP address, API key, or user ID. When a client exceeds its limit, the server returns HTTP 429 Too Many Requests. Common algorithms are token bucket (permits short bursts while enforcing a steady average rate), sliding window (accurate counts over a rolling interval), and fixed window (simple, but a client can send up to twice the limit across a window boundary). Rate limiters are typically implemented in the API gateway or a middleware layer, with Redis holding the shared counters so that every server sees the same state.
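As a rough illustration of the token bucket idea, here is a minimal single-process sketch in Python. It is not taken from any particular library: the names `TokenBucket`, `allow`, and `handle_request` are hypothetical, the state lives in memory rather than in Redis, and the `Retry-After` value is only one reasonable way to tell a client when to come back.

```python
import math
import time


class TokenBucket:
    """In-memory token bucket (illustrative only; not distributed)."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)        # maximum burst size, in tokens
        self.refill_rate = float(refill_rate)  # tokens added per second
        self.clock = clock                     # injectable so tests can control time
        self.tokens = self.capacity            # start with a full bucket
        self.last_refill = self.clock()

    def allow(self, cost=1.0):
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def handle_request(bucket):
    """Hypothetical handler: returns (status_code, headers) for one request."""
    if bucket.allow():
        return 200, {}
    # Seconds until at least one whole token is available again.
    wait = math.ceil((1.0 - bucket.tokens) / bucket.refill_rate)
    return 429, {"Retry-After": str(wait)}
```

With `capacity=3` and `refill_rate=1.0`, a client can send three requests back to back, after which further requests receive 429 until tokens refill at one per second. In a multi-server deployment the token count and last-refill timestamp would need to live in shared storage and be updated atomically, which this sketch does not attempt.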

Tradeoffs

Strengths

  • Protection: Prevents abuse, brute-force attacks, and accidental resource exhaustion.
  • Fairness: Ensures no single client monopolizes shared resources.
  • Cost control: Limits expensive downstream calls (third-party APIs, database queries).
  • Simplicity: Token bucket and sliding window algorithms are straightforward to implement.
  • Monetization: Offering different rate limits for different pricing tiers is a proven business model.

Weaknesses

  • User experience: Legitimate users hitting rate limits is frustrating, especially if limits are too aggressive.
  • Distributed accuracy: Maintaining exact global counts across multiple servers adds latency and complexity.
  • Configuration complexity: Choosing the right limits requires traffic analysis and continuous tuning.
  • Circumvention: Sophisticated attackers can distribute requests across IPs/accounts to evade per-client limits.
  • Clock skew: Time-window-based algorithms can behave inconsistently if server clocks are not synchronized.
  • Legitimate burst handling: Strict rate limits can reject valid traffic spikes (e.g., a marketing campaign launch).

Likely Follow-Up Questions

  • How would you implement distributed rate limiting across multiple data centers?
  • What is the difference between token bucket and leaky bucket?
  • How do you rate limit in a microservices architecture where requests fan out?
  • How would you handle rate limiting for WebSocket connections?
  • What is the fixed-window boundary burst problem and how do sliding windows solve it?
  • How do you choose rate limits for a new API?
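For the boundary burst question above, a small simulation can make the difference concrete. The sketch below is an assumption-laden toy, not a production design: `FixedWindowLimiter`, `SlidingWindowLogLimiter`, and `count_allowed` are hypothetical names, timestamps are passed in explicitly as seconds, and the sliding variant shown is the "log" form that stores one timestamp per accepted request.

```python
from collections import deque


class FixedWindowLimiter:
    """Counts requests per aligned window (0-60s, 60-120s, ...)."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new aligned window has started: the counter resets to zero.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLogLimiter:
    """Keeps timestamps of accepted requests from the last `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.timestamps = deque()

    def allow(self, now):
        # Drop entries that have aged out of the rolling window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


def count_allowed(limiter, times):
    """Feed request timestamps to a limiter and count how many are accepted."""
    return sum(limiter.allow(t) for t in times)


# Five requests just before the 60s boundary, five just after it.
burst = [59.0] * 5 + [61.0] * 5
fixed_allowed = count_allowed(FixedWindowLimiter(limit=5, window=60), burst)
sliding_allowed = count_allowed(SlidingWindowLogLimiter(limit=5, window=60), burst)
```

With a limit of 5 per 60 seconds, the fixed-window limiter accepts all 10 requests because the counter resets at t=60, while the sliding-window log accepts only 5 because the first five requests are still inside the rolling 60-second interval. The cost of that accuracy is memory proportional to the limit per client.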

Source: editorial — Synthesized from Stripe/GitHub/Cloudflare API documentation, Redis rate limiting patterns, and IETF rate limiting RFCs.
