Rate Limiter

Difficulty: Easy · Category: Infrastructure

Overview

A rate limiter system design question tests your understanding of distributed systems, consistency, and real-time decision-making under load. Companies like Stripe, Uber, and Netflix ask it because rate limiting is critical for API protection, fair usage, and cost control. The challenge lies in enforcing limits across distributed servers without a single point of failure, and in handling edge cases such as bursts at window boundaries. This design matters in interviews because it combines algorithms (fixed window, sliding window, token bucket, leaky bucket) with systems concepts like Redis, distributed locks, and eventual consistency. Demonstrating when to use local vs. global limits and how to handle clock skew shows senior-level thinking.

Requirements

Functional

  • Limit number of requests per user/IP per time window
  • Support multiple algorithms: fixed window, sliding window, token bucket, leaky bucket
  • Return 429 Too Many Requests when limit exceeded
  • Include Retry-After header in 429 response
  • Support different limits per API endpoint or user tier
  • Allow whitelisting of certain clients (e.g., internal services)
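The 429 and Retry-After requirements above can be sketched as a small response-mapping helper. This is an illustrative sketch, not a specific framework's API; `RateLimitResult` and `build_response` are names assumed here.

```python
from dataclasses import dataclass

@dataclass
class RateLimitResult:
    allowed: bool
    retry_after_s: int = 0  # seconds until the client may retry

def build_response(result: RateLimitResult) -> tuple[int, dict]:
    """Map a limiter decision to an HTTP status code and headers."""
    if result.allowed:
        return 200, {}
    # 429 Too Many Requests, with Retry-After so well-behaved clients
    # can back off instead of retrying immediately.
    return 429, {"Retry-After": str(result.retry_after_s)}
```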

Non-Functional

  • Low latency — rate limit check must add <10ms overhead
  • High availability — rate limiter failure should not block requests (fail open vs fail closed)
  • Accuracy — minimize false positives/negatives at window boundaries
  • Scalability — support millions of unique keys (user IDs, IPs)
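The accuracy requirement is worth a concrete example: a naive fixed-window counter admits up to 2x the intended rate around a window boundary. The simulation below (an illustrative sketch, not production code) counts how many requests a fixed-window limiter would admit.

```python
def fixed_window_allowed(timestamps, limit, window_s):
    """Count how many of the given request timestamps a naive
    fixed-window limiter admits (windows start at multiples of window_s)."""
    counts = {}
    admitted = 0
    for t in timestamps:
        w = int(t // window_s)           # which fixed window t falls in
        counts[w] = counts.get(w, 0) + 1
        if counts[w] <= limit:
            admitted += 1
    return admitted

# 10 requests at t=59s and 10 more at t=61s, against a 10-req/60s limit:
# the two bursts land in different fixed windows, so all 20 are admitted —
# twice the intended rate within a 2-second span.
burst = [59.0] * 10 + [61.0] * 10
```

This boundary effect is what the sliding-window variants are designed to smooth out.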

Capacity Estimation

Assume 10M unique users and an aggregate peak of ~100K req/s. Every request triggers one rate-limit check, so the store must absorb ~100K ops/s. A single Redis node handles 100K+ ops/s, so one node suffices at peak, but shard by key for headroom and availability. At roughly 100 bytes per counter key, 10M active keys need about 1 GB of memory.
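The estimate can be sanity-checked with quick arithmetic; the peak rate and bytes-per-key figures are assumptions, not measurements.

```python
users = 10_000_000
peak_rps = 100_000        # assumed system-wide peak request rate
bytes_per_key = 100       # rough: key string + counter + TTL overhead

checks_per_s = peak_rps   # one rate-limit check per request
memory_gb = users * bytes_per_key / 1e9
```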

Architecture Diagram

Clients → API Gateway → Rate Limit Middleware → Rate Limit Service → Redis (Counters)
(The Rate Limit Service also reads rules from the Config Store.)

Component Deep Dive

Rate Limit Middleware

Intercepts each request before it reaches the application. Extracts user/key, calls Rate Limit Service, and returns 429 or forwards request.
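A framework-agnostic sketch of this interception step is below. The `limiter` object, `check` method, and `key_fn` extractor are assumed names for illustration, not any particular framework's API.

```python
def rate_limit_middleware(handler, limiter, key_fn):
    """Wrap a request handler: consult the limiter first, short-circuit
    with 429 on denial, otherwise forward to the real handler."""
    def wrapped(request):
        key = key_fn(request)                  # e.g. user ID or client IP
        allowed, retry_after = limiter.check(key)
        if not allowed:
            return {"status": 429,
                    "headers": {"Retry-After": str(retry_after)}}
        return handler(request)
    return wrapped
```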

Rate Limit Service

Implements the chosen algorithm (e.g., sliding window with Redis). Checks current count, increments if under limit, returns allow/deny.
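One common choice is the sliding-window counter, which approximates a true sliding window by weighting the previous fixed window's count by its remaining overlap. The sketch below keeps state in memory (one instance per key); in production this state would live in Redis.

```python
import time

class SlidingWindowCounter:
    """Sliding-window counter approximation. Estimated count =
    prev_window_count * overlap_fraction + current_window_count."""
    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self.counts = {}  # window index -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        w = int(now // self.window_s)
        prev = self.counts.get(w - 1, 0)
        curr = self.counts.get(w, 0)
        # Fraction of the previous window still inside the sliding window.
        overlap = 1.0 - (now % self.window_s) / self.window_s
        estimated = prev * overlap + curr
        if estimated >= self.limit:
            return False
        self.counts[w] = curr + 1
        return True
```

Unlike a plain fixed window, a burst just after a window boundary is still mostly counted against the previous window, which suppresses the 2x boundary spike.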

Redis / Distributed Store

Stores key → (count, window_start) or token bucket state. Uses INCR, EXPIRE for fixed window; Lua scripts for sliding window atomicity.
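The read-modify-write against Redis must be atomic; with real Redis, INCR and EXPIRE should run together in a Lua script or MULTI/EXEC. As a hedged sketch, the snippet below emulates that fixed-window pattern with an in-memory stand-in so the logic is visible without a Redis server.

```python
import time

class FixedWindowStore:
    """In-memory stand-in for the Redis pattern: INCR the per-window key,
    set an expiry on first increment so counters clean themselves up."""
    def __init__(self):
        self.data = {}  # key -> (count, expires_at)

    def incr_with_ttl(self, key, ttl_s, now=None):
        now = time.time() if now is None else now
        count, expires_at = self.data.get(key, (0, now + ttl_s))
        if now >= expires_at:                  # key has "expired"
            count, expires_at = 0, now + ttl_s
        count += 1
        self.data[key] = (count, expires_at)
        return count

def fixed_window_allow(store, key, limit, window_s, now=None):
    """Allow iff this request's increment stays within the limit."""
    return store.incr_with_ttl(key, window_s, now) <= limit
```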

Configuration Service

Stores limit rules per endpoint, user tier, or key prefix. Allows dynamic updates without redeployment.
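An illustrative rule schema for such a service is shown below; the field names, tiers, and default limit are assumptions, not a standard. First match wins, so more specific rules are listed first.

```python
RULES = [
    {"match": {"endpoint": "/api/search", "tier": "free"}, "limit": 10,   "window_s": 60},
    {"match": {"endpoint": "/api/search", "tier": "pro"},  "limit": 1000, "window_s": 60},
    {"match": {"tier": "internal"},                        "limit": None, "window_s": None},  # whitelisted
]

def resolve_rule(endpoint, tier):
    """Return the first rule whose match fields all agree; a missing
    match field acts as a wildcard. Falls back to an assumed default."""
    for rule in RULES:
        m = rule["match"]
        if m.get("endpoint", endpoint) == endpoint and m.get("tier", tier) == tier:
            return rule
    return {"limit": 100, "window_s": 60}  # assumed default rule
```

Because rules live in data rather than code, limits can be tightened or relaxed at runtime without a redeploy.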

Metrics & Alerts

Tracks rate limit hits, latency, and error rates. Alerts when limits are too aggressive or Redis is overloaded.

Database Design

Redis is the primary store: key = user_id:endpoint:window, value = count, TTL = window size. For sliding window log, use sorted set with timestamps. No traditional DB needed for core logic; config can live in config service or DB.
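The sorted-set sliding-window log can be sketched in memory using the same operations Redis would perform (ZREMRANGEBYSCORE to drop old timestamps, ZCARD to count, ZADD to record); the structure here is illustrative.

```python
import bisect

class SlidingWindowLog:
    """Exact sliding-window limiter: one timestamp stored per request,
    mirroring a Redis sorted set keyed by request time."""
    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self.log = {}  # key -> sorted list of request timestamps

    def allow(self, key, now):
        ts = self.log.setdefault(key, [])
        # Drop entries older than the window (ZREMRANGEBYSCORE equivalent).
        cutoff = now - self.window_s
        del ts[:bisect.bisect_right(ts, cutoff)]
        if len(ts) >= self.limit:   # ZCARD equivalent
            return False
        bisect.insort(ts, now)      # ZADD equivalent
        return True
```

This is exact (no boundary error) but costs memory proportional to the limit per key, which is why the counter-based variants are preferred at high limits.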

API Design

Method  Path           Description
GET     /api/check     Internal: check whether a request is allowed. Returns 200 with X-RateLimit-Remaining, or 429.
GET     /api/limits    Get the caller's current rate-limit status (remaining requests, reset time).
POST    /admin/limits  Update rate-limit rules (admin only).

Scalability & Trade-offs

Related System Designs