Distributed Cache

Medium Storage

Overview

Designing a Distributed Cache (like Memcached or Redis Cluster) tests your understanding of caching strategies, consistency, and fault tolerance at scale. Caches sit in front of nearly every high-traffic system, and the question reveals how you think about eviction policies (LRU, LFU, TTL), partitioning (consistent hashing), and handling cache misses and stampedes. This design matters in interviews because it's a building block for many other designs—URL shortener, news feed, rate limiter—and the trade-offs (write-through vs write-back, cache-aside vs read-through) apply broadly. Showing you understand cache invalidation (one of the hard problems) and how to avoid thundering herd demonstrates practical systems knowledge.

Requirements

Functional

  • Get(key) and Set(key, value, ttl?)
  • Delete(key)
  • Support TTL and eviction (LRU, LFU)
  • Atomic operations (e.g., INCR)
  • Optional: pub/sub for invalidation

Non-Functional

  • Sub-millisecond latency
  • High throughput — millions of ops/sec
  • Fault tolerance — survive node failures
  • Scalability — add nodes without full rehash

Capacity Estimation

Assume 1TB cache, 1M QPS. 1000 nodes, 1GB each. Consistent hashing for minimal reshuffling.

Architecture Diagram

ClientsCache Client LibNode 1Node 2Node 3Cluster MgrReplication

Component Deep Dive

Cache Client

Library that hashes key to node, handles connection pooling, retries. Implements consistent hashing.

Cache Node

In-memory store. LRU/LFU eviction. Supports get, set, delete, TTL. Single-threaded or multi-threaded.

Cluster Manager

Membership, failure detection. Gossip protocol. Rebalance on node add/remove.

Replication (optional)

Master-replica for durability. Async replication. Read from replica for scale.

Persistence (optional)

RDB snapshots or AOF for durability. Trade-off: performance vs durability.

Database Design

Cache is in-memory; no traditional DB. Optional: persist to disk (Redis RDB/AOF). Metadata (cluster topology) in ZooKeeper/etcd.

API Design

MethodPathDescription
GET/get?key=Get value. Returns 200 + value or 404.
PUT/setSet key-value. Body: {key, value, ttl?}. Returns 200.
DELETE/delete?key=Delete key. Returns 200.

Scalability & Trade-offs

Related System Designs