Distributed Cache
Overview
Designing a Distributed Cache (like Memcached or Redis Cluster) tests your understanding of caching strategies, consistency, and fault tolerance at scale. Caches sit in front of nearly every high-traffic system, and the question reveals how you think about eviction policies (LRU, LFU, TTL), partitioning (consistent hashing), and handling cache misses and stampedes. This design matters in interviews because it's a building block for many other designs—URL shortener, news feed, rate limiter—and the trade-offs (write-through vs write-back, cache-aside vs read-through) apply broadly. Showing you understand cache invalidation (famously one of the two hard problems in computer science) and how to avoid a thundering herd demonstrates practical systems knowledge.
Requirements
Functional
- Get(key) and Set(key, value, ttl?)
- Delete(key)
- Support TTL and eviction (LRU, LFU)
- Atomic operations (e.g., INCR)
- Optional: pub/sub for invalidation
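Taken together, the functional requirements amount to a small key-value API. A minimal single-node sketch in Python (class and method names are illustrative, not any particular product's API):

```python
import time

class TTLCache:
    """Minimal single-node cache covering the functional requirements above."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._store[key] = (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # lazy expiration on read
            return None
        return value

    def delete(self, key):
        self._store.pop(key, None)

    def incr(self, key, delta=1):
        # Atomic enough for this single-process sketch; a real server
        # serializes commands on one event loop (as Redis does) or locks.
        value, expires_at = self._store.get(key, (0, None))
        value += delta
        self._store[key] = (value, expires_at)
        return value
```

Real caches expire keys both lazily (as above) and via periodic background sweeps, so memory from dead keys is reclaimed even if they are never read again.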
Non-Functional
- Sub-millisecond latency
- High throughput — millions of ops/sec
- Fault tolerance — survive node failures
- Scalability — add nodes without full rehash
Capacity Estimation
Assume 1 TB of cached data and 1 M QPS. With ~1 GB of usable memory per node, that's roughly 1,000 nodes, and each node serves only ~1,000 ops/sec on average. Use consistent hashing so that adding or removing a node reshuffles only a small fraction of keys.
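The back-of-envelope math, in code (binary units assumed, so the node count comes out at 1,024 rather than an even 1,000):

```python
total_cache_bytes = 1 * 1024**4   # 1 TB of cached data (binary units)
node_memory_bytes = 1 * 1024**3   # 1 GB usable per node (assumed)

nodes = total_cache_bytes // node_memory_bytes   # 1024, i.e. ~1,000 nodes

qps_total = 1_000_000
qps_per_node = qps_total / nodes                 # under 1,000 ops/sec/node

print(nodes, round(qps_per_node))
```

Each node is lightly loaded at these numbers; in practice the sizing is usually driven by memory footprint and failure-domain isolation rather than CPU.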
Architecture Diagram
Component Deep Dive
Cache Client
Library that hashes each key to a node, handles connection pooling and retries, and implements consistent hashing.
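A minimal consistent-hash ring as the client might implement it (the node names, the use of MD5, and the virtual-node count are all illustrative choices):

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring with virtual nodes for a cache client."""

    def __init__(self, nodes, vnodes=100):
        # Place vnodes points per physical node on the ring.
        points = []
        for node in nodes:
            for i in range(vnodes):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # First virtual node clockwise from the key's position (wraps around).
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]
```

Removing a node only reassigns the keys that lived on that node's virtual points; everything else keeps its mapping, which is the property that makes scaling cheap.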
Cache Node
In-memory store with LRU/LFU eviction. Supports get, set, delete, and TTL. Can be single-threaded (Redis-style: simple, no locking) or multi-threaded (Memcached-style: scales across cores).
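LRU eviction is commonly built on an ordered map, giving O(1) get and set. A sketch using Python's `OrderedDict` (the capacity is illustrative):

```python
from collections import OrderedDict

class LRUCache:
    """Single-node store with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
```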
Cluster Manager
Tracks membership and detects failures, typically via a gossip protocol. Triggers rebalancing when nodes are added or removed.
Replication (optional)
Primary-replica replication for availability. Asynchronous replication keeps writes fast but can lose recent writes on failover. Replicas can also serve reads to scale read throughput.
Persistence (optional)
Periodic snapshots (RDB-style) or an append-only file (AOF) for durability. Trade-off: snapshot frequency and fsync policy balance write performance against how much data a crash can lose.
Database Design
Cache is in-memory; no traditional DB. Optional: persist to disk (Redis RDB/AOF). Metadata (cluster topology) in ZooKeeper/etcd.
API Design
| Method | Path | Description |
|---|---|---|
| GET | /get?key= | Get value. Returns 200 + value, or 404 on miss. |
| PUT | /set | Set key-value. Body: {key, value, ttl?}. Returns 200. |
| DELETE | /delete?key= | Delete key. Returns 200. |
Scalability & Trade-offs
- Write-through vs write-back: Write-through is simpler, always consistent; write-back is faster but risks data loss.
- Cache-aside vs read-through: Cache-aside gives app control; read-through simplifies app logic.
- Consistent hashing: Minimizes data movement on scale; virtual nodes reduce load imbalance.
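Cache-aside combined with per-key single-flight locking also addresses the thundering-herd problem from the overview: on a miss, only one caller loads from the database while the rest wait and reread the cache. A sketch, assuming `cache` exposes get/set and `db_load(key)` is the backing-store read (both names are placeholders):

```python
import threading

_locks = {}                       # per-key locks for single-flight misses
_locks_guard = threading.Lock()   # protects the _locks dict itself

def cache_aside_get(cache, db_load, key, ttl=300):
    """Cache-aside read with stampede protection."""
    value = cache.get(key)
    if value is not None:
        return value              # hit: no lock needed
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:
        value = cache.get(key)    # another caller may have filled it
        if value is None:
            value = db_load(key)          # only one caller reaches the DB
            cache.set(key, value, ttl)
    return value
```

Other common stampede defenses include jittered TTLs (so hot keys don't all expire at once) and serving slightly stale data while one request refreshes in the background.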