Search Autocomplete (Typeahead)
Overview
Search Autocomplete (Typeahead) is a popular system design question because it combines data structures (Trie), real-time ranking, and low-latency requirements. Companies like Google, Amazon, and LinkedIn rely on autocomplete to improve search UX, and the backend must return relevant suggestions within 50-100ms. The question tests your knowledge of Tries, caching, and how to rank suggestions (popularity, recency, personalization). It matters in interviews because it's a well-scoped problem that still has depth—handling millions of queries per second, updating the Trie with new trends, and supporting multiple languages and typo tolerance.
Requirements
Functional
- Return top-K suggestions as user types
- Rank by popularity, recency, or personalization
- Support multiple languages
- Handle typos (fuzzy matching optional)
- Update suggestions based on search trends
Non-Functional
- Low latency — <50ms p99
- High throughput — millions of QPS
- Freshness — reflect trending queries within minutes
Capacity Estimation
Assume 10M unique queries, 100K QPS. Trie size ~1GB. Cache hit rate 90%+.
Architecture Diagram
Component Deep Dive
API Gateway
Receives prefix, returns suggestions. May aggregate from multiple backends.
Trie Service
Stores prefix → top-K completions. In-memory Trie with cached top-K at each node. Sharded by first character.
Aggregation Service
Merges results from Trie, trending, personalization. Ranks and returns top-K.
Cache
Redis. Caches prefix → suggestions. High hit rate for common prefixes.
Offline Pipeline
Builds Trie from search logs. Computes popularity, updates Trie periodically.
Query Log Aggregator
Collects search queries, feeds into offline pipeline for trending.
Database Design
Trie built offline, loaded into memory. Query logs in Kafka → batch processed. Redis for cache. No traditional DB for real-time path; optional DB for metadata.
API Design
| Method | Path | Description |
|---|---|---|
GET | /api/suggest?q={prefix}&limit={k} | Get autocomplete suggestions. Returns {suggestions: []}. |
POST | /api/feedback | Internal: Log search query for analytics (user selected suggestion). |
Scalability & Trade-offs
- Trie vs n-gram: Trie supports prefix efficiently; n-gram supports fuzzy matching. Hybrid possible.
- Real-time vs batch: Real-time updates add complexity; batch updates (e.g., hourly) are simpler.
- Personalization: Per-user Trie improves relevance but increases memory and complexity.
Related System Designs
URL Shortener (TinyURL)
The URL Shortener (TinyURL-style) system design is a classic interview question that tests your understanding of distrib...
InfrastructureRate Limiter
A Rate Limiter system design question tests your understanding of distributed systems, consistency, and real-time decisi...
StorageKey-Value Store
Designing a distributed Key-Value Store (like Redis or DynamoDB) is a staple system design question at companies buildin...