Search Autocomplete (Typeahead)

Overview

Search Autocomplete (Typeahead) is a popular system design question because it combines data structures (Trie), real-time ranking, and low-latency requirements. Companies like Google, Amazon, and LinkedIn rely on autocomplete to improve search UX, and the backend must return relevant suggestions within 50-100ms. The question tests your knowledge of Tries, caching, and how to rank suggestions (popularity, recency, personalization). It matters in interviews because it's a well-scoped problem that still has depth—handling millions of queries per second, updating the Trie with new trends, and supporting multiple languages and typo tolerance.

Requirements

Functional

Return top-K suggestions as user types
Rank by popularity, recency, or personalization
Support multiple languages
Handle typos (fuzzy matching optional)
Update suggestions based on search trends

Non-Functional

Low latency — <50ms p99
High throughput — millions of QPS
Freshness — reflect trending queries within minutes

Capacity Estimation

Assume 10M unique queries, 100K QPS. Trie size ~1GB. Cache hit rate 90%+.

Architecture Diagram

Component Deep Dive

API Gateway

Receives prefix, returns suggestions. May aggregate from multiple backends.

Trie Service

Stores prefix → top-K completions. In-memory Trie with cached top-K at each node. Sharded by first character.

Aggregation Service

Merges results from Trie, trending, personalization. Ranks and returns top-K.

Cache

Redis. Caches prefix → suggestions. High hit rate for common prefixes.

Offline Pipeline

Builds Trie from search logs. Computes popularity, updates Trie periodically.

Query Log Aggregator

Collects search queries, feeds into offline pipeline for trending.

Database Design

Trie built offline, loaded into memory. Query logs in Kafka → batch processed. Redis for cache. No traditional DB for real-time path; optional DB for metadata.

API Design

Method	Path	Description
`GET`	/api/suggest?q={prefix}&limit={k}	Get autocomplete suggestions. Returns {suggestions: []}.
`POST`	/api/feedback	Internal: Log search query for analytics (user selected suggestion).

Scalability & Trade-offs

Trie vs n-gram: Trie supports prefix efficiently; n-gram supports fuzzy matching. Hybrid possible.
Real-time vs batch: Real-time updates add complexity; batch updates (e.g., hourly) are simpler.
Personalization: Per-user Trie improves relevance but increases memory and complexity.

Related System Designs

Infrastructure