File Storage (Dropbox)

Difficulty: Hard. Category: Storage.

Overview

Designing a file storage system such as Dropbox or Google Drive tests your understanding of sync, conflict resolution, and efficient storage. The core challenges are keeping files in sync across devices, handling concurrent edits, deduplicating content at the block or file level, and handling large files smoothly. The design comes up in interviews because it combines object storage, metadata management, versioning, and real-time sync, and it forces careful thinking about consistency (eventual versus strong) and conflict resolution. Companies like Dropbox and Google run these systems at exabyte scale, so showing that you understand chunking, delta sync, and how to minimize bandwidth shows you can design storage systems for the real world.

Requirements

Functional

  • Upload, download, delete files
  • Sync across multiple devices
  • Folder structure and sharing
  • Version history and restore
  • Conflict resolution (concurrent edits)
  • Search files

Non-Functional

  • Reliability — no data loss
  • Efficiency — dedup, delta sync to save bandwidth
  • Scalability — exabytes of storage
  • Low latency — sync within seconds

Capacity Estimation

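Assume 500M users averaging 1 TB each: 500M × 1 TB = 500 EB of logical data. Assume 10M file operations per day, which is roughly 116 operations per second on average. Block-level dedup can reduce stored bytes by 30-50%, and delta sync can reduce bandwidth by 70% or more.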
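The arithmetic behind these estimates can be checked with a short script. This is a back-of-envelope sketch in decimal units; the constants restate the assumptions above, and the variable names are illustrative only.

```python
# Back-of-envelope capacity math for the stated assumptions (decimal units).
USERS = 500_000_000
BYTES_PER_USER = 10**12           # 1 TB average per user
FILE_OPS_PER_DAY = 10_000_000
SECONDS_PER_DAY = 86_400

logical_bytes = USERS * BYTES_PER_USER
logical_eb = logical_bytes / 10**18          # bytes -> exabytes

# Assumed dedup savings of 30-50% leave 50-70% of the logical bytes stored.
stored_eb_low = logical_eb * (1 - 0.50)
stored_eb_high = logical_eb * (1 - 0.30)

avg_ops_per_sec = FILE_OPS_PER_DAY / SECONDS_PER_DAY

print(f"logical data: {logical_eb:.0f} EB")
print(f"after dedup:  {stored_eb_low:.0f}-{stored_eb_high:.0f} EB")
print(f"average load: {avg_ops_per_sec:.0f} file ops/s")
```

Peak load is usually several times the average, so the per-second figure is a floor for provisioning rather than a target.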

Architecture Diagram

[Diagram components: Clients, Sync Client, Sync Service, Block Store, Metadata DB, Notification Bus]

Component Deep Dive

Sync Client

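Monitors the local file system for changes, uploads new or modified blocks, and downloads updates made elsewhere. With delta sync, only the blocks that changed are transferred. The client also detects conflicts.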
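As a minimal sketch of delta sync, the client can hash each block of the new file and compare the hashes against the previous version's block list. The fixed 4 MiB block size, SHA-256, and the function names are assumptions made for illustration, not details taken from any particular product.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed fixed block size of 4 MiB


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return the SHA-256 of each."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: list[str], new: list[str]) -> list[int]:
    """Return indexes of blocks in the new version that differ from the old."""
    return [
        i for i, digest in enumerate(new)
        if i >= len(old) or old[i] != digest
    ]


# Tiny 8-byte blocks keep the example readable.
old = block_hashes(b"a" * 8 + b"b" * 8, block_size=8)
new = block_hashes(b"a" * 8 + b"c" * 8 + b"d" * 4, block_size=8)
print(changed_blocks(old, new))  # [1, 2]: block 0 is unchanged and is not re-sent
```

With fixed-size blocks, inserting bytes near the start of a file shifts every later block boundary, so nearly all hashes change. Content-defined chunking is a common way to limit that effect, at the cost of a more complex chunker.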

Block Store

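An object store such as S3 that holds content-addressable blocks, each keyed by the hash of its contents. Identical blocks share a hash, so each is stored only once (dedup). Large files are handled by splitting them into chunks.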
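Content addressing and dedup can be illustrated with an in-memory stand-in for the object store. The `BlockStore` class and the `store_file` and `read_file` helpers below are hypothetical names for this sketch; a real deployment would write to S3 or a similar service rather than a dictionary.

```python
import hashlib


class BlockStore:
    """In-memory stand-in for a content-addressable object store."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store a block under its SHA-256; identical content is kept once."""
        key = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]

    def __len__(self) -> int:
        return len(self._blocks)


def store_file(store: BlockStore, data: bytes, block_size: int) -> list[str]:
    """Chunk a file and return the ordered block hashes that describe it."""
    return [
        store.put(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_file(store: BlockStore, refs: list[str]) -> bytes:
    """Reassemble a file from its ordered block hashes."""
    return b"".join(store.get(ref) for ref in refs)


store = BlockStore()
refs = store_file(store, b"AAAABBBBAAAA", block_size=4)
print(len(refs), len(store))  # 3 block references, but only 2 unique blocks stored
```

The file is described by three references while the store holds two blocks, which is the dedup saving in miniature.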

Metadata Service

Holds the file and folder tree, version numbers, and block references, backed by PostgreSQL or a distributed database. It tracks which blocks each file uses.

Sync Service

Orchestrates sync: it receives client updates, updates the metadata, and notifies the user's other clients over WebSocket or long polling.
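One way to handle concurrent edits when a client update arrives is an optimistic version check: the client sends the version it edited, and a stale version means another device committed first. The sketch below saves the losing update as a separate conflicted copy instead of overwriting. That is one possible policy among several, and the class, method, and path naming here are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    version: int = 0
    block_refs: list[str] = field(default_factory=list)


class SyncService:
    """Toy commit path: accept an update only if it was based on the latest version."""

    def __init__(self) -> None:
        self.files: dict[str, FileRecord] = {}

    def commit(self, path: str, base_version: int,
               block_refs: list[str], device: str) -> tuple[str, int]:
        """Return (path_written, new_version) for the committed update."""
        current = self.files.get(path)
        current_version = current.version if current else 0
        if base_version != current_version:
            # Stale edit: keep both versions by writing to a conflicted copy.
            path = f"{path} (conflicted copy from {device})"
        record = self.files.setdefault(path, FileRecord())
        record.version += 1
        record.block_refs = list(block_refs)
        return path, record.version


svc = SyncService()
svc.commit("/a.txt", 0, ["h1"], "laptop")          # creates version 1
svc.commit("/a.txt", 1, ["h2"], "laptop")          # version 2
print(svc.commit("/a.txt", 1, ["h3"], "phone"))    # phone edited stale version 1
```

The phone's update lands in a new file, so neither device's work is lost and the user decides how to merge.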

Notification Service

Tells clients when a file has changed on another device, which enables real-time sync.
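A simple way for clients to learn what changed is a per-user change log read with a cursor, matching the "changes since cursor" idea behind the sync endpoint. The following is a minimal in-memory sketch; `ChangeLog`, `Change`, and the use of a sequence number as the cursor are assumptions, and a real service would persist the log and push a wake-up over WebSocket or long poll.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    seq: int
    path: str
    version: int


class ChangeLog:
    """Append-only log; a client's cursor is the last sequence number it has seen."""

    def __init__(self) -> None:
        self._changes: list[Change] = []

    def append(self, path: str, version: int) -> int:
        """Record a change and return its sequence number (starting at 1)."""
        seq = len(self._changes) + 1
        self._changes.append(Change(seq, path, version))
        return seq

    def since(self, cursor: int) -> tuple[list[Change], int]:
        """Return changes after the cursor and the cursor to send next time."""
        pending = self._changes[max(cursor, 0):]
        new_cursor = pending[-1].seq if pending else cursor
        return pending, new_cursor


log = ChangeLog()
log.append("/a.txt", 1)
log.append("/b.txt", 1)
changes, cursor = log.since(0)      # a new device starts from cursor 0
print([c.path for c in changes], cursor)
```

Because the cursor only moves forward, a client that reconnects after being offline can resume from its last cursor without missing or replaying changes.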

Version Service

Stores file versions as block references plus metadata, which makes it possible to restore an earlier version.
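Because a version is just a list of block references, restoring can be modeled as appending an old block list as a new version, which keeps the history intact. This is a sketch under that assumption; `VersionStore` and its methods are hypothetical names.

```python
class VersionStore:
    """Keeps every version of a file as an immutable list of block hashes."""

    def __init__(self) -> None:
        self._history: dict[str, list[list[str]]] = {}

    def save(self, path: str, block_refs: list[str]) -> int:
        """Append a version and return its number (versions start at 1)."""
        versions = self._history.setdefault(path, [])
        versions.append(list(block_refs))
        return len(versions)

    def restore(self, path: str, version: int) -> int:
        """Restore by re-saving an old block list as the newest version."""
        versions = self._history[path]
        if not 1 <= version <= len(versions):
            raise ValueError(f"no version {version} for {path}")
        return self.save(path, versions[version - 1])

    def latest(self, path: str) -> list[str]:
        return list(self._history[path][-1])


vs = VersionStore()
vs.save("/a.txt", ["h1", "h2"])     # version 1
vs.save("/a.txt", ["h1", "h3"])     # version 2
print(vs.restore("/a.txt", 1))      # creates version 3 with version 1's blocks
print(vs.latest("/a.txt"))
```

No block data is copied on restore. Only references move, so old blocks must be retained for as long as any version points at them.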

Database Design

Metadata: file_id, path, user_id, block_refs[], version, modified_at.
Blocks: hash → storage_url, deduplicated via content hash.
Storage: PostgreSQL for metadata; object store for blocks.
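The schema can be sketched with Python's built-in sqlite3 module standing in for PostgreSQL. The table layout is an assumption: block_refs[] is modeled here as a `file_blocks` join table with an explicit `block_index` for ordering, and the `s3://` URLs are placeholder values.

```python
import sqlite3

SCHEMA = """
CREATE TABLE files (
    file_id     INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL,
    path        TEXT    NOT NULL,
    version     INTEGER NOT NULL,
    modified_at TEXT    NOT NULL,
    UNIQUE (user_id, path)
);
CREATE TABLE blocks (
    hash        TEXT PRIMARY KEY,   -- content hash doubles as the dedup key
    storage_url TEXT NOT NULL
);
CREATE TABLE file_blocks (
    file_id     INTEGER NOT NULL REFERENCES files(file_id),
    block_index INTEGER NOT NULL,   -- position of the block within the file
    hash        TEXT    NOT NULL REFERENCES blocks(hash),
    PRIMARY KEY (file_id, block_index)
);
"""


def block_urls(conn: sqlite3.Connection, user_id: int, path: str) -> list[str]:
    """Return the storage URLs for a file's blocks, in file order."""
    rows = conn.execute(
        """
        SELECT b.storage_url
        FROM files f
        JOIN file_blocks fb ON fb.file_id = f.file_id
        JOIN blocks b ON b.hash = fb.hash
        WHERE f.user_id = ? AND f.path = ?
        ORDER BY fb.block_index
        """,
        (user_id, path),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO files VALUES (1, 42, '/notes.txt', 1, '2024-01-01T00:00:00Z')"
)
conn.executemany(
    "INSERT INTO blocks VALUES (?, ?)",
    [("h1", "s3://blocks/h1"), ("h2", "s3://blocks/h2")],
)
# Block h1 appears twice in the file but exists once in the blocks table.
conn.executemany(
    "INSERT INTO file_blocks VALUES (1, ?, ?)",
    [(0, "h1"), (1, "h2"), (2, "h1")],
)
print(block_urls(conn, 42, "/notes.txt"))
```

In PostgreSQL the block list could instead be an array column on the file row. The join table makes it easier to find which files reference a given block, which matters when deciding whether a block can be garbage collected.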

API Design

Method  Path                  Description
POST    /api/files/upload     Upload file blocks. Body: {path, blocks[]}. Returns version.
GET     /api/files/download   Download file. Query: path. Returns block URLs.
GET     /api/sync             Get changes since cursor. Long poll or WebSocket.
POST    /api/files/restore    Restore previous version.

Scalability & Trade-offs
