Module 11 - Interview Prep
Estimation Techniques
Back-of-envelope math to size your system, quickly and confidently.
1. Why Estimation Matters
Simple Analogy
Before building a bridge, engineers estimate the load it needs to carry. You wouldn't design a footbridge and then learn it needs to carry 18-wheelers. System design is the same: estimate traffic, storage, and bandwidth before choosing an architecture.
Back-of-envelope estimation uses rough approximations to determine system scale. Precision isn't the goal; order-of-magnitude accuracy (within a factor of 10) is.
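To make "within a factor of 10" concrete, here is a minimal sketch of a check you could run against a later measurement. The function name and the "measured" figures are hypothetical and chosen only for illustration.

```python
import math

def within_order_of_magnitude(estimate: float, actual: float) -> bool:
    """Return True if estimate and actual differ by less than a factor of 10."""
    return abs(math.log10(estimate / actual)) < 1.0

# Hypothetical: estimated 3,500 QPS, later measured 5,000 QPS. Close enough.
print(within_order_of_magnitude(3_500, 5_000))   # True
# Hypothetical: estimated 3,500 QPS, later measured 50,000 QPS. Off by more than 10x.
print(within_order_of_magnitude(3_500, 50_000))  # False
```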
2. Numbers to Memorize

| Power of 2 | Exact Value | Approximation |
|---|---|---|
| 2^10 | 1,024 | ~1 Thousand (1 KB) |
| 2^20 | 1,048,576 | ~1 Million (1 MB) |
| 2^30 | 1,073,741,824 | ~1 Billion (1 GB) |
| 2^40 | 1,099,511,627,776 | ~1 Trillion (1 TB) |

| Time | Seconds | Approximation |
|---|---|---|
| 1 day | 86,400 | ~100,000 (10^5) |
| 1 month | ~2.6M | ~2.5 × 10^6 |
| 1 year | ~31.5M | ~30 × 10^6 |
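The values in these tables can be re-derived in a few lines, which also shows how far each power of 2 drifts from its power-of-10 shorthand. This is a small sketch; the 30-day month is an assumption made to match the ~2.6M figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60             # 86,400
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY   # 2,592,000 (assumes a 30-day month)
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY   # 31,536,000

print(f"day={SECONDS_PER_DAY:,} month={SECONDS_PER_MONTH:,} year={SECONDS_PER_YEAR:,}")

# Compare each power of 2 with the power of 10 used as its approximation.
for exp, unit in [(10, "KB"), (20, "MB"), (30, "GB"), (40, "TB")]:
    exact = 2 ** exp
    power_of_ten = 3 * exp // 10           # 2^10 ~ 10^3, 2^20 ~ 10^6, ...
    approx = 10 ** power_of_ten
    error_pct = (exact - approx) / approx * 100
    print(f"2^{exp} = {exact:,} (~1 {unit}), {error_pct:.1f}% above 10^{power_of_ten}")
```

The gap grows with each step, but it stays far inside the order-of-magnitude tolerance that estimation needs.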
3. The Estimation Framework

1. Start with Users: DAU (daily active users) is your foundation. Everything derives from this.
2. Actions per User: How many reads/writes per user per day? Be specific.
3. Calculate QPS: QPS = (DAU × actions/user) / 86,400. Peak = 2-3x average.
4. Storage per Action: How much data per tweet, image, message? Include metadata.
5. Total Storage: Storage/year = actions/day × 365 × size/action
6. Bandwidth: Bandwidth = QPS × response size (for reads)
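The six steps translate directly into a few helper functions. The sketch below is one possible way to encode them; the function names and the example service (10M DAU, 2 writes and 20 reads per user per day, 1 KB per write, 10 KB per response) are hypothetical, and decimal units (1 KB = 1,000 bytes) are assumed for simplicity.

```python
SECONDS_PER_DAY = 86_400

def average_qps(dau: int, actions_per_user: float) -> float:
    """Step 3: QPS = (DAU x actions/user) / 86,400."""
    return dau * actions_per_user / SECONDS_PER_DAY

def peak_qps(avg_qps: float, peak_factor: float = 3.0) -> float:
    """Step 3: peak is taken as a multiple of average (2-3x; 3x assumed here)."""
    return avg_qps * peak_factor

def storage_per_year_bytes(actions_per_day: float, bytes_per_action: float) -> float:
    """Step 5: storage/year = actions/day x 365 x size/action."""
    return actions_per_day * 365 * bytes_per_action

def read_bandwidth_bytes_per_sec(read_qps: float, response_bytes: float) -> float:
    """Step 6: bandwidth = read QPS x response size."""
    return read_qps * response_bytes

# Steps 1, 2, and 4: hypothetical inputs for an illustrative service.
dau = 10_000_000
writes_per_user, reads_per_user = 2, 20
write_bytes, response_bytes = 1_000, 10_000

write_qps = average_qps(dau, writes_per_user)
read_qps = average_qps(dau, reads_per_user)
storage = storage_per_year_bytes(dau * writes_per_user, write_bytes)
bandwidth = read_bandwidth_bytes_per_sec(read_qps, response_bytes)

print(f"write QPS ~{write_qps:,.0f} (peak ~{peak_qps(write_qps):,.0f})")
print(f"read QPS  ~{read_qps:,.0f} (peak ~{peak_qps(read_qps):,.0f})")
print(f"storage   ~{storage / 1e12:.1f} TB/year")
print(f"bandwidth ~{bandwidth / 1e6:.1f} MB/s")
```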
4. Worked Example: Twitter

Given: 500M DAU, 300M tweets/day. Assumed: 10 feed views per user per day, peak = 3x average.

| Quantity | Calculation | Result |
|---|---|---|
| Write QPS (tweets) | 300M / 86,400 | ≈ 3,500 QPS; peak: ~10K QPS |
| Read QPS (feed views) | 500M users × 10 views/day / 86,400 | ~60K QPS; peak: ~180K QPS |
| Tweet size | 280 chars + metadata | ≈ 500 bytes/tweet |
| Daily storage (text) | 300M tweets × 500 bytes | ~150 GB/day |
| Yearly storage (text) | 150 GB × 365 | ~55 TB/year |
| Yearly storage (images) | 50% of tweets have an image, avg 500 KB: 150M × 500 KB × 365 | + ~27 PB/year |
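The arithmetic above can be reproduced in a short script, which is a handy way to check the rounding. Inputs come from the example; the constant names are illustrative, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
SECONDS_PER_DAY = 86_400

DAU = 500_000_000
TWEETS_PER_DAY = 300_000_000
FEED_VIEWS_PER_USER = 10      # assumed in the example
TWEET_BYTES = 500             # 280 chars + metadata
IMAGE_FRACTION = 0.5          # 50% of tweets carry an image
IMAGE_BYTES = 500_000         # 500 KB average
PEAK_FACTOR = 3

write_qps = TWEETS_PER_DAY / SECONDS_PER_DAY
read_qps = DAU * FEED_VIEWS_PER_USER / SECONDS_PER_DAY
daily_text_bytes = TWEETS_PER_DAY * TWEET_BYTES
yearly_text_bytes = daily_text_bytes * 365
yearly_image_bytes = TWEETS_PER_DAY * IMAGE_FRACTION * IMAGE_BYTES * 365

print(f"write QPS   ~{write_qps:,.0f} (peak ~{write_qps * PEAK_FACTOR:,.0f})")
print(f"read QPS    ~{read_qps:,.0f} (peak ~{read_qps * PEAK_FACTOR:,.0f})")
print(f"text/day    ~{daily_text_bytes / 1e9:.0f} GB")
print(f"text/year   ~{yearly_text_bytes / 1e12:.1f} TB")
print(f"images/year ~{yearly_image_bytes / 1e15:.1f} PB")
```

Images outweigh text by roughly 500x here, so the media estimate drives the storage design.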
5. Common Gotchas

- Forgetting Peak Load: Average is not enough. Peak is typically 2-3x average and can reach 10x for spiky traffic.
- Ignoring Read/Write Ratio: Many systems are read-heavy, often around 100:1 reads to writes. Design for the dominant pattern.
- Forgetting Replication: 3x replication = 3x storage. Factor this into estimates.
- Not Including Metadata: A 'tweet' is not just text. Include user info, timestamps, IDs.
- Ignoring Growth: Size for 3-5 years of growth, not just today's scale.
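Replication and growth compound, so it helps to apply them together. The sketch below starts from a raw first-year figure of 55 TB (the text storage from the Twitter example) and applies 3x replication over a 5-year horizon. The 30% annual growth rate is purely an assumption for illustration, as is the function name.

```python
def provisioned_storage_tb(first_year_tb: float, years: int,
                           annual_growth: float, replication_factor: int) -> float:
    """Total storage to provision: yearly data grows each year, then is replicated."""
    raw_tb = sum(first_year_tb * (1 + annual_growth) ** year for year in range(years))
    return raw_tb * replication_factor

# One year, no growth: replication alone triples the requirement.
print(provisioned_storage_tb(55, years=1, annual_growth=0.0, replication_factor=3))

# Five years at an assumed 30% annual growth with 3x replication.
total = provisioned_storage_tb(55, years=5, annual_growth=0.30, replication_factor=3)
print(f"~{total / 1000:.1f} PB over 5 years")
```

Under these assumptions the 55 TB/year headline becomes roughly 1.5 PB of provisioned capacity, which is why both factors belong in the estimate.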
6. Quick Reference Cheat Sheet

Text
- Tweet: ~500 bytes
- Message: ~1 KB
- Email: ~50 KB
- Article: ~100 KB

Media
- Profile pic: ~50 KB
- Photo: ~500 KB - 2 MB
- Video: ~50 MB/min
- Video (compressed): ~5 MB/min

Server Capacity
- Web server: ~10K QPS
- DB (writes): ~1K QPS
- DB (reads): ~10K QPS
- Redis: ~100K QPS

Network
- Intra-datacenter: 1 Gbps+
- Internet: ~100 Mbps per user
- CDN: effectively unlimited
- Mobile: 10-50 Mbps
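The server capacity figures turn a peak QPS estimate into a rough machine count. Here is a minimal sketch using the peak numbers from the Twitter example; the helper name is hypothetical, and the result ignores headroom and redundancy, which a real deployment would add.

```python
import math

# Rough per-node capacities from the cheat sheet (QPS).
WEB_SERVER_QPS = 10_000
DB_WRITE_QPS = 1_000

def servers_needed(peak_qps: float, qps_per_server: float) -> int:
    """Smallest whole number of servers that can absorb the peak load."""
    return math.ceil(peak_qps / qps_per_server)

# Twitter example: ~180K peak read QPS, ~10K peak write QPS.
print(servers_needed(180_000, WEB_SERVER_QPS))  # 18
print(servers_needed(10_000, DB_WRITE_QPS))     # 10
```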
7. Key Takeaways

1. Memorize powers of 2 and time conversions (86,400 sec/day)
2. Start with DAU → actions/user → QPS → storage → bandwidth
3. Peak = 2-3x average. Design for peak, not average.
4. Show your work. Process matters more than exact numbers.
5. Aim for order of magnitude accuracy, not precision.

Quiz
1. 500M DAU, each user sends 2 messages/day. Average QPS?
2. 100M tweets/day, 500 bytes each. Daily storage?