The Scaling Mindset
How to think about systems that grow from 100 to 100 million users. The mental models behind scale.
1. The Restaurant Analogy
- 10 customers/day: You cook, serve, and clean; one person handles everything
- 100 customers/day: Hire waiters, a cook. Specialize roles
- 1,000 customers/day: Multiple cooks, bigger kitchen, reservation system
- 10,000 customers/day: Open more locations, central supply chain
Each 10x growth requires rethinking architecture, not just working harder.
2. Evolution of Architecture at Scale
How the architecture evolves as the user count grows:
Single Server (0 - 1,000 users)
Starting Phase
Separate Database (1K - 100K users)
Growth Phase
- Separate DB allows independent scaling
- Can upgrade app server without touching data
- Add caching layer (Redis) between app and DB
Load Balancing (100K - 1M users)
Scale Phase
- Multiple app servers behind load balancer
- App servers must be stateless (session in Redis)
- Database becomes the bottleneck
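The load-balancing stage above can be sketched in a few lines. This is a minimal illustration, not a production balancer: `AppServer`, `RoundRobinBalancer`, and the in-memory `SESSION_STORE` dictionary are hypothetical names, and the dictionary stands in for an external store such as Redis. It shows why statelessness matters: any server can continue a session because the state lives outside the servers.

```python
import itertools

# Hypothetical shared session store; a stand-in for Redis or similar.
SESSION_STORE = {}


class AppServer:
    def __init__(self, name):
        self.name = name

    def handle(self, session_id):
        # No per-server state: the visit counter lives in the shared store.
        count = SESSION_STORE.get(session_id, 0) + 1
        SESSION_STORE[session_id] = count
        return self.name, count


class RoundRobinBalancer:
    def __init__(self, servers):
        # cycle() yields the servers in order, forever.
        self._cycle = itertools.cycle(servers)

    def route(self, session_id):
        return next(self._cycle).handle(session_id)


balancer = RoundRobinBalancer([AppServer("app-1"), AppServer("app-2")])
results = [balancer.route("alice") for _ in range(3)]
```

The same session is served by alternating servers, yet the counter keeps increasing, which would not hold if each server kept sessions in local memory.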
Database Scaling (1M - 10M users)
Advanced Scale
- Read replicas for read-heavy workloads
- A caching layer can cut DB read load by 90% or more on read-heavy workloads
- Consider sharding for write scaling
- Async processing with message queues
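As a rough sketch of the read-replica idea above, a router can send writes to the primary and spread reads across replicas. The `RoutingDB` class and the connection names are hypothetical, and detecting reads by a leading `SELECT` is a deliberate simplification.

```python
import itertools


class RoutingDB:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def pick(self, sql):
        # Naive read detection; real routers also consider transactions.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


# Strings stand in for real database connections.
db = RoutingDB("primary", ["replica-1", "replica-2"])
queries = ["SELECT * FROM users", "UPDATE users SET name = 'x'", "select 1"]
targets = [db.pick(q) for q in queries]
```

With asynchronous replication, a replica may briefly lag behind the primary, so reads that must see a just-written value are usually sent to the primary instead.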
Microservices (10M+ users)
Enterprise Scale
- Independent services with own databases
- Teams can deploy independently
- Event-driven communication
- Requires strong DevOps culture
3. Key Scaling Principles
At any point, ONE thing is the limiting factor. Find it, fix it, find the next one.
Example: CPU maxed? Add servers. DB slow? Add cache or replicas.
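One way to picture the "find it, fix it, find the next one" loop: treat the system as a chain of stages, where throughput is capped by the stage with the lowest capacity. The stage names and requests-per-second figures below are made-up illustrations, not measurements.

```python
def bottleneck(capacities):
    """Return (stage, capacity) for the stage with the lowest capacity."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


# Hypothetical capacities in requests per second.
capacity_rps = {"load_balancer": 50_000, "app_servers": 8_000, "database": 1_200}

first = bottleneck(capacity_rps)

# Suppose a cache or replicas raise effective database capacity.
capacity_rps["database"] = 12_000
second = bottleneck(capacity_rps)
```

Fixing the database does not remove the limit; it moves it to the app servers, which become the next thing to address.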
Any server can handle any request. Store state externally.
Example: Sessions in Redis, files in S3, not on local disk.
The fastest query is the one you don't make.
Example: 90% of reads can often be served from cache.
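A common way to apply this is the cache-aside pattern: check the cache first and query the database only on a miss. The sketch below uses a plain dictionary as the cache and a hypothetical `slow_lookup` function in place of a real query; it omits expiry and invalidation, which a real cache would need.

```python
class CacheAside:
    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Miss: fall through to the database, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._cache[key] = value
        return value


db_calls = []


def slow_lookup(user_id):
    # Hypothetical stand-in for a database query.
    db_calls.append(user_id)
    return {"id": user_id}


cache = CacheAside(slow_lookup)
for uid in [1, 2, 1, 1, 2, 1, 1, 2, 1, 1]:
    cache.get(uid)

hit_rate = cache.hits / (cache.hits + cache.misses)
```

In this toy run, ten reads reach the database only twice. Real hit rates depend on how skewed the access pattern is and how often the data changes.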
If it doesn't need to happen NOW, queue it.
Example: Emails, notifications, analytics: all can be async.
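The queueing idea can be sketched with the standard library alone. Here an in-process `queue.Queue` and a worker thread stand in for a real message broker and a separate worker service; `handle_signup` and the email addresses are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
sent = []


def worker():
    # Drain jobs in the background until a None sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        sent.append(f"email to {job}")
        jobs.task_done()


def handle_signup(email):
    # Request path: enqueue the slow work and respond right away.
    jobs.put(email)
    return "202 Accepted"


t = threading.Thread(target=worker, daemon=True)
t.start()

responses = [handle_signup(e) for e in ["a@example.com", "b@example.com"]]

jobs.put(None)  # tell the worker to stop
jobs.join()     # wait until every queued job is marked done
t.join()
```

The request handler never waits for the email to be sent. The trade-off is that failures now surface in the worker, so retries and monitoring have to live there.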
When one DB isn't enough, split by user_id or region.
Example: Users A-M on shard 1, N-Z on shard 2.
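The routing step might look like the sketch below. `shard_by_range` follows the A-M / N-Z example directly; `shard_by_hash` is an alternative, not something the text prescribes, that hashes the user ID to spread keys more evenly. Both function names are hypothetical.

```python
import hashlib


def shard_by_range(username):
    """Range sharding: A-M on shard 1, everything else on shard 2."""
    first = username[0].upper()
    # Names that do not start with A-M (including non-letters) go to shard 2.
    return 1 if "A" <= first <= "M" else 2


def shard_by_hash(user_id, num_shards):
    """Hash sharding: map a user ID to a shard index in [0, num_shards)."""
    # sha256 is stable across processes, unlike built-in hash() on strings.
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


range_shards = [shard_by_range(n) for n in ["alice", "mike", "nina", "Zed"]]
hash_shard = shard_by_hash(12345, 4)
```

Range sharding is easy to reason about but can become lopsided if names cluster. Hash sharding balances better, but changing `num_shards` remaps most keys, which is why resharding needs planning.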
Instrument everything. Data-driven decisions only.
Example: Don't optimize code that only runs 0.1% of the time.
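Amdahl's law puts a number on this. The sketch below assumes "runs 0.1% of the time" means the code accounts for 0.1% of total runtime; the 60% hot path is a made-up comparison.

```python
def max_speedup(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of total runtime
    is made `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Making a 0.1% slice 100x faster barely moves the total.
cold = max_speedup(0.001, 100)

# Making a 60% hot path only 2x faster helps far more.
hot = max_speedup(0.60, 2)
```

Here `cold` is about 1.001 and `hot` is about 1.43, so measuring where the time goes comes before deciding what to optimize.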
4. Common Bottlenecks & Solutions
5. The 10x Rule
Design for 10x, Not 100x
6. Key Takeaways
7. Interview Follow-up Questions
8. Test Your Understanding
Which is typically the FIRST bottleneck as a simple web application grows?
What is the main prerequisite for horizontal scaling of application servers?
You add a caching layer and database load drops from 90% to 30%. What's the likely new bottleneck?
What's wrong with designing for 100x scale from day one?
Which statement about the scaling evolution is TRUE?