Mission Compile - System Design Learning Platform

1. The Restaurant Analogy

Simple Analogy

Imagine you run a restaurant:

10 customers/day: You cook, serve, and clean. One person handles everything.
100 customers/day: Hire waiters and a cook. Specialize roles.
1,000 customers/day: Multiple cooks, a bigger kitchen, a reservation system.
10,000 customers/day: Open more locations with a central supply chain.

Each 10x growth requires rethinking architecture, not just working harder.

2. Evolution of Architecture at Scale

Watch how architecture evolves as user count grows:

Stage 1: Single Server (0 - 1,000 users)

Starting Phase

Users → Single Server (App + DB)

✓ Simple to build & deploy

✓ Easy to debug

✗ Single point of failure

Stage 2: Separate Database (1K - 100K users)

Growth Phase

Users → App Server → Database

Key Changes:

• Separate DB allows independent scaling
• Can upgrade app server without touching data
• Add caching layer (Redis) between app and DB

Stage 3: Load Balancing (100K - 1M users)

Scale Phase

Users → Load Balancer → App 1, App 2, App 3 → Primary DB

Key Changes:

• Multiple app servers behind load balancer
• App servers must be stateless (session in Redis)
• Database becomes the bottleneck
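A minimal sketch of the idea, assuming the simplest possible policy (round-robin) and hypothetical backend names; real load balancers also do health checks and connection tracking. Rotating blindly like this is only safe because the app servers are stateless.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation (hypothetical helper)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        # Each call returns the next backend in the rotation.
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.pick() for _ in range(6)]
print(picks)
```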

Stage 4: Database Scaling (1M - 10M users)

Advanced Scale

LB → Apps → Cache (Redis) → Primary DB + read replicas R1, R2

Key Changes:

• Read replicas for read-heavy workloads
• Caching layer can cut DB read load sharply (often 90%+ for read-heavy workloads)
• Consider sharding for write scaling
• Async processing with message queues
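One way the read-replica split could look in application code is sketched below. The `ReplicaRouter` class and the node names are illustrative assumptions, and the SELECT check is deliberately naive; production setups usually rely on a driver or proxy for this.

```python
import itertools


class ReplicaRouter:
    """Routes writes to the primary and reads across replicas (illustrative)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReplicaRouter("primary", ["r1", "r2"])
print(router.route("SELECT * FROM users"))
print(router.route("UPDATE users SET name = 'a'"))
```

Reads served from a replica can be slightly stale because of replication lag, so reads that must see a just-written value are typically sent to the primary.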

Stage 5: Microservices (10M+ users)

Enterprise Scale

API Gateway → Users, Orders, Payments, Inventory, Search, Analytics, Notifications, Auth

Key Changes:

• Independent services with own databases
• Teams can deploy independently
• Event-driven communication
• Requires strong DevOps culture
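Event-driven communication can be illustrated with a tiny in-process publish/subscribe sketch. The `EventBus` class and the `order.placed` event name are assumptions for illustration; between real services this role is played by a message broker, not a Python object.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for a message broker (illustrative only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher does not know which services are listening.
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()
log = []
bus.subscribe("order.placed", lambda e: log.append(("inventory", e["order_id"])))
bus.subscribe("order.placed", lambda e: log.append(("notifications", e["order_id"])))
bus.publish("order.placed", {"order_id": 42})
print(log)
```

The Orders side only publishes the event; Inventory and Notifications react on their own, which is what lets teams change and deploy them independently.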

3. Key Scaling Principles

Identify the Bottleneck

At any point, ONE thing is the limiting factor. Find it, fix it, find the next one.

Example: CPU maxed? Add servers. DB slow? Add cache or replicas.

Stateless Services

Any server can handle any request. Store state externally.

Example: Sessions in Redis, files in S3, not on local disk.
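A rough sketch of why external state matters, assuming a hypothetical `SessionStore` that stands in for Redis (here it is just an in-memory dict shared by both servers):

```python
import uuid


class SessionStore:
    """Stand-in for an external store such as Redis (in-memory here)."""

    def __init__(self):
        self._data = {}

    def create(self, user_id):
        session_id = uuid.uuid4().hex
        self._data[session_id] = {"user_id": user_id}
        return session_id

    def get(self, session_id):
        return self._data.get(session_id)


class AppServer:
    """Keeps no session data of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user_id):
        return self.store.create(user_id)

    def whoami(self, session_id):
        session = self.store.get(session_id)
        return session["user_id"] if session else None


store = SessionStore()
app1, app2 = AppServer("app-1", store), AppServer("app-2", store)
sid = app1.login("alice")
# The session created through app-1 is visible to app-2.
print(app2.whoami(sid))
```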

Cache Aggressively

The fastest query is the one you don't make.

Example: 90% of reads can often be served from cache.
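The usual way to apply this is the cache-aside pattern: check the cache, fall back to the source on a miss, then store the result. Below is a minimal sketch with a time-to-live; `TTLCache` and `load_user` are hypothetical names, and a plain dict stands in for Redis.

```python
import time


class TTLCache:
    """Cache-aside helper with per-entry expiry (illustrative)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no call to the source
        value = loader(key)  # cache miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value


db_calls = []


def load_user(user_id):
    # Pretend this is a slow database query.
    db_calls.append(user_id)
    return {"id": user_id}


cache = TTLCache(ttl_seconds=60)
cache.get_or_load(1, load_user)
cache.get_or_load(1, load_user)
print(len(db_calls))
```

The TTL bounds how stale a cached value can get; picking it is a trade-off between freshness and load on the database.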

Async Everything Possible

If it doesn't need to happen NOW, queue it.

Example: Emails, notifications, analytics: all can be async.
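A small sketch of the queue-it idea using only the standard library. The `send_email` function is a placeholder, and an in-process `queue.Queue` stands in for a real message queue; the request path only enqueues, and a background worker does the slow part.

```python
import queue
import threading

sent = []


def send_email(address):
    # Placeholder for slow work such as calling a mail provider.
    sent.append(address)


def worker(jobs):
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        send_email(job)
        jobs.task_done()


jobs = queue.Queue()
thread = threading.Thread(target=worker, args=(jobs,), daemon=True)
thread.start()

# The request handler only enqueues and returns immediately.
for address in ["a@example.com", "b@example.com"]:
    jobs.put(address)

jobs.put(None)
jobs.join()  # wait until every queued job has been processed
thread.join()
print(sent)
```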

Partition Data

When one DB isn't enough, split by user_id or region.

Example: Users A-M on shard 1, N-Z on shard 2.
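Both routing styles can be sketched in a few lines. `shard_by_name` mirrors the A-M / N-Z example above; `shard_by_hash` is an alternative not described in the text, shown because range splits can become uneven when names cluster. Function names are hypothetical.

```python
import hashlib


def shard_by_name(username):
    """Range sharding as in the example: A-M -> shard 1, everything else -> shard 2."""
    first = username[0].upper()
    return 1 if "A" <= first <= "M" else 2


def shard_by_hash(user_id, num_shards):
    """Hash sharding: a stable hash of the key picks a shard index (0-based)."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(shard_by_name("alice"), shard_by_name("zoe"))
print(shard_by_hash(12345, 4))
```

Note that with plain modulo hashing, changing `num_shards` remaps most keys, which is one reason resharding needs planning.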

Measure, Don't Guess

Instrument everything. Data-driven decisions only.

Example: Don't optimize code that only runs 0.1% of the time.
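As a starting point for instrumentation, a timing decorator can record how long each call takes. This is a sketch with hypothetical names (`timed`, `render_page`); real systems would usually ship these numbers to a metrics backend instead of a dict.

```python
import time
from collections import defaultdict
from functools import wraps

timings = defaultdict(list)


def timed(func):
    """Records the wall-clock duration of every call to func."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)

    return wrapper


@timed
def render_page():
    return sum(range(10_000))


for _ in range(3):
    render_page()

for name, samples in timings.items():
    print(name, len(samples), f"{sum(samples):.6f}s total")
```

Comparing call counts and total time per function shows where time actually goes before any optimization is attempted.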

4. Common Bottlenecks & Solutions

Single Server CPU Maxed

Symptoms: High CPU, slow response times

→ Vertical scale (bigger server)
→ Horizontal scale (more servers + LB)
→ Optimize hot code paths

Database Reads Too Slow

Symptoms: High DB CPU, slow queries, timeouts

→ Add read replicas
→ Implement caching (Redis)
→ Optimize queries & indexes

Database Writes Can't Keep Up

Symptoms: Write queue growing, replication lag

→ Shard the database
→ Use async writes/queues
→ Batch writes together
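Batching can be sketched as a small buffer that flushes once it reaches a size limit. `BatchWriter` and its `flush_fn` parameter are hypothetical; in practice `flush_fn` would issue one multi-row insert instead of appending to a list.

```python
class BatchWriter:
    """Buffers rows and hands them to flush_fn in groups (illustrative)."""

    def __init__(self, flush_fn, batch_size):
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        self._buffer = []

    def write(self, row):
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered, even a partial batch.
        if self._buffer:
            self._flush_fn(list(self._buffer))
            self._buffer.clear()


batches = []
writer = BatchWriter(flush_fn=batches.append, batch_size=3)
for i in range(7):
    writer.write(i)
writer.flush()  # flush the final partial batch
print(batches)
```

Seven writes become three round trips here. The cost is that buffered rows are lost if the process dies before a flush, so real implementations usually also flush on a timer and on shutdown.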

Network Bandwidth Limit

Symptoms: High latency, packet loss

→ Use CDN for static content
→ Compress responses (gzip)
→ Edge computing
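To get a feel for what response compression buys, the sketch below gzips a made-up JSON payload with the standard library. The payload is an assumption chosen to be repetitive, as API responses often are; actual savings depend on the data.

```python
import gzip
import json

# Hypothetical API response: a list of similar-looking records.
payload = json.dumps(
    [{"id": i, "status": "active"} for i in range(500)]
).encode("utf-8")

compressed = gzip.compress(payload)

print(len(payload), len(compressed))
# Decompressing returns the original bytes unchanged.
restored = gzip.decompress(compressed)
```

In a web stack this is normally handled by the server or reverse proxy based on the client's Accept-Encoding header, not by application code.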

External API Rate Limits

Symptoms: 429 errors, timeouts

→ Cache API responses
→ Circuit breaker pattern
→ Request batching
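The circuit breaker pattern can be sketched as a wrapper that stops calling a failing dependency for a cool-down period. This is a simplified version under stated assumptions: the class and parameter names are hypothetical, it counts consecutive failures only, and it is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures, reset_after, clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_after = reset_after  # cool-down in seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit open, call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self._max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result


def flaky():
    raise TimeoutError("upstream timed out")


breaker = CircuitBreaker(max_failures=2, reset_after=30)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("skipped")
    except TimeoutError:
        outcomes.append("failed")
print(outcomes)
```

Once the circuit is open, callers fail fast instead of waiting on timeouts, which also gives the external API room to recover.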

5. The 10x Rule

Design for 10x, Not 100x

Under-engineered: System crashes at 2x load. Technical debt, outages.

Just Right: Handles 10x, path to 100x clear. Balanced complexity.

Over-engineered: Built for Google scale, 100 users. Wasted time, complexity.

Interview Strategy

Start simple, then scale. Walk through: "At 1K users we'd have X. At 100K, we'd add Y. At 1M, we'd need Z." Show you understand the evolution, not just the final state.

6. Key Takeaways

1. Scale changes everything: what works at 100 users breaks at 1M.

2. Evolution: Single server → Separate DB → Load balancing → DB scaling → Microservices

3. Find the bottleneck, fix it, find the next one. Iterative approach.

4. Stateless + external state enables horizontal scaling.

5. Cache and async are your best friends at scale.

6. Design for 10x, not 100x. Avoid premature optimization.

7. Interview Follow-up Questions

Common follow-up questions interviewers ask

8. Test Your Understanding

5 questions

1. Which is typically the FIRST bottleneck as a simple web application grows?

2. What is the main prerequisite for horizontal scaling of application servers?

3. You add a caching layer and database load drops from 90% to 30%. What's the likely new bottleneck?

4. What's wrong with designing for 100x scale from day one?

5. Which statement about the scaling evolution is TRUE?

The Scaling Mindset
