Module 1: Data Storage

Read vs Write Patterns

Understanding your read/write ratio is key to choosing the right architecture.

1. The Library vs Newspaper Analogy

💡 Simple Analogy
Read-Heavy (Library): Books are written once and read thousands of times. Optimize for finding and reading: organize by category, add indexes, make copies.

Write-Heavy (Newspaper): New content arrives constantly and must be recorded fast. Readers can wait a little for today's news to be organized.

Your database design needs to match your use case.

2. Read-Heavy Workloads

Read-heavy systems serve many more reads than writes (read:write ratio above 10:1). Examples: social media feeds, product catalogs, content sites.

Optimization Strategies

Caching

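Store frequently read data in memory (Redis, Memcached) so that most reads never reach the database.

Impact: Cache hits are served from memory, so a cache can often absorb on the order of 100x more reads than the database alone.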
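The caching strategy above can be sketched as a cache-aside read path. This is a minimal illustration, not a production client: a plain dict stands in for Redis or Memcached, and `CacheAsideReader`, `load_from_db`, and the `products` table are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Optional


class CacheAsideReader:
    """Cache-aside reads: check the in-memory cache first, fall back to the DB."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._load_from_db = load_from_db   # hypothetical (slow) database lookup
        self._cache: Dict[str, str] = {}    # stand-in for Redis/Memcached
        self.db_reads = 0                   # counts how often the DB is hit

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:              # cache hit: no DB round trip
            return self._cache[key]
        self.db_reads += 1                  # cache miss: read the DB once
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value        # populate the cache for later reads
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry after a write so the next read is fresh."""
        self._cache.pop(key, None)


products = {"sku-1": "Blue mug"}            # hypothetical backing table
reader = CacheAsideReader(products.get)
for _ in range(100):
    reader.get("sku-1")
print(reader.db_reads)  # 1
```

One hundred reads cost a single database lookup here. The `invalidate` method hints at the price: every write path has to remember to call it, or readers keep seeing stale data.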

Read Replicas

Copy data to multiple read-only databases and distribute the read load across them. Writes still go to the primary.

Impact: Reads scale horizontally as replicas are added.
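As a rough sketch of read-replica routing, the toy store below sends writes to a primary and rotates reads across replicas. Dicts stand in for database servers, and `ReplicatedStore` and its `replicate` method are hypothetical; real systems stream changes to replicas asynchronously rather than copying everything on demand.

```python
import itertools
from typing import Dict, List, Optional


class ReplicatedStore:
    """Writes go to the primary; reads rotate across read-only replicas."""

    def __init__(self, replica_count: int) -> None:
        self.primary: Dict[str, str] = {}
        self.replicas: List[Dict[str, str]] = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))
        self.reads_per_replica = [0] * replica_count

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value           # replicas do not see this yet

    def replicate(self) -> None:
        """Simulate replication catching up with the primary."""
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key: str) -> Optional[str]:
        index = next(self._next_replica)    # round-robin load distribution
        self.reads_per_replica[index] += 1
        return self.replicas[index].get(key)


store = ReplicatedStore(replica_count=2)
store.write("user:1", "ada")
before = store.read("user:1")   # None: the replica has not caught up yet
store.replicate()
after = store.read("user:1")    # "ada"
print(before, after, store.reads_per_replica)
```

The `before` read returning nothing is replication lag in miniature: a replica can briefly serve data older than the primary's.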

Denormalization

Duplicate data so that reads avoid JOINs, trading storage (and extra work on writes) for read speed.

Impact: One single-table query instead of a multi-table JOIN.
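To make the trade concrete, here is a small sketch using Python's built-in `sqlite3` module. The schema (`users`, `posts`, and a denormalized `feed_items` table) is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    -- Denormalized copy: the author's name is duplicated into every row.
    CREATE TABLE feed_items (post_id INTEGER PRIMARY KEY, author_name TEXT, body TEXT);
    INSERT INTO users VALUES (1, 'ada');
    INSERT INTO posts VALUES (10, 1, 'hello');
    INSERT INTO feed_items VALUES (10, 'ada', 'hello');
""")

# Normalized read: needs a JOIN across two tables.
normalized = conn.execute(
    "SELECT u.name, p.body FROM posts p JOIN users u ON u.id = p.user_id "
    "WHERE p.id = ?",
    (10,),
).fetchone()

# Denormalized read: one table, no JOIN.
denormalized = conn.execute(
    "SELECT author_name, body FROM feed_items WHERE post_id = ?", (10,)
).fetchone()

# The cost: renaming the user does not touch the duplicated copy.
conn.execute("UPDATE users SET name = 'grace' WHERE id = 1")
stale_name = conn.execute(
    "SELECT author_name FROM feed_items WHERE post_id = 10"
).fetchone()[0]

print(normalized, denormalized, stale_name)
```

Both reads return the same row, but after the rename the denormalized table still says `ada` until some extra write brings it back in sync.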

CDN

Cache static content at edge locations worldwide, close to users.

Impact: Responses for cached content can stay under 100 ms for users around the world.

Read-Optimized Architecture
Client → Cache (90% hit) → Replica, Replica (read replicas)
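A quick back-of-the-envelope calculation shows why this layout works. The 90% hit rate and the two replicas come from the diagram; the traffic figure of 10,000 reads per second and the function name `db_reads_per_replica` are assumptions made for this example.

```python
def db_reads_per_replica(
    reads_per_sec: int, cache_hit_percent: int, replica_count: int
) -> float:
    """Reads per second each replica must serve after the cache absorbs its hits."""
    cache_misses = reads_per_sec * (100 - cache_hit_percent) / 100
    return cache_misses / replica_count


# 10,000 reads/s (assumed), 90% cache hit rate, 2 replicas
print(db_reads_per_replica(10_000, 90, 2))  # 500.0
```

Under these assumptions each replica sees 500 reads per second instead of 10,000, a 20x reduction before any database tuning.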

3. Write-Heavy Workloads

Write-heavy systems write frequently (read:write ratio below 3:1, or more writes than reads). Examples: logging, IoT sensors, analytics events, messaging.

Optimization Strategies

Write-Behind Cache

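Buffer writes in memory and flush them to the database in batches. Writes still in the buffer are lost if the process crashes before a flush.

Impact: Can cut database write operations by 90% or more.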
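A minimal write-behind buffer might look like the sketch below. `WriteBehindBuffer`, `flush_batch`, and the batch size of 100 are hypothetical choices for illustration; a real implementation would also flush on a timer and handle failed flushes.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, float]


class WriteBehindBuffer:
    """Collect writes in memory and send them to the database in batches."""

    def __init__(self, flush_batch: Callable[[List[Row]], None], max_batch: int = 100) -> None:
        self._flush_batch = flush_batch     # hypothetical bulk-insert function
        self._max_batch = max_batch
        self._pending: List[Row] = []

    def write(self, row: Row) -> None:
        self._pending.append(row)           # fast: memory only
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush_batch(batch)        # one DB operation for the whole batch


db_calls: List[int] = []                    # size of each batch sent to the "DB"
buffer = WriteBehindBuffer(lambda batch: db_calls.append(len(batch)), max_batch=100)
for i in range(1000):
    buffer.write((f"sensor-{i % 10}", float(i)))
buffer.flush()                              # flush any remainder on shutdown
print(len(db_calls))  # 10
```

One thousand writes become ten database operations, a 99% reduction in this toy setup.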

Append-Only Logs

Sequential writes are faster than random writes, so log-structured storage appends new records instead of updating data in place.

Impact: Writes can be roughly 10x faster than random in-place updates.
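The idea can be sketched as a tiny append-only key-value log with an in-memory index pointing at the latest record for each key. The class `AppendOnlyLog` and its JSON-lines record format are assumptions for this example; an `io.BytesIO` stands in for a file on disk.

```python
import io
import json
from typing import BinaryIO, Dict, Optional


class AppendOnlyLog:
    """Updates are appended; an index maps each key to its latest record offset."""

    def __init__(self, storage: BinaryIO) -> None:
        self._storage = storage
        self._index: Dict[str, int] = {}

    def put(self, key: str, value: str) -> None:
        offset = self._storage.seek(0, io.SEEK_END)   # always write at the end
        record = json.dumps({"key": key, "value": value}).encode("utf-8") + b"\n"
        self._storage.write(record)
        self._index[key] = offset                     # newest record wins

    def get(self, key: str) -> Optional[str]:
        offset = self._index.get(key)
        if offset is None:
            return None
        self._storage.seek(offset)
        return json.loads(self._storage.readline())["value"]


storage = io.BytesIO()                      # stand-in for a log file on disk
log = AppendOnlyLog(storage)
log.put("temp", "20")
log.put("temp", "21")                       # appended, not overwritten
print(log.get("temp"))  # 21
```

Old records are never modified, so the log keeps growing. Real log-structured stores periodically compact it to reclaim space.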
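
Sharding

Distribute writes across multiple database servers, using a shard key to decide where each record goes.

Impact: Write capacity grows close to linearly with the number of shards when the shard key spreads load evenly.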
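One common way to pick a shard is to hash the shard key and take the result modulo the shard count. The sketch below assumes four shards and invented `device-N` keys. It uses `hashlib` rather than Python's built-in `hash()`, because string hashes from `hash()` are randomized per process and would route the same key differently after a restart.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Route 10,000 hypothetical device IDs across 4 shards and count the spread.
counts = Counter(shard_for(f"device-{i}", 4) for i in range(10_000))
print(sorted(counts.items()))
```

Each shard should receive roughly a quarter of the keys. Note that simple modulo hashing moves most keys when the shard count changes, which is one reason resharding is expensive.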

Async Processing

Accept the write, queue it for processing, and return immediately.

Impact: Response time stays consistent under load, as long as the queue can absorb bursts.
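The accept-queue-return pattern can be sketched with the standard library's `queue` and `threading` modules. The in-process queue is a stand-in for a message broker, and `accept_write`, `worker`, and the 1 ms simulated processing delay are assumptions for illustration.

```python
import queue
import threading
import time

events: "queue.Queue" = queue.Queue()
processed = []


def worker() -> None:
    """Drain the queue in the background; None is the shutdown signal."""
    while True:
        event = events.get()
        if event is None:
            events.task_done()
            break
        time.sleep(0.001)                   # simulate slow processing
        processed.append(event)
        events.task_done()


def accept_write(event: dict) -> str:
    """Enqueue the write and return immediately, before it is processed."""
    events.put(event)
    return "accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
responses = [accept_write({"id": i}) for i in range(50)]
events.join()                               # wait until everything is processed
events.put(None)
thread.join()
print(len(responses), len(processed))  # 50 50
```

The caller gets `"accepted"` without waiting for the slow step, which is also why the result is only eventually visible to readers.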

Write-Optimized Architecture
Client → Queue (buffer) → Shard 1, Shard 2 (distributed writes)

4. Read/Write Ratio Examples

Twitter Timeline (R 1000 : W 1): Tweets read millions of times
E-commerce Catalog (R 100 : W 1): Products viewed, rarely updated
Banking Transactions (R 10 : W 1): Check balance, make payments
Chat Messaging (R 3 : W 1): Send and read messages
IoT Sensor Data (R 1 : W 100): Constant writes, occasional analysis
Logging System (R 1 : W 1000): Write everything, query rarely
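The thresholds given earlier (above 10:1 is read-heavy, below 3:1 is write-heavy) can be turned into a small classifier. The function name `classify_workload` and the "balanced" label for the range in between are assumptions added for this sketch.

```python
def classify_workload(reads: int, writes: int) -> str:
    """Classify a workload by its read:write ratio using the 10:1 and 3:1 cutoffs."""
    if writes == 0:
        return "read-heavy"                 # nothing is written at all
    ratio = reads / writes
    if ratio > 10:
        return "read-heavy"
    if ratio < 3:
        return "write-heavy"
    return "balanced"                       # assumed label for 3:1 up to 10:1


examples = {
    "Twitter Timeline": (1000, 1),
    "Banking Transactions": (10, 1),
    "Chat Messaging": (3, 1),
    "IoT Sensor Data": (1, 100),
}
for name, (reads, writes) in examples.items():
    print(name, "->", classify_workload(reads, writes))
```

Banking and chat sit exactly on the cutoffs, so they land in the middle band. Treat the boundaries as guidance for discussion, not hard rules.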

5. Balancing Reads and Writes

Technique       | Helps Reads | Helps Writes | Trade-off
Caching         | ✓✓✓         | -            | Cache invalidation complexity
Read Replicas   | ✓✓          | -            | Replication lag
Sharding        | ✓✓          | ✓✓           | Cross-shard queries complex
Denormalization | ✓✓          | ✗            | Data duplication, sync issues
Async Writes    | -           | ✓✓           | Eventual consistency

6. Interview Questions

Q: "Design a URL shortener"
Analysis: Read-heavy: URLs created once, clicked millions of times. Use caching + read replicas.
Q: "Design a logging system"
Analysis: Write-heavy: Millions of log entries/sec, queried occasionally. Use append-only storage, time-based sharding.
Q: "Design a messaging app"
Analysis: Balanced, leaning write-heavy: users both send and read messages, so writes matter far more than in a typical web app. Use write-optimized storage with read caching.
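For the logging question, time-based sharding can be as simple as deriving a partition name from each entry's timestamp. The sketch below assumes one partition per UTC day; the `logs_YYYY_MM_DD` naming scheme and the function `partition_for` are hypothetical.

```python
from datetime import datetime, timezone


def partition_for(epoch_seconds: float) -> str:
    """Return the daily partition a log entry belongs to, based on its UTC timestamp."""
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return day.strftime("logs_%Y_%m_%d")


print(partition_for(0))        # logs_1970_01_01
print(partition_for(86_400))   # logs_1970_01_02
```

New entries always land in the newest partition, which keeps writes sequential, and old partitions can be archived or dropped whole.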

7. Key Takeaways

1. Know your read/write ratio: it drives architecture decisions.
2. Read-heavy: Cache aggressively, use read replicas, denormalize.
3. Write-heavy: Batch writes, use append-only logs, shard data.
4. Most web apps are read-heavy (10:1 to 1000:1).
5. In interviews, ask: "What's the expected read/write ratio?"
6. Optimize for the common case, but handle the uncommon gracefully.