Module 1 — Data Storage
Caching Strategies
The fastest database query is the one you don't make. Learn when and how to cache.
1. The Kitchen Analogy
Simple Analogy
Restaurant kitchen:
• 🍳 Counter (L1 Cache) — ingredients you're using right now. Fastest access.
• 🗄️ Fridge (L2 Cache) — common ingredients, quick to grab.
• 🏪 Grocery Store (Database) — everything is available, but it takes time to get.
2. Why Cache?
Without Cache
• Every request hits database
• Database CPU at 80%+
• Response time: 200-500ms
• Limited to ~1K requests/sec
With Cache
• 90% requests from cache
• Database CPU at 10%
• Response time: 1-5ms
• Handle ~50K requests/sec
Typical Latency Comparison

| Layer | Typical Latency |
|---|---|
| L1 Cache (CPU) | 1 ns |
| L2 Cache (CPU) | 10 ns |
| Redis (in-memory) | 0.5 ms |
| SSD Read | 1 ms |
| Database Query | 10-100 ms |
| Network API Call | 100-500 ms |
3. Caching Patterns
There are several ways to implement caching. Each pattern suits different use cases:
1. Cache-Aside (Lazy Loading)
Application checks cache first. On miss, load from DB and populate cache.
App → 1. Check Cache → (miss) → 2. Query DB → 3. Store in Cache
✓ Only caches data that's actually used
✓ Cache failure doesn't break app
✗ First request always slow (cache miss)
✗ Stale data if DB updated directly
def get_user(user_id):
    # 1. Try cache first
    user = cache.get(f"user:{user_id}")
    if user:
        return user
    # 2. Cache miss - load from DB
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)
    # 3. Store in cache for next time
    cache.set(f"user:{user_id}", user, ttl=300)  # 5 min TTL
    return user

2. Write-Through
Writes go through cache to database. Cache always has latest data.
App → Write to Cache → Sync Write to DB
✓ Cache always consistent with DB
✓ No stale data problem
✗ Writes slower (two writes)
✗ May cache data never read
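A minimal write-through sketch. `WriteThroughCache` and `FakeDB` are hypothetical stand-ins for a real cache client and database, used only to show the synchronous double write:

```python
class FakeDB:
    """Hypothetical in-memory 'database' for demonstration."""
    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value


class WriteThroughCache:
    """Every write updates the cache AND the database synchronously."""
    def __init__(self, db):
        self.db = db
        self.store = {}  # in-memory cache

    def set(self, key, value):
        self.store[key] = value    # 1. update cache
        self.db.write(key, value)  # 2. synchronous DB write (the "slower writes" cost)

    def get(self, key):
        return self.store.get(key)  # cache always holds the latest value


db = FakeDB()
cache = WriteThroughCache(db)
cache.set("user:1", {"name": "Ada"})
# Cache and DB now agree on user:1
```

In a real system the two writes would need to succeed or fail together (e.g. retry or roll back the cache on a DB error); this sketch omits that for brevity.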
3. Write-Behind (Write-Back)
Writes to cache immediately, DB updated asynchronously later.
App → Write to Cache → ⏱️ Async Write to DB
✓ Fast writes (returns immediately)
✓ Good for write-heavy workloads
✗ Risk of data loss if cache fails
✗ Complex to implement correctly
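A write-behind sketch using a background worker thread to drain a queue of pending DB writes. The names (`WriteBehindCache`, `FakeDB`) are hypothetical; real implementations also need batching, retries, and durability guarantees:

```python
import queue
import threading


class FakeDB:
    """Hypothetical in-memory 'database' for demonstration."""
    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value


class WriteBehindCache:
    """Writes land in the cache immediately; a worker flushes to the DB later."""
    def __init__(self, db):
        self.db = db
        self.store = {}
        self.pending = queue.Queue()
        threading.Thread(target=self._flush, daemon=True).start()

    def set(self, key, value):
        self.store[key] = value         # fast: caller returns right away
        self.pending.put((key, value))  # DB write deferred to the worker

    def _flush(self):
        while True:
            key, value = self.pending.get()
            self.db.write(key, value)   # if the process dies first, this write is lost
            self.pending.task_done()


db = FakeDB()
cache = WriteBehindCache(db)
cache.set("user:1", {"name": "Ada"})
cache.pending.join()  # wait for the flush in this demo; real apps wouldn't block
```

The `join()` at the end exists only so the demo is deterministic; the whole point of write-behind is that callers do not wait for the DB.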
4. Read-Through
Cache sits in front of DB. App only talks to cache. Cache loads from DB on miss.
App → Read from Cache → (auto-loads from) DB
✓ Simpler app code
✓ Cache handles loading logic
✗ Need cache library support
✗ First read still slow
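A read-through sketch. The app only calls `cache.get()`; the cache itself owns the loading logic. `ReadThroughCache` and `load_user` are hypothetical stand-ins for a cache library and a real DB query:

```python
class ReadThroughCache:
    """The app talks only to the cache; the cache loads from the DB on a miss."""
    def __init__(self, loader):
        self.loader = loader  # function the cache calls to fetch from the DB
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.loader(key)  # auto-load on miss (first read is slow)
        return self.store[key]


def load_user(key):
    # Hypothetical loader standing in for a real database query
    return {"id": key, "name": "Ada"}


cache = ReadThroughCache(load_user)
user = cache.get("user:1")  # first read invokes load_user; later reads hit the cache
```

Note how the application code never mentions the database — that is the "simpler app code" advantage, at the cost of needing a cache that supports pluggable loaders.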
4. Pattern Comparison
| Pattern | Read Perf | Write Perf | Consistency | Best For |
|---|---|---|---|---|
| Cache-Aside | Fast (after warm) | Normal | Eventual | Read-heavy, tolerate stale |
| Write-Through | Fast | Slower | Strong | Need consistency |
| Write-Behind | Fast | Very Fast | Eventual | Write-heavy, can lose data |
| Read-Through | Fast (after warm) | Normal | Eventual | Simpler code |
5. What to Cache
Good Candidates
Session Data
User info, authentication tokens
Computed Results
Expensive calculations, aggregations
Reference Data
Country codes, categories, config
API Responses
Third-party data, external services
Avoid Caching
Highly Dynamic Data
Stock prices, real-time analytics
Rarely Accessed Data
Historical records, archives
Large Blobs
Videos, large files (use CDN instead)
Sensitive Data
PII without proper security
6. Cache Invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
TTL (Time-To-Live)
Data expires after set time. Simplest approach.
cache.set('user:123', data, ttl=300)  # Expires in 5 mins
✓ Simple to implement
✓ Automatic cleanup
✗ Data can be stale until TTL
✗ Hard to pick right TTL
Event-Based Invalidation
Delete cache when data changes. Most accurate.
def on_user_update(user):
    cache.delete(f'user:{user.id}')
✓ Always accurate
✓ No stale data
✗ Need to track all writes
✗ Complex with multiple caches
Version/Tag-Based
Include version in key. Change version to invalidate all.
cache.get(f'user:{user_id}:v2')  # Bump v2 to v3 to invalidate all
✓ Easy bulk invalidation
✓ Works across instances
✗ Old versions linger
✗ Memory waste
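A version-based invalidation sketch. `VersionedCache` is a hypothetical wrapper; the idea is simply that bumping the version makes every old key unreachable at once:

```python
class VersionedCache:
    """Embed a version in each key; bumping the version invalidates everything."""
    def __init__(self):
        self.store = {}
        self.version = 1

    def _key(self, key):
        return f"{key}:v{self.version}"

    def set(self, key, value):
        self.store[self._key(key)] = value

    def get(self, key):
        return self.store.get(self._key(key))

    def invalidate_all(self):
        # Old entries linger in memory (the "memory waste" downside),
        # but no lookup will ever match them again.
        self.version += 1


cache = VersionedCache()
cache.set("user:123", {"name": "Ada"})
cache.invalidate_all()
# cache.get("user:123") now misses — it looks up user:123:v2
```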
7. Key Takeaways
1. Caching dramatically reduces latency and database load.
2. Cache-Aside is the most common pattern: check cache → miss → load from DB → store in cache.
3. Use Write-Through for consistency, Write-Behind for write performance.
4. Cache what's read often, changes infrequently, and is expensive to compute.
5. Invalidation is hard. Combine TTL with event-based invalidation.
6. Monitor hit rate: aim for 90%+ for effective caching.
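Monitoring hit rate only requires counting hits and misses. A minimal sketch, using a hypothetical `InstrumentedCache` wrapper (real cache servers such as Redis expose equivalent counters):

```python
class InstrumentedCache:
    """In-memory cache that tracks hits and misses for hit-rate monitoring."""
    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        return None

    def set(self, key, value):
        self.store[key] = value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = InstrumentedCache()
cache.set("a", 1)
cache.get("a")  # hit
cache.get("b")  # miss
print(cache.hit_rate())  # 0.5 — well below the 90%+ target, so investigate
```

A persistently low hit rate usually means the TTL is too short, the key space is too large, or the cached data isn't actually read-heavy.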