Module 4 - Scaling

Database Replication

Keeping copies of your database for availability and read scaling.

1. The Backup Band Analogy

Simple Analogy

A lead singer (primary) performs while backup singers (replicas) follow along. If the lead can't perform, a backup can step in. Multiple backups can also handle different venues (read requests) simultaneously.

Database replication copies data from one database (primary/leader) to others (replicas/followers) for redundancy and read scaling.

2. Replication Types

Synchronous

Primary waits for the replica to confirm each write before acknowledging it. Strong consistency but higher write latency.

+ No data loss
- Slower writes

Asynchronous

Primary acknowledges the write immediately; the replica catches up later. Writes not yet replicated are lost if the primary fails.

+ Fast writes
- Potential data loss

Semi-synchronous

Primary waits for at least one replica to confirm each write; the rest catch up asynchronously. Balance of safety and speed.

+ Balanced
- More complex
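The three modes differ mainly in how many replica acknowledgements the primary waits for before confirming a write. A minimal sketch of that idea, assuming hypothetical in-memory `Primary` and `Replica` classes and a made-up `required_acks` parameter (not the API of any real database):

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica; `reachable` simulates a network failure."""
    reachable: bool = True
    data: dict = field(default_factory=dict)

    def apply(self, key, value) -> bool:
        if not self.reachable:
            return False
        self.data[key] = value
        return True  # acknowledgement sent back to the primary


@dataclass
class Primary:
    replicas: list
    required_acks: int  # 0 = async, 1 = semi-sync, len(replicas) = fully sync
    data: dict = field(default_factory=dict)
    backlog: list = field(default_factory=list)  # writes not yet shipped

    def write(self, key, value) -> bool:
        """Return True if the write can be confirmed to the client."""
        self.data[key] = value
        if self.required_acks == 0:
            # Asynchronous: confirm now, ship to replicas later.
            self.backlog.append((key, value))
            return True
        # Synchronous or semi-synchronous: count acknowledgements first.
        acks = sum(replica.apply(key, value) for replica in self.replicas)
        return acks >= self.required_acks

    def flush(self) -> None:
        """Ship queued writes to replicas (the 'catch up later' step)."""
        for key, value in self.backlog:
            for replica in self.replicas:
                replica.apply(key, value)
        self.backlog.clear()
```

With one of two replicas unreachable, `required_acks=1` still confirms the write while `required_acks=2` does not, which is the safety versus availability trade-off in miniature. The sketch leaves out retries, rollback of unconfirmed writes, and catch-up for replicas that missed a write.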

3. Architectures

Single-Leader

One primary handles writes, replicas handle reads. Most common.
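In application code, single-leader replication usually shows up as read/write splitting. A rough sketch, assuming a hypothetical `SingleLeaderRouter` class and placeholder node names; it classifies statements only by their first keyword, which is a simplification:

```python
import itertools


class SingleLeaderRouter:
    """Sends writes to the primary and spreads reads across replicas round-robin."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Fall back to the primary for reads if no replicas are configured.
        self._read_targets = itertools.cycle(replicas or [primary])

    def route(self, sql: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._read_targets)
        # Anything that is not a plain SELECT is treated as a write.
        return self.primary


router = SingleLeaderRouter("primary", ["replica-1", "replica-2"])
print(router.route("INSERT INTO comments VALUES (1, 'hi')"))  # primary
print(router.route("SELECT * FROM comments"))                  # replica-1
```

Statements such as `SELECT ... FOR UPDATE` would need to go to the primary as well; the sketch does not handle them.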

Multi-Leader

Multiple primaries accept writes. Concurrent writes to the same data can conflict, so conflict resolution is required and complex.
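One simple conflict-resolution strategy is last-write-wins (LWW): every leader keeps the version with the latest timestamp. The sketch below is one possible illustration, not the only approach; the `Version` type and its fields are hypothetical, and it assumes the leaders' clocks are roughly synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # wall-clock time of the write (assumes roughly synced clocks)
    leader_id: str    # tie-breaker so every leader picks the same winner


def resolve_lww(a: Version, b: Version) -> Version:
    """Return the version with the later timestamp; break ties by leader ID."""
    return max(a, b, key=lambda v: (v.timestamp, v.leader_id))


eu = Version("alice@old.example", timestamp=100.0, leader_id="eu")
us = Version("alice@new.example", timestamp=105.0, leader_id="us")
print(resolve_lww(eu, us).value)  # alice@new.example
```

LWW is easy to implement but silently discards the losing write, so it only suits data where that is acceptable. Alternatives include merging values or handing both versions to the application to resolve.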

Leaderless

Any node accepts writes. Quorum-based reads/writes (Dynamo-style).
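Quorum reads and writes rely on the overlap condition W + R > N: if a write is stored on W of N nodes and a read queries R nodes, at least one queried node holds the latest write. A minimal sketch, assuming a hypothetical in-memory `QuorumStore`, caller-supplied version numbers, and a `reachable` argument that simulates which nodes respond:

```python
class QuorumStore:
    """Hypothetical Dynamo-style store with n in-memory nodes."""

    def __init__(self, n: int, w: int, r: int):
        if w + r <= n:
            # Design choice for this sketch: refuse non-overlapping quorums.
            raise ValueError("need w + r > n so read and write sets overlap")
        self.nodes = [{} for _ in range(n)]  # each node maps key -> (version, value)
        self.w, self.r = w, r

    def write(self, key, value, version: int, reachable: list[int]) -> bool:
        """Store on every reachable node; succeed only if at least w acknowledged."""
        for i in reachable:
            self.nodes[i][key] = (version, value)
        return len(reachable) >= self.w

    def read(self, key, reachable: list[int]):
        """Query r reachable nodes and return the value with the highest version."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough nodes for a read quorum")
        replies = [self.nodes[i].get(key, (0, None)) for i in reachable[: self.r]]
        return max(replies, key=lambda reply: reply[0])[1]


store = QuorumStore(n=3, w=2, r=2)
store.write("cart", "v1", version=1, reachable=[0, 1, 2])
store.write("cart", "v2", version=2, reachable=[0, 1])  # node 2 misses this write
print(store.read("cart", reachable=[1, 2]))  # v2, because node 1 overlaps both quorums
```

With N = 3, W = 2, R = 2, any two nodes chosen for the read include at least one of the two that took the write. Real systems add mechanisms such as read repair to bring stale nodes up to date; those are omitted here.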

Chain Replication

Writes enter at the head and pass through each node in the chain to the tail; reads are served by the tail. Strong consistency.
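In chain replication, a write is acknowledged only after it has reached the last node (the tail), and reads go to the tail, so a reader never sees a write that is not on every node. A small sketch with hypothetical `ChainNode` and `build_chain` names, ignoring node failures and chain reconfiguration:

```python
class ChainNode:
    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.next = None  # successor in the chain, or None for the tail

    def write(self, key, value) -> str:
        """Apply locally, then forward; returns the name of the acknowledging node."""
        self.data[key] = value
        if self.next is None:
            return self.name  # the tail acknowledges once every node has the write
        return self.next.write(key, value)


def build_chain(names):
    """Link nodes in order and return (head, tail)."""
    nodes = [ChainNode(name) for name in names]
    for node, successor in zip(nodes, nodes[1:]):
        node.next = successor
    return nodes[0], nodes[-1]


head, tail = build_chain(["a", "b", "c"])
print(head.write("k", "v"))  # c
print(tail.data["k"])        # v
```

The cost is that write latency grows with the length of the chain, since each write passes through every node in turn.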

4. Replication Lag

Replication lag is the delay between a write on the primary and its appearance on replicas. Reads served by a lagging replica can return stale data.
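Lag can be pictured as the gap between the primary's write log and the position a replica has applied. The sketch below assumes hypothetical `LogPrimary` and `LogReplica` classes and measures lag in log entries; real systems typically report it as a log-position difference or in seconds.

```python
class LogPrimary:
    def __init__(self):
        self.log = []  # ordered list of (key, value) writes

    def write(self, key, value) -> None:
        self.log.append((key, value))


class LogReplica:
    def __init__(self):
        self.applied = 0  # number of log entries applied so far
        self.data = {}

    def catch_up(self, primary: LogPrimary) -> None:
        """Apply every log entry this replica has not seen yet."""
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)

    def lag(self, primary: LogPrimary) -> int:
        """Lag measured as the number of unapplied log entries."""
        return len(primary.log) - self.applied


primary, replica = LogPrimary(), LogReplica()
primary.write("name", "Ann")
replica.catch_up(primary)
primary.write("name", "Anna")   # not yet replicated
print(replica.lag(primary))     # 1
print(replica.data["name"])     # Ann (a stale read)
```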

Read-Your-Writes

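User writes, then reads from a replica that hasn't caught up, so their own change appears to be missing. Solution: route the user's reads of their own data to the primary, at least for a short time after a write.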
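One way to get read-your-writes behavior is to remember when each user last wrote and send their reads to the primary for a short window afterwards. This is a sketch under assumptions: the `ReadYourWritesRouter` class is hypothetical, and the 5-second default window stands in for whatever upper bound on replication lag applies to a given system.

```python
import time


class ReadYourWritesRouter:
    """Routes a user's reads to the primary for a short window after they write."""

    def __init__(self, window_seconds: float = 5.0, clock=time.monotonic):
        self.window_seconds = window_seconds  # assumed upper bound on replication lag
        self.clock = clock                    # injectable so tests can fake time
        self._last_write = {}                 # user_id -> time of most recent write

    def record_write(self, user_id: str) -> None:
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id: str) -> str:
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.window_seconds:
            return "primary"
        return "replica"


router = ReadYourWritesRouter()
router.record_write("ann")
print(router.target_for_read("ann"))  # primary
print(router.target_for_read("bob"))  # replica
```

Other users may still see the old data for a moment, which is usually acceptable; the guarantee only covers the user who made the change.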

Monotonic Reads

User sees new data, then older data when a later read hits a replica that is further behind. Solution: sticky sessions, so each user always reads from the same replica.
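Sticky routing can be as simple as hashing the user ID to pick a replica. A minimal sketch, assuming a hypothetical `replica_for_user` helper and placeholder replica names:

```python
import hashlib


def replica_for_user(user_id: str, replicas: list[str]) -> str:
    """Map a user to one replica deterministically so repeat reads hit the same node."""
    # hashlib is used instead of the built-in hash(), which is randomized per
    # process for strings and would break stickiness across app servers.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replicas = ["replica-1", "replica-2", "replica-3"]
first = replica_for_user("ann", replicas)
second = replica_for_user("ann", replicas)
print(first == second)  # True
```

If the chosen replica fails or the replica list changes, the user is remapped and may briefly see older data, so this approach reduces the problem rather than eliminating it.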

5. Key Takeaways

1. Replication copies data to multiple nodes for HA and read scaling

2. Synchronous = safe but slow; Async = fast but risky

3. Single-leader is most common: one primary, many replicas

4. Replication lag causes stale reads; handle with read-your-writes

5. Use replicas to scale reads, not writes

Quiz

1. Synchronous replication trades what for data safety?

2. User writes a comment, refreshes, doesn't see it. Cause?