Consensus Algorithms
How distributed nodes agree on a single value despite failures.
1The Jury Deliberation Analogy
Consensus is the process by which distributed nodes agree on a single value. Essential for replication, leader election, and distributed transactions.
2The Problem
FLP Impossibility
In an asynchronous system where even one process can crash, no deterministic consensus algorithm can guarantee termination.
Translation: Perfect consensus is theoretically impossible. Practical algorithms use timeouts and probability.
Safety
All nodes agree on the same value
Liveness
Eventually a decision is made
Fault Tolerance
Works despite some failures
3Paxos
Paxos (Leslie Lamport, 1989): The foundational consensus algorithm. Notoriously difficult to understand and implement correctly.
Basic Paxos Flow
4Raft
Raft (2014): Designed to be understandable. Used in etcd, Consul, CockroachDB. Easier to implement correctly than Paxos.
Leader
Handles all client requests, replicates to followers
Follower
Passive, responds to leader/candidate requests
Candidate
Tries to become leader during election
Raft Key Concepts
5Comparison
| Aspect | Paxos | Raft |
|---|---|---|
| Understandability | Difficult | Designed for clarity |
| Leader | Optional (Multi-Paxos) | Required (strong leader) |
| Log Ordering | Gaps possible | Contiguous |
| Used By | Google Chubby, Spanner | etcd, Consul, CockroachDB |
6Key Takeaways
?Quiz
1. 5-node Raft cluster. How many can fail and still make progress?
2. Why was Raft created?