Leader Election
Choosing one node to coordinate when multiple can do the job.
1. The Team Captain Analogy
Leader election is a process where distributed nodes agree on exactly one node to act as leader. The leader coordinates actions that require single authority.
2. Why Elect a Leader?
Single Writer
Only the leader writes to the DB, so concurrent writers never conflict.
Task Coordination
Leader assigns work to followers.
Cron Jobs
Only leader runs scheduled tasks.
Consensus
Leader proposes values in Raft/Paxos.
3. Election Approaches
Bully Algorithm
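Highest ID wins. A node that detects leader failure starts an election by messaging every node with a higher ID. Any higher node that responds takes over the election, and the highest live ID becomes leader.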
Ring Algorithm
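Nodes are arranged in a logical ring. An election message travels around the ring collecting node IDs; once it returns to the initiator, the highest ID is announced as leader.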
Raft/Paxos
Consensus-based. Nodes vote, and a candidate becomes leader only with votes from a majority. Production-grade.
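ZooKeeper/etcd
Use a coordination service. Nodes race to create the same ephemeral node; the one that succeeds is leader, and the node disappears when that leader's session ends.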
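
As a minimal sketch of the Bully Algorithm described above, the following single-process simulation assumes every node knows all node IDs. The function name `bully_elect` and the `alive` set are hypothetical: they stand in for the real network messages and timeouts a node would use to find out which peers respond.

```python
from __future__ import annotations


def bully_elect(initiator: int, alive: set[int]) -> int:
    """Simulate a bully election started by `initiator`.

    `alive` is an assumed stand-in for "nodes that answer within a timeout".
    Returns the ID of the node that ends up as leader.
    """
    if initiator not in alive:
        raise ValueError("a crashed node cannot start an election")
    current = initiator
    while True:
        # The current candidate challenges every node with a higher ID.
        higher_alive = [n for n in alive if n > current]
        if not higher_alive:
            # Nobody higher answered, so the candidate declares itself leader.
            return current
        # A higher node answered and takes over the election.
        current = min(higher_alive)


# Node 5 (the old leader) has crashed; node 2 notices and starts an election.
leader = bully_elect(initiator=2, alive={1, 2, 3, 4})
print(leader)  # 4
```

Whichever node starts the election, the result is the highest live ID, which is why the algorithm is described as "highest ID wins".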
4. Real-World Dry Run: ZooKeeper Election
Scenario: 3 Kafka brokers electing a controller
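
One way to picture this scenario is the ephemeral-node race from the previous section. The sketch below is an in-memory stand-in, not the ZooKeeper client API: `FakeCoordinator`, `create_ephemeral`, `expire_session`, and `run_election` are all hypothetical names, and the brokers try in a fixed order here, whereas in a real cluster whichever request arrives first wins.

```python
from __future__ import annotations


class FakeCoordinator:
    """In-memory stand-in for a coordination service (hypothetical API)."""

    def __init__(self) -> None:
        self._nodes: dict[str, str] = {}  # path -> owning session

    def create_ephemeral(self, path: str, owner: str) -> bool:
        # Creation succeeds only if the path does not exist yet.
        if path in self._nodes:
            return False
        self._nodes[path] = owner
        return True

    def expire_session(self, owner: str) -> None:
        # Ephemeral nodes disappear when their owner's session ends.
        self._nodes = {p: o for p, o in self._nodes.items() if o != owner}

    def get(self, path: str) -> str | None:
        return self._nodes.get(path)


def run_election(coord: FakeCoordinator, brokers: list[str],
                 path: str = "/controller") -> str | None:
    # Every broker tries to create the same node; only the first create succeeds.
    for broker in brokers:
        coord.create_ephemeral(path, broker)
    return coord.get(path)


coord = FakeCoordinator()
print(run_election(coord, ["broker-1", "broker-2", "broker-3"]))  # broker-1

# The controller crashes: its session expires and the node is removed,
# so the surviving brokers race again.
coord.expire_session("broker-1")
print(run_election(coord, ["broker-2", "broker-3"]))  # broker-2
```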
5. Split-Brain Problem
Split-brain: a network partition leaves two nodes each believing it is the leader. Dangerous! If both accept writes, data can diverge or be corrupted.
Fencing Tokens
Each leader gets a monotonically increasing token. Storage rejects writes with older tokens.
Quorum
Require a majority vote to become leader. At most one side of a partition can hold a majority, so at most one leader can be elected.
Lease Timeout
Leader must renew its lease periodically. During a partition, the old leader cannot renew, so its lease expires and it must stop acting as leader.
STONITH
Shoot The Other Node In The Head. Forcibly power off or isolate a suspected-failed leader so it cannot keep acting as leader.
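
The fencing-token idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `TokenIssuer` and `FencedStorage` are hypothetical classes, the token counter lives in one process (in practice it would come from the election mechanism), and the storage remembers only the highest token it has seen.

```python
class TokenIssuer:
    """Hands out monotonically increasing fencing tokens (hypothetical helper)."""

    def __init__(self) -> None:
        self._last = 0

    def next_token(self) -> int:
        self._last += 1
        return self._last


class FencedStorage:
    """Storage that rejects writes carrying a token older than one already seen."""

    def __init__(self) -> None:
        self.highest_seen = 0
        self.data = {}

    def write(self, token: int, key: str, value: str) -> bool:
        if token < self.highest_seen:
            # A newer leader has already written; this writer is stale.
            return False
        self.highest_seen = token
        self.data[key] = value
        return True


issuer = TokenIssuer()
storage = FencedStorage()

old_leader_token = issuer.next_token()  # 1, held by the partitioned old leader
new_leader_token = issuer.next_token()  # 2, issued to the newly elected leader

print(storage.write(new_leader_token, "config", "from-new-leader"))  # True
print(storage.write(old_leader_token, "config", "from-old-leader"))  # False
print(storage.data["config"])  # from-new-leader
```

The old leader may still believe it is in charge, but its write is refused because its token is lower than one the storage has already accepted.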
6. Key Takeaways
Quiz
1. A network partition leaves the old leader on one side while a new leader is elected on the other. What is this called?
2. How do fencing tokens prevent split-brain damage?