LSM Trees vs B-Trees
Two fundamental data structures that power modern databases.
1The Notebook Analogy
LSM Tree (Append-Only Notebook): Just write notes at the end. Super fast to add new entries. But finding old notes requires checking multiple notebooks and merging them periodically.
2B-Trees: Read-Optimized
B-Tree: A self-balancing tree where each node can have multiple children. Used by most traditional databases (PostgreSQL, MySQL InnoDB, Oracle).
Fast Reads
O(log n) lookups. Data is always organized for quick access.
In-Place Updates
Updates modify data directly on disk.
Predictable Performance
Consistent latency for both reads and writes.
Space Efficient
No duplicate data, minimal storage overhead.
Downside: Random I/O for writes. Each write may touch multiple pages across disk.
3LSM Trees: Write-Optimized
LSM Tree (Log-Structured Merge Tree): Writes go to an in-memory buffer (memtable), then flush to sorted files on disk (SSTables). Used by Cassandra, RocksDB, LevelDB, InfluxDB.
Fast Writes
Sequential I/O-just append to memory and flush to disk.
High Throughput
Write-heavy workloads are significantly faster.
Compression
SSTables compress well, saving disk space.
Time-Series Friendly
Perfect for append-heavy workloads like logs and metrics.
Downside: Reads may check multiple levels. Write amplification from compaction.
4How LSM Works
5Comparison
| Aspect | B-Tree | LSM Tree |
|---|---|---|
| Writes | Random I/O | Sequential I/O (faster) |
| Reads | Single lookup | May check multiple levels |
| Space | Fixed overhead | Temp duplicates, better compression |
| Best For | Read-heavy OLTP | Write-heavy, time-series |
| Examples | PostgreSQL, MySQL | Cassandra, RocksDB |
6Key Takeaways
?Quiz
1. A logging system writes millions of events per second with rare reads. Best choice?
2. What structure does LSM use to avoid checking all SSTables on reads?