Module 1 - Data Storage

OLTP vs OLAP

Two fundamentally different database workload patterns. Transactions vs Analytics.

1The Store Analogy

Simple Analogy
OLTP (Cash Register): Fast, small transactions. Customer buys item, update inventory, record sale. Happens thousands of times per day, each operation is quick.

OLAP (Accountant's Desk): End of month, analyze all sales. Which products sold best? What's the trend? Complex queries across millions of records.

OLTP (Online Transaction Processing): Optimized for many small read/write transactions. Think: placing orders, updating profiles.

OLAP (Online Analytical Processing): Optimized for complex analytical queries on large datasets. Think: business intelligence, reporting.

2Key Differences

AspectOLTPOLAP
PurposeDay-to-day operationsAnalysis & reporting
QueriesSimple, by primary keyComplex aggregations
DataCurrent stateHistorical data
OperationsINSERT, UPDATE, DELETESELECT (mostly)
UsersMany concurrent usersFew analysts
Response TimeMillisecondsSeconds to minutes
SchemaNormalized (3NF)Denormalized (star/snowflake)

3OLTP Characteristics

High Concurrency

Thousands of users making small transactions simultaneously

ACID Compliance

Transactions must be atomic, consistent, isolated, durable

Normalized Schema

Minimize redundancy, optimize for writes

Row-Oriented Storage

Optimized for accessing entire rows

Examples: PostgreSQL, MySQL, Oracle, SQL Server for transactional workloads.

4OLAP Characteristics

Complex Queries

Aggregations, GROUP BY, window functions across billions of rows

Columnar Storage

Store data by column for efficient analytics scans

Denormalized Schema

Star/snowflake schema optimized for reads

Batch Updates

Data loaded periodically via ETL, not real-time

Examples: Snowflake, BigQuery, Redshift, ClickHouse, Apache Druid.

5Data Warehouse Architecture

Typical Flow

1OLTP databases (source systems)
2ETL (Extract, Transform, Load)
3Data Warehouse (OLAP)
4BI Tools (Tableau, Looker, etc.)

6Key Takeaways

1OLTP for transactions: fast writes, ACID, normalized schema
2OLAP for analytics: complex queries, columnar storage, denormalized
3Most systems use both: OLTP for operations, replicate to OLAP for analysis
4ETL pipelines move data from OLTP to data warehouse
5Don't run analytics on production OLTP-it will kill performance

?Quiz

1. A query needs to find total sales by region for the past year. Which system?

2. Which storage format is better for OLAP?