Module 1 — Data Storage

Choosing the Right Database

Decision framework for selecting databases based on your requirements.

1The Tool Analogy

💡 Simple Analogy
You wouldn't use a hammer to cut wood or a saw to drive nails. Different tools for different jobs.

Databases are the same: PostgreSQL for complex queries, Redis for caching,MongoDB for flexible documents, Cassandra for massive writes.

There's no "best" database—only the best for your specific use case.

2Decision Framework

Do you need ACID transactions?
Yes →SQL databases (PostgreSQL, MySQL)
No →Consider NoSQL for flexibility/scale
Is your schema well-defined and stable?
Yes →SQL databases
No →Document stores (MongoDB) for flexibility
Do you need complex queries with JOINs?
Yes →SQL databases
No →NoSQL can work with simple queries
Is write throughput critical (millions/sec)?
Yes →Wide-column (Cassandra) or Key-value (Redis)
No →Most databases will work
Do you need real-time, sub-millisecond reads?
Yes →In-memory (Redis, Memcached)
No →Disk-based databases are fine
Are you modeling relationships/graphs?
Yes →Graph database (Neo4j)
No →Relational or document stores

3Database by Use Case

E-commerce (orders, inventory)
PostgreSQL / MySQL
ACID for transactions, complex queries for reporting
Social media (posts, feeds)
PostgreSQL + Redis
SQL for relations, Redis for caching feeds
Real-time chat
Redis + PostgreSQL
Redis for presence/messages, SQL for history
IoT sensor data
TimescaleDB / InfluxDB
Time-series optimized, high write throughput
Content management
MongoDB
Flexible schema for varied content types
Session storage
Redis
Fast reads, built-in expiration
Search
Elasticsearch
Full-text search, relevance scoring
Recommendations
Neo4j + Redis
Graph for relationships, Redis for caching

4Quick Reference Table

DatabaseTypeBest ForAvoid When
PostgreSQLSQLComplex queries, JSON, GISNeed massive horizontal scale
MySQLSQLRead-heavy, web appsComplex JSON operations
MongoDBDocumentFlexible schema, rapid devComplex transactions
RedisKey-ValueCaching, sessions, queuesComplex queries, large data
CassandraWide-ColumnMassive writes, time-seriesAd-hoc queries, JOINs
DynamoDBKey-ValueServerless, auto-scaleComplex queries, cost-sensitive
Neo4jGraphRelationships, recommendationsSimple CRUD operations
ElasticsearchSearchFull-text search, logsPrimary data store

5Polyglot Persistence

Polyglot Persistence means using multiple databases in one system, each optimized for specific data types and access patterns.

Example: E-commerce Platform

PostgreSQL
Orders, users, products (ACID)
Redis
Sessions, cart, cache
Elasticsearch
Product search
S3
Product images
Complexity Trade-off

More databases = more operational complexity. Start simple (one SQL database), then add specialized databases as specific needs arise.

6Key Takeaways

1No "best" database—choose based on access patterns and requirements.
2Start with PostgreSQL for most applications—it handles 90% of cases.
3Add Redis when you need caching or fast ephemeral storage.
4Use specialized databases (Elasticsearch, Neo4j) for specific needs.
5Polyglot persistence: multiple databases for different data types.
6In interviews: explain WHY you chose a database, not just which one.