Module 10 - Specialized Databases

Time-Series Databases

Optimized for timestamped data: metrics, IoT readings, financial data, and logs.

1. The Weather Station Analogy

Simple Analogy
A weather station records temperature every minute. You rarely update old readings; you just keep adding new ones. And you usually query "last 24 hours" or "average this week." Time-series databases are optimized exactly for this pattern: append-heavy writes and time-range queries.

A time-series database (TSDB) is optimized for data indexed by time. It offers high write throughput, efficient time-range queries, and automatic data retention/downsampling.

2. Characteristics

Append-Only Writes

Data is rarely updated or deleted. New data points are constantly added.

Time-Ordered

Primary index is timestamp. Queries are almost always time-range based.

High Write Volume

Millions of data points per second arrive from sensors, servers, and applications.

Downsampling

Old data is aggregated as it ages: 1-second → 1-minute → 1-hour resolution.
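The characteristics above can be sketched in a few lines: because writes are append-only and arrive in time order, the timestamp list stays sorted and a time-range query is just a binary search over it. This is a hypothetical illustration, not how any real TSDB is implemented internally.

```python
import bisect

class TimeSeriesStore:
    """Minimal append-only store: points arrive in time order, so the
    timestamp list stays sorted and range queries use binary search."""

    def __init__(self):
        self.timestamps = []  # sorted because writes are append-only
        self.values = []

    def append(self, ts, value):
        # Writes are appends; old points are never updated in place.
        self.timestamps.append(ts)
        self.values.append(value)

    def range_query(self, start, end):
        # The timestamp is the primary index: binary-search both bounds.
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

store = TimeSeriesStore()
for t in range(0, 60, 10):          # one reading every 10 "seconds"
    store.append(t, t * 2)
recent = store.range_query(30, 59)  # a "last 30 seconds" style query
```

Note that the query never scans the whole dataset: both bounds are found in O(log n), which is why time-range queries stay fast even with billions of points.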

3. Why Not Regular Databases?

Aspect          PostgreSQL            Time-Series DB
Write Speed     ~10K/sec              ~1M/sec
Compression     General purpose       Time-aware (10-20x)
Time Queries    Index required        Native, optimized
Retention       Manual deletion       Automatic policies
Downsampling    Manual aggregation    Built-in continuous

4. Popular Time-Series Databases

InfluxDB

Purpose-built TSDB. InfluxQL and Flux query languages. Popular for DevOps.

High write performance, built-in retention policies, Telegraf integration.

TimescaleDB

PostgreSQL extension. Full SQL support. Best of both worlds.

SQL familiarity, hypertables for auto-partitioning, continuous aggregates.

Prometheus

Pull-based metrics collection. The de facto standard for Kubernetes monitoring.

PromQL query language, built-in alerting, service discovery.

ClickHouse

Column-oriented OLAP. Blazing fast aggregations.

SQL support, extreme compression, real-time analytics.
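To make the write path concrete: InfluxDB ingests points as a plain-text "line protocol" of the form `measurement,tags fields timestamp`. The sketch below formats one point without any client library; it is simplified (the real protocol also escapes special characters and marks integer fields with an `i` suffix).

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    """Format one data point as simplified InfluxDB line protocol:
    measurement,tag=val field=val timestamp (nanoseconds)."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(
        f"{k}={v}" if isinstance(v, (int, float)) else f'{k}="{v}"'
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

# One CPU reading from one host, timestamped in nanoseconds:
line = to_line_protocol("cpu", {"host": "server01"},
                        {"usage": 64.2}, 1700000000000000000)
```

Tags (indexed metadata like `host`) are kept separate from fields (the measured values), which is what lets the database index by series without indexing every value.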

5. Common Use Cases

Infrastructure Monitoring

CPU, memory, disk usage from 10,000 servers every 10 seconds

IoT Sensor Data

Temperature, humidity from 1M sensors every second

Financial Data

Stock prices, trades at millisecond precision

Application Metrics

Request latency, error rates, throughput

Log Analytics

Aggregated log counts by service, error type

Real-time Analytics

Active users, page views, events per second

6. Data Retention & Downsampling

Typical Retention Strategy
Age             Resolution        Relative storage
Last 7 days     Raw (1 second)    100%
Last 30 days    1-minute avg      ~1%
Last 1 year     1-hour avg        ~0.01%
Forever         1-day avg         ~0.001%

Downsampling aggregates old data to reduce storage: you lose precision but keep the trends. One year of 1-second data is about 31.5M points; downsampled to hourly averages, it is just 8,760.
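The retention strategy above can be sketched as a bucketed average: each raw point is assigned to a fixed-size time bucket, and the bucket keeps only the mean. A minimal illustration (real TSDBs run this continuously and support aggregates beyond the mean, such as min/max/count):

```python
from collections import defaultdict

def downsample(points, bucket_seconds):
    """Average raw (timestamp, value) points into fixed-size time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each point to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return sorted(
        (bucket_ts, sum(vals) / len(vals))
        for bucket_ts, vals in buckets.items()
    )

# One hour of 1-second readings downsampled to 1-minute averages:
raw = [(t, float(t % 60)) for t in range(3600)]  # 3,600 raw points
minute_avg = downsample(raw, 60)                 # 60 downsampled points
```

Here 3,600 points collapse to 60, a 60x reduction, which mirrors the ~1% storage figure for 1-second-to-1-minute downsampling in the table above.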

7. Key Takeaways

1. Time-series DBs are optimized for timestamped, append-only data
2. 10-100x better write performance than general-purpose DBs
3. Time-aware compression reduces storage 10-20x
4. Downsampling aggregates old data to save space
5. Use cases: metrics, IoT, financial data, logs

Quiz

1. 10,000 servers send CPU metrics every 10 seconds. Which type of database fits best?

2. What is downsampling?