Module 7 — Observability

Distributed Tracing

Follow a request across multiple services. Essential for debugging microservices.

1. The Package Tracking Analogy

💡 Simple Analogy
When you ship a package, you get a tracking number. At each facility, they scan it:
  • 9:00 AM - Picked up in NYC
  • 2:00 PM - Sorting facility NJ
  • 8:00 PM - Distribution center PA
  • 10:00 AM next day - Delivered
Distributed tracing does this for requests. One trace ID follows the request across all services.

2. Key Concepts

Trace
End-to-end journey of a request. Contains multiple spans.
Span
A single unit of work, such as one service call. Each span records its start and end time, tags, and logs.
Trace ID
Unique ID for the entire request. Propagated across all services.
Span ID
Unique ID for each span. Each span also records its parent's span ID, so the spans of a trace form a tree.
Context Propagation
Passing the trace ID (and the current span ID) between services, usually in HTTP headers such as a custom X-Trace-Id or the W3C standard traceparent header.
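
The concepts above can be sketched as a small data structure. This is a minimal illustration, not any tracing library's real model: the Span class, its field names, the new_id and build_tree helpers, and the 5ms start offset are all assumptions made for the example.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One unit of work inside a trace (hypothetical structure)."""
    trace_id: str              # shared by every span of the same request
    span_id: str               # unique to this span
    parent_id: Optional[str]   # None marks the root span
    name: str
    start_ms: float
    end_ms: float
    tags: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def new_id(num_bytes: int = 8) -> str:
    """Return a random hex ID (illustrative; real tracers define their own formats)."""
    return secrets.token_hex(num_bytes)


def build_tree(spans: list[Span]) -> dict[Optional[str], list[Span]]:
    """Group spans by parent ID; the key None holds the root span."""
    children: dict[Optional[str], list[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)
    return children


# One trace ID, two spans: the child points at its parent's span ID.
trace_id = new_id(16)
root = Span(trace_id, new_id(), None, "API Gateway", 0, 150)
child = Span(trace_id, new_id(), root.span_id, "User Service", 5, 55)
tree = build_tree([root, child])
```

Looking up tree[root.span_id] returns the spans that ran under the API Gateway span, which is the parent-child tree a tracing UI draws.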

3. Visual: Trace Waterfall

Trace ID: abc123
API Gateway (150ms)
└ User Service (50ms)
└ Order Service (80ms)
└ Database (30ms)
└ Payment Service (40ms)
Time axis: 0ms to 150ms

The waterfall view shows which service is slow and where the time is spent.
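
As a rough sketch of how such a view is built, the snippet below draws one text bar per span, scaled by duration. It uses only the durations from the example trace. A real waterfall also positions each bar at the span's start time, which this example omits because no start offsets are given. The render_waterfall function is a hypothetical helper.

```python
def render_waterfall(spans, width=30):
    """Return one text line per span, with a bar scaled to the longest duration."""
    longest = max(duration for _, duration in spans)
    lines = []
    for name, duration in spans:
        bar = "#" * max(1, round(width * duration / longest))
        lines.append(f"{name:<16} {bar} {duration}ms")
    return lines


# (name, duration in ms) pairs taken from the example trace abc123.
spans = [
    ("API Gateway", 150),
    ("User Service", 50),
    ("Order Service", 80),
    ("Database", 30),
    ("Payment Service", 40),
]

lines = render_waterfall(spans)
print("\n".join(lines))

# The slowest span below the gateway is the first place to look.
slowest_child = max(spans[1:], key=lambda item: item[1])
```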

4. Popular Tools

Tool | Type | Features
Jaeger | Open Source | CNCF, Kubernetes native, Uber origin
Zipkin | Open Source | Twitter origin, simple setup
Datadog APM | SaaS | Full platform, auto-instrumentation
AWS X-Ray | AWS | AWS integrated, serverless support

5. Implementation

Context Propagation (HTTP Headers)

Request headers: X-Trace-Id: abc123, X-Span-Id: span456
Each service reads these headers, creates a child span under the incoming span ID, and passes the trace ID and its own span ID to downstream services.
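
The read, create, and pass-on cycle can be sketched in a few lines. This is a minimal illustration using the X-Trace-Id and X-Span-Id headers from the example above. The start_span and outgoing_headers helpers and the dictionary layout are assumptions, not a library API.

```python
import secrets

TRACE_HEADER = "X-Trace-Id"
SPAN_HEADER = "X-Span-Id"


def start_span(incoming_headers):
    """Read the trace context from incoming headers and create a child span.

    Note: HTTP header names are case-insensitive, but a plain dict lookup is
    not. Web frameworks usually supply a case-insensitive headers mapping.
    """
    # Reuse the caller's trace ID, or start a new trace at the edge.
    trace_id = incoming_headers.get(TRACE_HEADER) or secrets.token_hex(16)
    # The caller's span becomes this span's parent (None for a root span).
    parent_span_id = incoming_headers.get(SPAN_HEADER)
    return {
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),
        "parent_span_id": parent_span_id,
    }


def outgoing_headers(span):
    """Build the headers to attach to calls made to downstream services."""
    return {TRACE_HEADER: span["trace_id"], SPAN_HEADER: span["span_id"]}


incoming = {"X-Trace-Id": "abc123", "X-Span-Id": "span456"}
span = start_span(incoming)
downstream = outgoing_headers(span)
```

The trace ID stays abc123 on every hop, while X-Span-Id changes at each service so that the next service can record the right parent.
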

OpenTelemetry

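OpenTelemetry is becoming the standard way to instrument services for traces, metrics, and logs. It is vendor-neutral, so the same instrumentation can export to different backends, and it is widely supported by tracing tools and vendors.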
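
By default, OpenTelemetry propagates context with the W3C Trace Context traceparent header instead of custom headers such as X-Trace-Id. The sketch below parses and produces that header by hand to show what travels on the wire. It does not use the OpenTelemetry SDK, it handles only version 00 of the format, and parse_traceparent and child_traceparent are hypothetical helper names.

```python
import re
import secrets

# Version 00 layout: version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(value):
    """Parse a version-00 traceparent header; return None if it is invalid."""
    match = _TRACEPARENT.match(value.strip())
    if not match:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero trace or parent IDs are invalid in W3C Trace Context.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),  # lowest flag bit means "sampled"
    }


def child_traceparent(ctx):
    """Build the header for a downstream call: same trace ID, new span ID."""
    flags = "01" if ctx["sampled"] else "00"
    return f"00-{ctx['trace_id']}-{secrets.token_hex(8)}-{flags}"


ctx = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
next_header = child_traceparent(ctx)
```

In practice an OpenTelemetry SDK or auto-instrumentation agent reads and writes this header, so application code rarely needs to handle it directly.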

6. Key Takeaways

1. Distributed tracing tracks requests across services.
2. The trace ID follows the entire request. Each span ID identifies one unit of work, such as a single service call.
3. Context propagation: pass the trace ID and the current span ID in HTTP headers.
4. Essential for debugging latency and errors in microservices.
5. Jaeger and Zipkin (open source) and Datadog (SaaS) are popular tools.
6. OpenTelemetry is the vendor-neutral standard to adopt.