Protocol Buffers & Serialization
How data is encoded for transmission: JSON vs binary formats.
1. The Packaging Analogy
Serialization converts in-memory objects to bytes for transmission or storage. The format you choose affects speed, size, and compatibility.
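As a concrete illustration of that round trip, here is a minimal Python sketch that uses the standard-library `json` module. The `user` record is a made-up example value, not something prescribed by any particular system.

```python
import json

# A hypothetical in-memory object.
user = {"id": 12345, "name": "John Doe", "active": True}

# Serialize: in-memory dict -> bytes that could go over a socket or into a file.
payload = json.dumps(user).encode("utf-8")

# Deserialize: bytes -> an equivalent in-memory object.
restored = json.loads(payload.decode("utf-8"))

print(type(payload).__name__, len(payload))
print(restored == user)  # True
```

Swapping `json` for a different format changes the size and speed of `payload`, but not the shape of this encode/decode cycle.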
2. Common Formats
JSON
Text • 100% (baseline size)
Pros
- ✓ Human readable
- ✓ Universal support
- ✓ Self-describing
Cons
- ✗ Verbose (field names repeated)
- ✗ Slower parsing
- ✗ No schema enforcement
Protocol Buffers
Binary • 30-50% of JSON size
Pros
- ✓ Compact (2-10x smaller)
- ✓ Fast serialization
- ✓ Strongly typed schema
Cons
- ✗ Not human readable
- ✗ Requires .proto files
- ✗ Schema evolution complexity
MessagePack
Binary • 50-70% of JSON size
Pros
- ✓ JSON-compatible
- ✓ Smaller than JSON
- ✓ No schema needed
Cons
- ✗ Still has field names
- ✗ Less compact than Protobuf
Avro
Binary • 40-60% of JSON size
Pros
- ✓ Schema evolution
- ✓ Compact
- ✓ Good for big data
Cons
- ✗ Schema required at read
- ✗ Less common
3. JSON vs Protobuf Example
JSON (~95 bytes with whitespace; 80 bytes minified)
{
  "id": 12345,
  "name": "John Doe",
  "email": "john@example.com",
  "age": 30,
  "active": true
}
Protobuf (~35 bytes)
// Schema (.proto file)
syntax = "proto3";

message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  int32 age = 4;
  bool active = 5;
}
// Binary: field numbers + values
// No field names transmitted!
Roughly 2-3x smaller because each Protobuf field is prefixed with a 1-byte tag (field number plus wire type, for field numbers 1-15) instead of its name ("email" = 5 bytes, plus quotes and a colon in JSON).
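To check the byte counts above, here is a minimal Python sketch that hand-encodes the `User` message following the Protobuf wire format (varint tags, varint integers, length-prefixed strings). It is an illustration rather than the official library: the helper names are made up for this example, and it assumes non-negative integers only.

```python
import json

def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def tag(field_number, wire_type):
    """Field tag = (field number << 3) | wire type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)

def encode_int(field_number, value):
    # Wire type 0: varint (used here for int32 and bool).
    return tag(field_number, 0) + encode_varint(value)

def encode_string(field_number, text):
    # Wire type 2: length-delimited.
    data = text.encode("utf-8")
    return tag(field_number, 2) + encode_varint(len(data)) + data

def encode_user(user):
    """Hand-rolled encoder for the User schema above (sketch only)."""
    return b"".join([
        encode_int(1, user["id"]),
        encode_string(2, user["name"]),
        encode_string(3, user["email"]),
        encode_int(4, user["age"]),
        encode_int(5, int(user["active"])),
    ])

user = {
    "id": 12345,
    "name": "John Doe",
    "email": "john@example.com",
    "age": 30,
    "active": True,
}

proto_bytes = encode_user(user)
json_bytes = json.dumps(user, separators=(",", ":")).encode("utf-8")

print(len(proto_bytes))  # 35
print(len(json_bytes))   # 80 (minified JSON, no whitespace)
```

Most of the 35 bytes are the two strings themselves (8 + 16 bytes); the per-field overhead is a 1-byte tag plus, for strings, a 1-byte length.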
4. Schema Evolution
Schemas change over time. How do you handle old clients reading new data?
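Part of the answer, for Protobuf, is that every encoded field carries its own field number and wire type, so a reader can step over fields it does not recognize. The sketch below illustrates that idea with a hand-written decoder in Python. It is not the real Protobuf library, it handles only varint and length-delimited fields, and `KNOWN_FIELDS`, `read_varint`, and `decode_known_fields` are hypothetical names introduced for illustration.

```python
def read_varint(buf, pos):
    """Return (value, next_pos) for the varint starting at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

# Field numbers the *old* client knows about (hypothetical v1 schema).
KNOWN_FIELDS = {1: "id", 2: "name"}

def decode_known_fields(buf):
    """Decode fields listed in KNOWN_FIELDS and skip everything else."""
    out = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:    # length-delimited (strings, bytes)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        name = KNOWN_FIELDS.get(field_number)
        if name is not None:
            out[name] = value
        # Unknown field numbers: bytes were consumed, nothing is stored.
    return out

# Message written with a *newer* schema: id=7, name="Ann",
# plus field 6, a string the old client has never heard of.
new_message = (
    bytes([0x08, 0x07])             # field 1, varint 7
    + bytes([0x12, 0x03]) + b"Ann"  # field 2, length 3
    + bytes([0x32, 0x02]) + b"hi"   # field 6, length 2 (unknown to v1)
)

print(decode_known_fields(new_message))  # {'id': 7, 'name': 'Ann'}
```

The reverse direction works in a similar spirit: a newer reader that finds no bytes for a field simply falls back to that field's default value.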
Add Field
Yes - old clients ignore new fields. New clients get default if missing.
Use new field numbers. Never reuse deleted field numbers.
Remove Field
Yes - mark the field number as 'reserved'. Old data still contains it; new code ignores it.
reserved 3; // Don't reuse this number
Rename Field
Yes - the field number is unchanged, and the binary format never carries the name. Renaming does affect generated code and the JSON mapping, which use field names.
Change name in .proto, field number stays same.
Change Type
Dangerous - int32 to string breaks compatibility, because the two types use different wire encodings.
Avoid. Create new field instead.
5. Performance Comparison
| Format | Size (vs JSON) | Encode Speed (vs JSON) | Decode Speed (vs JSON) |
|---|---|---|---|
| JSON | 100% | 1x | 1x |
| Protobuf | 30-50% | 3-10x | 3-10x |
| MessagePack | 50-70% | 2-3x | 2-3x |
| Avro | 40-60% | 2-5x | 2-5x |
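Ratios like these depend heavily on payload shape, language, and library version, so it is worth measuring with your own data. The following is a minimal sketch of such a measurement using only the Python standard library. Note the assumption: `struct` with a fixed layout stands in for a schema-based binary format and is not Protobuf, MessagePack, or Avro; `record`, `LAYOUT`, and `measure` are names invented for this example.

```python
import json
import struct
import timeit

# Hypothetical numeric-only record, so it fits a fixed binary layout.
record = {"id": 12345, "age": 30, "active": True}

# Stand-in for a schema-based binary format (NOT Protobuf):
# little-endian int32 id, int32 age, bool active.
LAYOUT = struct.Struct("<ii?")

def encode_json(r):
    return json.dumps(r, separators=(",", ":")).encode("utf-8")

def encode_binary(r):
    return LAYOUT.pack(r["id"], r["age"], r["active"])

def measure(fn, arg, number=20000):
    """Return the seconds taken to call fn(arg) `number` times."""
    return timeit.timeit(lambda: fn(arg), number=number)

json_bytes = encode_json(record)
bin_bytes = encode_binary(record)

print("size   json:", len(json_bytes), "binary:", len(bin_bytes))
print("encode json:  ", measure(encode_json, record))
print("encode binary:", measure(encode_binary, record))
```

To compare real formats, the same `measure` helper can wrap whichever encoder and decoder functions your chosen libraries provide.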
6. When to Use What
JSON
Public APIs, web apps, debugging, config files
Universal support, human readable, easy debugging
Protobuf
gRPC, internal microservices, high-throughput systems
Smallest size, fastest speed, type safety
MessagePack
Redis caching, when you want smaller JSON
Drop-in JSON replacement, no schema needed
Avro
Kafka, big data pipelines, data lakes
Schema evolution; the writer's schema travels with the data (embedded in Avro container files)
7. Key Takeaways
Quiz
1. High-throughput internal service needs smallest payload. Best choice?
2. You deleted field #5. What should you do?