Module 2 - Traffic & Load Management

Load Balancer Algorithms

The algorithm determines HOW incoming requests are distributed across servers. Choosing the right one can significantly impact performance and reliability.


1. The Restaurant Host Analogy

Simple Analogy

Think of a restaurant with 5 tables and a host at the door. How does the host decide which table to seat the next customer?

• Round Robin: Table 1, then 2, then 3... repeat
• Least Connections: Seat at the table with fewest people
• Weighted: Big tables get more people than small ones
• IP Hash: Regulars always get their favorite table

A load balancing algorithm is the decision-making logic that determines which backend server receives each incoming request. The choice affects response times, resource utilization, and system reliability.

2. Round Robin

The simplest algorithm. Requests are distributed sequentially to each server in a loop.

How Round Robin Works

Req 1 → A
Req 2 → B
Req 3 → C
Req 4 → A
Req 5 → B
Req 6 → C

Pseudocode

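servers = ["A", "B", "C"]
server_index = 0  # position of the next server to use

def get_next_server():
    global server_index
    server = servers[server_index]
    # Advance the index and wrap around at the end of the list
    server_index = (server_index + 1) % len(servers)
    return server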
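
In a real balancer, several worker threads usually call the selector at the same time, so the shared index needs protection. The following is a minimal Python sketch under that assumption. The class name `RoundRobinBalancer` and its `next_server` method are illustrative and not taken from any particular library.

```python
import threading


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation; safe to call from multiple threads."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._index = 0
        self._lock = threading.Lock()

    def next_server(self):
        # The lock keeps the read-and-advance step atomic across threads.
        with self._lock:
            server = self._servers[self._index]
            self._index = (self._index + 1) % len(self._servers)
            return server


balancer = RoundRobinBalancer(["A", "B", "C"])
first_six = [balancer.next_server() for _ in range(6)]
print(first_six)  # ['A', 'B', 'C', 'A', 'B', 'C']
```

The only state is one integer, which is why Round Robin is so cheap to run.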

✓ Pros

• Dead simple to implement
• No per-server state needed (just one shared index)
• Fair distribution with equal servers
• Predictable behavior

✗ Cons

• Ignores server capacity differences
• Ignores current server load
• Can overload slow servers
• No session persistence

Best Use Case

Stateless services with identical servers and uniform request processing times. A good fit for read-heavy APIs where each request costs roughly the same.

3. Weighted Round Robin

Like Round Robin, but servers with higher weights receive proportionally more requests.

Servers with Different Capacities

Server A: 32 cores, weight ×4
Server B: 16 cores, weight ×2
Server C: 8 cores, weight ×1

Pattern: A → A → A → A → B → B → C → (repeat)

Pseudocode

servers = [
    {"name": "A", "weight": 4},  # 4x traffic
    {"name": "B", "weight": 2},  # 2x traffic
    {"name": "C", "weight": 1},  # 1x traffic
]

# Expand into [A, A, A, A, B, B, C], then round-robin through the expanded list
expanded = [s["name"] for s in servers for _ in range(s["weight"])]

When to Use

When you have servers with different hardware specs (CPU, RAM), or when some servers should take a smaller share of traffic, for example because they also run other workloads.
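
The expanded-list pattern sends four requests in a row to Server A before any other server gets one. A commonly used variation, often called smooth weighted round robin, keeps the same 4:2:1 ratio but interleaves the picks. The sketch below is one way to write it in Python. The class name `SmoothWeightedBalancer` is hypothetical, and the weights are the ones from the example above.

```python
class SmoothWeightedBalancer:
    """Weighted round robin that spreads each server's turns across the cycle."""

    def __init__(self, weights):
        # weights: mapping of server name -> positive integer weight
        self._weights = dict(weights)
        self._current = {name: 0 for name in self._weights}
        self._total = sum(self._weights.values())

    def next_server(self):
        # Every server gains its own weight on each pick...
        for name, weight in self._weights.items():
            self._current[name] += weight
        # ...the server with the highest running value wins...
        chosen = max(self._current, key=self._current.get)
        # ...and the winner pays back the total, so the others can catch up.
        self._current[chosen] -= self._total
        return chosen


balancer = SmoothWeightedBalancer({"A": 4, "B": 2, "C": 1})
cycle = [balancer.next_server() for _ in range(7)]
print("".join(cycle))  # ABACABA
```

Over one full cycle of seven requests, A is still chosen four times, B twice, and C once. Only the order differs from the expanded list.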

4. Least Connections

Routes each request to the server with the fewest active connections. This works well when request durations vary.

Dynamic Load Awareness

Servers A, B, and C: the new request → Server B (fewest connections)

✓ Pros

• Adapts to actual load
• Great for varying request times
• Handles slow requests well
• More even utilization

✗ Cons

• Requires tracking connections
• More complex than Round Robin
• Choices can fluctuate when many short requests change the counts rapidly
• Slight overhead

Best Use Case

Long-lived connections (WebSockets), file uploads, database connections, or any workload with varying request durations.
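
The selection rule can be sketched in a few lines of Python. This is an illustration only: the class name `LeastConnectionsBalancer` and the `acquire`/`release` methods are assumptions, and a production balancer would also need locking and health checks.

```python
class LeastConnectionsBalancer:
    """Track active connections per server and pick the least busy one."""

    def __init__(self, servers):
        self._active = {name: 0 for name in servers}

    def acquire(self):
        # Linear scan over all servers; ties go to the first server listed.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call this when the request or connection finishes.
        if self._active[server] > 0:
            self._active[server] -= 1


balancer = LeastConnectionsBalancer(["A", "B", "C"])
opened = [balancer.acquire() for _ in range(3)]  # one connection on each server
balancer.release("B")                            # B's request finishes first
next_pick = balancer.acquire()
print(opened, next_pick)  # ['A', 'B', 'C'] B
```

Unlike Round Robin, the next pick depends on which requests have finished, so a server stuck on slow requests naturally receives fewer new ones.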

5. IP Hash (Sticky Sessions)

Uses a hash of the client's IP address to consistently route them to the same server. Enables session persistence.

Consistent Routing

192.168.1.100 → hash(...) % 3 = 0 → Server A

10.0.0.55 → hash(...) % 3 = 2 → Server C

192.168.1.100 → hash(...) % 3 = 0 → Server A (same!)
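
A minimal Python sketch of this mapping follows. It assumes SHA-256 as the hash function, which is an arbitrary choice for illustration, and the function name `pick_server` is hypothetical. Python's built-in `hash()` is avoided here because string hashes are randomized per process by default, so the mapping would not survive a restart. The server each address lands on depends on the hash function, so it will not necessarily match the diagram above.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to a server using a stable hash modulo the server count."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[bucket]


servers = ["A", "B", "C"]
first = pick_server("192.168.1.100", servers)
second = pick_server("192.168.1.100", servers)
print(first == second)  # True: the same IP always maps to the same server
```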

✓ Pros

• Session persistence without cookies
• Stateful app support
• Cache-friendly routing
• Simple to implement

✗ Cons

• Uneven distribution possible
• Adding or removing a server changes the modulus, remapping most clients and breaking their sessions
• IP can change (mobile)
• NAT issues (many users → 1 IP)
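
The cost of changing the server count can be estimated with a short experiment. The sketch below assumes SHA-256 as the hash and uses 1,000 made-up client addresses, then counts how many clients land on a different bucket when the pool grows from 3 to 4 servers. For a uniform hash, `h % 3` and `h % 4` agree only when `h % 12` is 0, 1, or 2, so roughly three quarters of clients should move.

```python
import hashlib


def bucket_for(client_ip, server_count):
    """Return the bucket index for a client IP using hash modulo server_count."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


# Synthetic client addresses, for illustration only.
clients = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]

moved = sum(1 for ip in clients if bucket_for(ip, 3) != bucket_for(ip, 4))
moved_fraction = moved / len(clients)
print(f"{moved_fraction:.0%} of clients changed servers")
```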

Better Alternative

For session persistence, consider using an external session store (such as Redis) instead. This keeps your application servers stateless and easier to scale.

6. Least Response Time

Routes to the server with the best combination of low response time and few active connections, so both speed and current load count toward the decision.

Decision Factors

Server	Connections	Avg Response	Score
A	10	50ms	500
B ←	8	30ms	240
C	5	80ms	400

Score = Connections × Response Time (lower is better)

Best Use Case

Performance-critical applications where response time matters more than even distribution.
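
The scoring rule from the table can be reproduced directly. The Python sketch below uses the same three servers and numbers as the table. The `ServerStats` class and `pick_server` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    connections: int
    avg_response_ms: float

    @property
    def score(self):
        # Score = Connections × Response Time (lower is better)
        return self.connections * self.avg_response_ms


def pick_server(stats):
    """Return the server with the lowest score."""
    return min(stats, key=lambda s: s.score)


stats = [
    ServerStats("A", 10, 50),
    ServerStats("B", 8, 30),
    ServerStats("C", 5, 80),
]
best = pick_server(stats)
print(best.name, best.score)  # B 240
```

One caveat with this exact formula: a server with zero connections always scores 0, whatever its latency. A real balancer might add 1 to the connection count or weight the two factors differently.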

7. Algorithm Comparison

Algorithm	Complexity	State	Best For
Round Robin	O(1)	Index only	Equal servers, uniform requests
Weighted RR	O(1)	Weights + Index	Different server capacities
Least Connections	O(n)	Connection counts	Varying request durations
IP Hash	O(1)	None	Session persistence
Least Response Time	O(n)	Times + Counts	Performance-critical apps

8. Key Takeaways

1. Round Robin is the default choice: simple and effective for most cases.

2. Use Weighted Round Robin when servers have different capacities.

3. Use Least Connections for workloads with varying request times (uploads, WebSockets).

4. Use IP Hash for session stickiness, but prefer external session stores.

5. Use Least Response Time for latency-sensitive applications.

6. Start simple, measure, then optimize. Don't over-engineer.