Module 8 - Networking and APIs

Pagination Strategies

Return large datasets in manageable chunks without killing your server.

1The Library Analogy

Simple Analogy
A library has millions of books. You do not dump them all on the floor. You browse by shelf (offset), use a bookmark (cursor), or search by section (keyset). Pagination is the same: return data in pages, not all at once.

Pagination splits large result sets into smaller pages. Reduces memory, network, and processing overhead for both client and server.

2Offset Pagination

Example
GET /users?offset=20&limit=10

SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 20
Pros
  • Simple to understand
  • Jump to any page
  • Easy to implement
  • Total count possible
Cons
  • Slow for large offsets (scans rows)
  • Inconsistent: inserts/deletes shift pages
  • O(offset) performance
Performance Problem

OFFSET 1000000 scans 1M rows to skip them. Gets slower as offset increases.

3Cursor Pagination

Example
GET /users?cursor=eyJpZCI6MTIzfQ&limit=10

cursor = base64 of id: 123
SELECT * FROM users WHERE id > 123 ORDER BY id LIMIT 10
Pros
  • Consistent: inserts do not affect position
  • O(1) performance (uses index)
  • Good for infinite scroll
  • Works with real-time data
Cons
  • Cannot jump to page N
  • Must traverse sequentially
  • More complex implementation

Cursor is an opaque token encoding the last item position. Client does not need to understand it, just pass it back.

4Keyset Pagination

Example
GET /users?after_id=123&limit=10

Explicit keyset (not encoded)
SELECT * FROM users WHERE id > 123 ORDER BY id LIMIT 10

Like cursor, but the key is visible and explicit. Works great when sorting by indexed column.

Multi-column Keyset

Sorting by created_at + id: WHERE (created_at, id) > (2024-01-15, 123). Needs composite index.

5Comparison

AspectOffsetCursorKeyset
PerformanceO(offset)O(1)O(1)
Jump to PageYesNoNo
ConsistentNoYesYes
ComplexitySimpleMediumMedium
Best ForSmall datasets, adminInfinite scrollAPI, feeds

6Real-World Examples

GitHub

Cursor (Link header)
Link header with rel next

Stripe

Cursor
starting_after=ch_123 and limit=10

Slack

Cursor
cursor=dXNlcjpVMDYxTk...

Twitter

Cursor (since_id/max_id)
since_id=123 and max_id=456

7Key Takeaways

1Offset: Simple but slow for large datasets. Use for admin/small tables.
2Cursor: Opaque token. Best for infinite scroll and real-time feeds.
3Keyset: Explicit last key. O(1) with proper index.
4Offset has O(offset) cost. Cursor/keyset have O(1).
5Most production APIs (Stripe, Slack, Twitter) use cursor pagination.

?Quiz

1. Feed with 10M posts, infinite scroll. Best pagination?

2. Why is OFFSET 1000000 slow?