system design cheatsheet



When estimating the capacity of a system, think about traffic (RPS), storage, bandwidth, cache, and latency. The heuristics below give ballpark numbers for each.

Useful Heuristics

Latency Numbers

Operation Latency Notes
L1 cache reference 0.5 ns ~4 cycles
Branch mispredict 3 ns ~10-15 cycles
L2 cache reference 4 ns ~12 cycles
L3 cache reference 10 ns ~40 cycles
Mutex lock/unlock (uncontended) 17 ns
Main memory reference 100 ns 100× slower than L1
Compress 1 KB (Snappy/Zippy) 10 µs
Read 4 KB randomly from NVMe SSD 5 µs ~200K IOPS
Read 1 MB sequentially from memory 25 µs DDR5 ~50 GB/s
Read 1 MB sequentially from NVMe SSD 150 µs ~7 GB/s (Gen4/5)
Datacenter RTT (same zone) 500 µs ~2,000 RTT/s possible
Read 1 MB from 1 Gbps network 10 ms 1 Gbps = 125 MB/s
HDD seek 10 ms
Read 1 MB sequentially from HDD 30 ms ~30 MB/s
Cross-region datacenter RTT 50–150 ms ~7 RTT/s worldwide
Round trip CA ↔ Netherlands 150 ms Speed of light limit

Key formulas:

Total latency (serial operations):
Total = Latency₁ + Latency₂ + Latency₃ + ...

Total latency (parallel operations):
Total = max(Latency₁, Latency₂, Latency₃, ...)

Network latency estimation:
Min RTT ≈ Distance (km) × 2 ÷ 200,000 km/s (speed of light in fiber)
Example: NYC to London (5,500 km) = 5,500 × 2 ÷ 200,000 = 55 ms minimum

Percentile latencies (rule of thumb):
p50 ≈ median latency
p99 ≈ 2-10× p50 (depends on system)
p99.9 ≈ 2-5× p99

Latency budget allocation:
If target p99 = 200ms, allocate:
- Network: 20-30%
- Application: 30-40%
- Database: 30-40%
- Buffer: 10-20%
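These rules are easy to encode. A minimal Python sketch (function names are my own):

```python
def serial_latency(*latencies_ms):
    """Operations run one after another: latencies add up."""
    return sum(latencies_ms)

def parallel_latency(*latencies_ms):
    """Operations run concurrently: the slowest one dominates."""
    return max(latencies_ms)

def min_rtt_ms(distance_km, fiber_km_per_s=200_000):
    """Lower bound on RTT: light in fiber travels ~200,000 km/s."""
    return distance_km * 2 / fiber_km_per_s * 1000

print(serial_latency(10, 20, 30))    # 60 ms end to end
print(parallel_latency(10, 20, 30))  # 30 ms (slowest branch wins)
print(round(min_rtt_ms(5500)))       # 55 ms minimum, NYC to London
```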

Availability & Downtime

Availability Downtime/Year Downtime/Month Downtime/Week
90% (1 nine) 36.5 days 3 days 16.8 hours
95% 18.25 days 1.5 days 8.4 hours
99% (2 nines) 3.65 days 7.3 hours 1.7 hours
99.9% (3 nines) 8.76 hours 43.8 minutes 10 minutes
99.99% (4 nines) 52.6 minutes 4.4 minutes 1 minute
99.999% (5 nines) 5.26 minutes 26 seconds 6 seconds
99.9999% (6 nines) 31.5 seconds 2.6 seconds 0.6 seconds

Key formulas:

Downtime = (1 - Availability) × Time Period

Downtime/year  = (1 - availability) × 365 × 24 × 60 minutes
Downtime/month = (1 - availability) × 30 × 24 × 60 minutes
Downtime/week  = (1 - availability) × 7 × 24 × 60 minutes

Combined availability (serial):   A_total = A₁ × A₂ × A₃ × ...
Combined availability (parallel): A_total = 1 - (1 - A₁) × (1 - A₂) × ...

Example (serial): Two services at 99.9% each → 0.999 × 0.999 = 99.8%
Example (parallel): Two services at 99% each → 1 - (0.01 × 0.01) = 99.99%
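The availability math, sketched in Python (names are mine):

```python
def downtime_minutes_per_year(availability):
    """Expected downtime over a 365-day year."""
    return (1 - availability) * 365 * 24 * 60

def serial_availability(*parts):
    """Every component must be up: multiply availabilities."""
    total = 1.0
    for a in parts:
        total *= a
    return total

def parallel_availability(*parts):
    """Any redundant copy being up suffices: 1 minus product of failure rates."""
    failure = 1.0
    for a in parts:
        failure *= (1 - a)
    return 1 - failure

print(round(downtime_minutes_per_year(0.999), 1))   # 525.6 minutes, i.e. ~8.76 hours
print(round(serial_availability(0.999, 0.999), 6))  # 0.998001 (chaining lowers availability)
print(round(parallel_availability(0.99, 0.99), 4))  # 0.9999 (redundancy raises availability)
```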

Traffic Estimation (Requests ↔ RPS)

Conversion table:

Requests/Month Requests/Day Requests/Second (RPS)
1 million 33K ~0.4
2.5 million 83K ~1
10 million 333K ~4
100 million 3.3M ~40
1 billion 33M ~400
10 billion 333M ~4,000
100 billion 3.3B ~40,000

Quick conversion:
RPS ≈ requests/day ÷ 86,400 (divide by ~100K in your head)
Example: 1M requests/day ≈ 12 RPS
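The same conversions scripted, matching the table above (constants from the mental-math rules later in this page):

```python
SECONDS_PER_DAY = 86_400        # ~100K for mental math
SECONDS_PER_MONTH = 2_500_000   # ~30 days

def rps_from_daily(requests_per_day):
    return requests_per_day / SECONDS_PER_DAY

def rps_from_monthly(requests_per_month):
    return requests_per_month / SECONDS_PER_MONTH

print(round(rps_from_daily(1_000_000), 1))     # ~11.6 RPS for 1M requests/day
print(round(rps_from_monthly(1_000_000_000)))  # ~400 RPS for 1B requests/month
```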


Network & Bandwidth

Context Typical Bandwidth
Within datacenter 10–100 Gbps
Between datacenters 10 Mbps – 1 Gbps
Home internet 100 Mbps – 1 Gbps

Bits vs Bytes conversion:
Network speeds are in bits/s, storage throughput is in bytes/s.
Divide by 8: 1 Gbps = 125 MB/s

Network Speed Data Throughput
100 Mbps 12.5 MB/s
1 Gbps 125 MB/s
10 Gbps 1.25 GB/s
100 Gbps 12.5 GB/s

Key formulas:

Throughput (bytes/s) = Network speed (bits/s) ÷ 8

Transfer time = Data size ÷ Throughput
Example: 1 GB over 1 Gbps = 1024 MB ÷ 125 MB/s ≈ 8 seconds

Bandwidth needed = Data size × Requests per second
Example: 1 MB responses × 1000 RPS = 1 GB/s = 8 Gbps

With compression (typical 3× reduction):
Effective bandwidth = Raw bandwidth × Compression ratio
Example: 1 Gbps × 3 = 3 Gbps effective for compressible data
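A sketch of the bandwidth formulas in Python (function names are my own):

```python
def throughput_mb_per_s(network_gbps):
    """Gbps to MB/s: divide bits by 8 (1 Gbps = 125 MB/s)."""
    return network_gbps * 1000 / 8

def transfer_time_s(data_mb, network_gbps):
    return data_mb / throughput_mb_per_s(network_gbps)

def egress_gbps(response_mb, rps):
    """Bandwidth needed to serve `rps` responses of `response_mb` each."""
    return response_mb * rps * 8 / 1000

print(transfer_time_s(1024, 1))  # 8.192 s for 1 GB over 1 Gbps
print(egress_gbps(1, 1000))      # 8.0 Gbps for 1 MB responses at 1000 RPS
```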

Quick Mental Math Rules

Rule Value
Seconds in a day 86,400 ≈ 100K
Seconds in a month ~2.5 million
Seconds in a year ~31.5 million ≈ π × 10⁷
1 million req/day → RPS ~12
1 billion req/month → RPS ~400

Key formulas:

Storage estimation:
Total storage = Object size × Number of objects × Replication factor × Time period

Example: 1 million users × 10 KB profile × 3 replicas = 30 GB

Storage growth/year = Daily new data × 365
Example: 100 GB/day × 365 = 36.5 TB/year

Bandwidth estimation:
Peak bandwidth = Average bandwidth × Peak factor (typically 2-10×)
Daily bandwidth = Average request size × Daily requests

QPS/RPS estimation:
Average QPS = Total requests ÷ Time period in seconds
Peak QPS = Average QPS × Peak factor (typically 2-3×)

Cache size estimation:
Cache size = Working set × Object size
Working set = Total objects × Access frequency factor (typically 20%)

Example: 1M users, 20% daily active, 10 KB each = 2 GB cache

Number of servers:
Servers needed = Peak QPS ÷ QPS per server
Add 30% headroom for failover

Example: 10,000 QPS ÷ 1,000 QPS/server × 1.3 = 13 servers
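These estimators combine into one back-of-envelope script (names and default factors are mine, taken from the rules above):

```python
import math

def storage_gb(objects, object_kb, replicas=3):
    return objects * object_kb * replicas / 1_000_000

def cache_gb(objects, active_fraction, object_kb):
    """Working-set cache: only the actively accessed share needs caching."""
    return objects * active_fraction * object_kb / 1_000_000

def servers_needed(peak_qps, qps_per_server, headroom=1.3):
    """Round up and keep ~30% spare capacity for failover."""
    return math.ceil(peak_qps / qps_per_server * headroom)

print(storage_gb(1_000_000, 10))                # 30.0 GB: 1M users x 10 KB x 3 replicas
print(round(cache_gb(1_000_000, 0.20, 10), 1))  # 2.0 GB (20% daily active)
print(servers_needed(10_000, 1_000))            # 13 servers
```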

Time Units

Unit In Seconds
1 nanosecond (ns) 10⁻⁹ s
1 microsecond (µs) 10⁻⁶ s
1 millisecond (ms) 10⁻³ s

Powers of Two

Power Value Approx. Binary (IEC) Decimal (SI)
2¹⁰ 1,024 ~1 thousand 1 KiB 1 KB
2²⁰ 1,048,576 ~1 million 1 MiB 1 MB
2³⁰ 1,073,741,824 ~1 billion 1 GiB 1 GB
2⁴⁰ 1,099,511,627,776 ~1 trillion 1 TiB 1 TB
2⁵⁰ ~1 quadrillion 1 PiB 1 PB

Binary (IEC) vs Decimal (SI) Units

Binary (IEC) Exact Value Decimal (SI) Exact Value Difference
1 KiB (kibibyte) 1,024 bytes 1 KB (kilobyte) 1,000 bytes 2.4%
1 MiB (mebibyte) 1,048,576 bytes 1 MB (megabyte) 1,000,000 bytes 4.9%
1 GiB (gibibyte) 1,073,741,824 bytes 1 GB (gigabyte) 1,000,000,000 bytes 7.4%
1 TiB (tebibyte) 1,099,511,627,776 bytes 1 TB (terabyte) 1,000,000,000,000 bytes 10.0%
1 PiB (pebibyte) 1,125,899,906,842,624 bytes 1 PB (petabyte) 1,000,000,000,000,000 bytes 12.6%

Usage conventions:
- RAM is measured in binary units (a "16 GB" DIMM is really 16 GiB)
- Disk vendors and network speeds use decimal (SI) units
- OSes differ: Windows reports binary sizes (labeled "GB"), macOS reports decimal

Key formulas:

1 KiB = 2¹⁰ bytes = 1,024 bytes
1 MiB = 2²⁰ bytes = 1,024 KiB ≈ 1.049 MB
1 GiB = 2³⁰ bytes = 1,024 MiB ≈ 1.074 GB
1 TiB = 2⁴⁰ bytes = 1,024 GiB ≈ 1.100 TB
1 PiB = 2⁵⁰ bytes = 1,024 TiB ≈ 1.126 PB

Quick estimate: 2¹⁰ ≈ 10³ (off by 2.4%)
Therefore: 2ⁿ ≈ 10^(n × 0.3)

Converting binary to decimal:
GiB to GB: multiply by 1.074
TiB to TB: multiply by 1.100

Example: 500 GiB SSD = 500 × 1.074 = 537 GB
Example: "1 TB" HDD = 1000 GB = 931 GiB (what OS shows)
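The conversions scripted, using exact powers rather than the rounded factors:

```python
def gib_to_gb(gib):
    return gib * 2**30 / 10**9

def gb_to_gib(gb):
    return gb * 10**9 / 2**30

print(round(gib_to_gb(500), 1))   # 536.9 GB (a 500 GiB SSD)
print(round(gb_to_gib(1000), 1))  # 931.3 GiB (what the OS shows for a "1 TB" HDD)
```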

Database & Cache Latencies

Operation Latency Notes
Redis GET (local) 0.1–0.5 ms In-memory, same datacenter
Redis GET (cross-AZ) 1–2 ms Network overhead
Memcached GET 0.1–0.5 ms Similar to Redis
PostgreSQL simple query 1–5 ms Indexed, warm cache
PostgreSQL complex query 10–100+ ms Joins, aggregations
MySQL simple query 1–5 ms Indexed, warm cache
MongoDB document fetch 1–5 ms By _id, indexed
Elasticsearch query 10–50 ms Depends on index size
SQLite (local file) 0.01–0.1 ms No network overhead

Key formulas:

Cache hit ratio:
Hit ratio = Cache hits ÷ (Cache hits + Cache misses)
Target: > 90% for effective caching

Average latency with cache:
Avg latency = (Hit ratio × Cache latency) + ((1 - Hit ratio) × DB latency)

Example: 95% hit ratio, 1ms cache, 10ms DB
= (0.95 × 1) + (0.05 × 10) = 0.95 + 0.5 = 1.45 ms

Cache capacity planning:
Memory needed = Number of cached items × Average item size × Overhead (1.2-1.5×)

Database connections:
Max connections = (Number of servers × Connections per server)
Rule of thumb: PostgreSQL handles ~100-500 connections well
Use connection pooling (PgBouncer) for more

IOPS estimation:
Required DB IOPS ≈ QPS × Queries per request × Cache miss ratio
(cache hits never touch disk; count write IOPS separately)
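The cache-latency formula in executable form (names and the 1.3× overhead default are mine):

```python
def avg_latency_ms(hit_ratio, cache_ms, db_ms):
    """Expected read latency with a cache in front of the database."""
    return hit_ratio * cache_ms + (1 - hit_ratio) * db_ms

def cache_memory_gb(items, avg_item_kb, overhead=1.3):
    """Memory budget: item payloads plus per-key structure overhead."""
    return items * avg_item_kb * overhead / 1_000_000

print(round(avg_latency_ms(0.95, 1, 10), 2))     # 1.45 ms
print(round(avg_latency_ms(0.99, 1, 10), 2))     # 1.09 ms (higher hit ratio pays off)
print(round(cache_memory_gb(1_000_000, 10), 1))  # 13.0 GB for 1M 10 KB items
```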

Cloud Storage Latencies

Operation Latency Notes
S3 / GCS first byte (same region) 20–100 ms Varies by object size
S3 / GCS GET (small object) 50–200 ms < 1 MB
S3 / GCS GET (large object) Throughput-limited ~100 MB/s per connection
S3 / GCS PUT 100–300 ms Single-part upload
S3 / GCS LIST (1000 objects) 100–300 ms Pagination adds latency
CloudFront / CDN edge hit 5–20 ms Depends on user location
CDN cache miss +50–150 ms Origin fetch added

Cloud storage throughput:
- ~100 MB/s per connection; parallelize (range GETs, multipart) for more
- S3 scales per prefix: ~3,500 writes/s and ~5,500 reads/s per prefix

Key formulas:

S3 cost estimation:
Storage cost = GB stored × $0.023/GB/month (standard tier)
Request cost = (PUT requests × $0.005/1000) + (GET requests × $0.0004/1000)
Transfer cost = GB transferred out × $0.09/GB (to internet)

CDN cost estimation:
CDN cost = GB transferred × $0.085/GB (first 10TB)
Savings = Origin requests avoided × Origin cost per request

Multipart upload calculation:
Parts needed = File size ÷ Part size (min 5MB, max 5GB)
Upload time = File size ÷ (Parallel connections × Per-connection throughput)

Example: 10 GB file, 10 parallel connections, 100 MB/s each
= 10,000 MB ÷ (10 × 100 MB/s) = 10 seconds
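Using the ballpark prices above (they drift; check current pricing), a rough S3 bill estimator:

```python
def s3_monthly_cost_usd(gb_stored, puts, gets, gb_egress):
    """Rough monthly S3 cost, standard tier, rates from the table above."""
    storage = gb_stored * 0.023
    requests = puts / 1000 * 0.005 + gets / 1000 * 0.0004
    egress = gb_egress * 0.09
    return storage + requests + egress

# 1 TB stored, 10M PUTs, 100M GETs, 500 GB egress to the internet
print(round(s3_monthly_cost_usd(1000, 10_000_000, 100_000_000, 500), 2))  # ~158.0
```

Note how egress (45 USD for just 500 GB) grows fastest as traffic scales, which is why the egress rule of thumb below matters.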

Typical Payload Sizes

Data Type Typical Size Notes
Tweet / short text 250–500 bytes With metadata
JSON overhead per field 10–20 bytes Key names, quotes, delimiters
Protobuf overhead 2–5 bytes/field ~3–5× smaller than JSON
Average web page 2–3 MB With images, JS, CSS
HTML document only 50–100 KB Without assets
Small profile image 10–50 KB Compressed JPEG/WebP
High-res photo 2–5 MB JPEG
1 min video (720p) 50–100 MB Compressed
1 min video (1080p) 100–200 MB Compressed
1 min audio (MP3) 1–2 MB 128 kbps
Average email 50–100 KB With headers, no attachments
Log line 200–500 bytes Typical structured log
Kafka message (typical) 1–10 KB Varies by use case

Key formulas:

Image size estimation:
Uncompressed = Width × Height × Bytes per pixel (3 for RGB, 4 for RGBA)
Example: 1920×1080×3 = 6.2 MB uncompressed
Compressed JPEG ≈ Uncompressed ÷ 10 = ~600 KB

Video size estimation:
Bitrate (Mbps) × Duration (seconds) ÷ 8 = Size in MB
Example: 5 Mbps × 60 sec ÷ 8 = 37.5 MB

Audio size estimation:
Bitrate (kbps) × Duration (seconds) ÷ 8 = Size in KB
Example: 128 kbps × 60 sec ÷ 8 = 960 KB ≈ 1 MB

Log storage estimation:
Daily logs = Log lines per second × 86,400 × Avg log size
Example: 1000 logs/sec × 86,400 × 300 bytes = 25.9 GB/day

JSON vs Protobuf:
Protobuf size ≈ JSON size × 0.3
Protobuf parse time ≈ JSON parse time × 0.1

Serialization Format Comparison

Format Size Encode Speed Decode Speed Human Readable
JSON Baseline Medium Medium ✅ Yes
Protobuf 0.3–0.5× Fast Fast ❌ No
MessagePack 0.5–0.7× Fast Fast ❌ No
Avro 0.3–0.5× Fast Fast ❌ No
XML 1.5–2× Slow Slow ✅ Yes
CSV 0.5–0.8× Fast Medium ✅ Yes

Common System Capacities

Component Typical Limit Notes
Single Redis instance 100K–1M ops/s Depends on operation type
Single PostgreSQL 10K–50K queries/s Simple queries, good hardware
Single MySQL 10K–50K queries/s Similar to PostgreSQL
Single Kafka broker 100K–500K msgs/s Depends on message size
Single Nginx 10K–100K req/s Static content
Single Node.js 10K–30K req/s I/O bound workloads
Outbound TCP conns (per destination) ~65K per source IP Ephemeral port limit
Open file descriptors 1M+ (tunable) ulimit -n

Key formulas:

Connection limits:
Max outbound connections to one destination ≈ 65,535 per source IP (ephemeral port range)
With multiple source IPs: multiply by the number of IPs
Inbound connections are bounded by memory and file descriptors, not ports

Thread pool sizing (CPU-bound):
Optimal threads = Number of CPU cores

Thread pool sizing (I/O-bound):
Optimal threads = Number of cores × (1 + Wait time / Service time)
Example: 8 cores, 100ms wait, 10ms service = 8 × (1 + 10) = 88 threads

Little's Law (queue sizing):
L = λ × W
L = Average number in system
λ = Arrival rate (requests/second)
W = Average time in system

Example: 1000 RPS, 100ms latency → 1000 × 0.1 = 100 concurrent requests

Amdahl's Law (parallelization speedup):
Speedup = 1 / ((1 - P) + P/N)
P = Parallelizable fraction
N = Number of processors

Example: 90% parallelizable, 10 cores → 1 / (0.1 + 0.9/10) = 5.26× speedup
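Little's Law and Amdahl's Law in executable form:

```python
def littles_law(arrival_rate_rps, avg_time_s):
    """L = lambda x W: average number of requests in the system."""
    return arrival_rate_rps * avg_time_s

def amdahl_speedup(parallel_fraction, processors):
    return 1 / ((1 - parallel_fraction) + parallel_fraction / processors)

def io_bound_threads(cores, wait_ms, service_ms):
    return cores * (1 + wait_ms / service_ms)

print(round(littles_law(1000, 0.1)))      # 100 concurrent requests
print(round(amdahl_speedup(0.9, 10), 2))  # 5.26x (the serial 10% caps the gain)
print(io_bound_threads(8, 100, 10))       # 88.0 threads
```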

Message Queue Comparison

System Throughput Latency Ordering Notes
Kafka 500K–1M msgs/s 2–10 ms Per-partition Best for high throughput, replay
RabbitMQ 20K–50K msgs/s 1–5 ms Per-queue Best for complex routing
Amazon SQS Nearly unlimited 10–50 ms Best-effort Fully managed, at-least-once delivery
Amazon SQS FIFO 300 msgs/s (3K batched) 10–50 ms Strict FIFO Use for ordering guarantees
Redis Streams 100K+ msgs/s < 1 ms Per-stream Good for simple pub/sub
Google Pub/Sub 100K+ msgs/s 10–50 ms Per-topic Fully managed, global

When to use what:
- Kafka: high throughput, replay, event sourcing
- RabbitMQ: complex routing, work queues
- SQS: fully managed, minimal ops; the FIFO variant when ordering matters
- Redis Streams: sub-millisecond latency, simple pub/sub
- Google Pub/Sub: fully managed, global fan-out


Container & Kubernetes Overhead

Operation Latency/Overhead Notes
Container startup (Docker) 500 ms – 2 s Depends on image size
Container startup (from cache) 100–500 ms Image layers cached
Pod scheduling (K8s) 1–5 s Includes image pull if not cached
Pod scheduling (cached image) 500 ms – 2 s No image pull
K8s Service DNS resolution 1–10 ms CoreDNS lookup
K8s ConfigMap/Secret mount 100–500 ms At pod startup
Sidecar proxy overhead (Istio) 1–3 ms Per request
Container memory overhead 10–50 MB Runtime + base OS
K8s per-pod overhead ~100 MB kubelet, pause container

Cold start times (serverless):

Platform Cold Start Notes
AWS Lambda (Python/Node) 100–500 ms Lightweight runtimes
AWS Lambda (Java/.NET) 500 ms – 3 s JVM/CLR startup
AWS Lambda (with VPC) +200–500 ms ENI attachment
Google Cloud Functions 100–500 ms Similar to Lambda
Cloudflare Workers < 5 ms V8 isolates, no container

DNS & TLS Overhead

Operation Latency Notes
DNS lookup (cached) 0–1 ms OS or browser cache
DNS lookup (recursive) 20–100 ms Full resolution
DNS lookup (authoritative) 10–50 ms Direct to nameserver
DNS TTL (typical) 60–300 s Balance freshness vs. latency
TLS handshake (TLS 1.2) 2 RTT ~100–300 ms cross-region
TLS handshake (TLS 1.3) 1 RTT ~50–150 ms cross-region
TLS resumption 0–1 RTT Session tickets
mTLS overhead +1–5 ms Certificate verification
Certificate verification 1–10 ms OCSP/CRL checks can add more

Optimization tips:
- Cache DNS aggressively; use longer TTLs for stable endpoints
- Prefer TLS 1.3 (1 RTT) and enable session resumption (0–1 RTT)
- Reuse connections (HTTP keep-alive, pooling) to amortize handshakes
- Use OCSP stapling to avoid extra certificate-revocation lookups


Cost Ballpark (Cloud, 2025)

Resource Approximate Cost Notes
Compute (on-demand) $0.04–0.10/hr per vCPU AWS/GCP/Azure
Compute (spot/preemptible) $0.01–0.03/hr per vCPU 60–80% savings
Memory $0.005/GB/hr Included with compute
S3/GCS storage $0.02–0.03/GB/month Standard tier
S3/GCS requests $0.0004/1K GETs PUTs cost more
Data transfer (egress) $0.05–0.12/GB Inter-region/internet
Data transfer (same region) Free–$0.01/GB Same AZ often free
Managed Kafka $0.10–0.20/hr per broker MSK, Confluent
Managed Redis $0.02–0.05/GB/hr ElastiCache
Managed PostgreSQL $0.02–0.10/hr per vCPU RDS
Lambda invocation $0.20/1M requests + duration cost
API Gateway $1–3.50/1M requests Depends on tier

Rule of thumb: Egress costs often dominate — design to minimize cross-region and internet data transfer.


Common Data Type Sizes

Primitive Types (C/Java/Go)

Type Size Range/Notes
bool 1 byte true/false
char 1 byte -128 to 127
short 2 bytes -32K to 32K
int 4 bytes -2.1B to 2.1B
long 8 bytes -9.2 quintillion to 9.2 quintillion
float 4 bytes ~7 decimal digits precision
double 8 bytes ~15 decimal digits precision
Pointer (64-bit) 8 bytes Memory address

PostgreSQL Numeric Types

Type Size Range/Notes
smallint 2 bytes -32,768 to 32,767
integer 4 bytes -2.1B to 2.1B
bigint 8 bytes -9.2 quintillion to 9.2 quintillion
serial 4 bytes Auto-increment (1 to 2.1B)
bigserial 8 bytes Auto-increment (1 to 9.2 quintillion)
decimal/numeric Variable Up to 131,072 digits; exact precision
real 4 bytes 6 decimal digits precision
double precision 8 bytes 15 decimal digits precision
money 8 bytes -92 quadrillion to 92 quadrillion

PostgreSQL String Types

Type Size Notes
char(n) n bytes Fixed-length, space-padded
varchar(n) Up to n + 1–4 bytes Variable-length with limit
text Variable Unlimited length
bytea Variable Binary data (byte array)

String storage: 1–4 bytes overhead + actual string length. Strings > 2KB are automatically compressed (TOAST).

PostgreSQL Date/Time Types

Type Size Range/Notes
timestamp 8 bytes 4713 BC to 294276 AD; microsecond precision
timestamptz 8 bytes Same + timezone aware
date 4 bytes 4713 BC to 5874897 AD
time 8 bytes 00:00:00 to 24:00:00; microsecond precision
timetz 12 bytes Time with timezone
interval 16 bytes Time span

PostgreSQL Other Common Types

Type Size Notes
boolean 1 byte true/false/null
uuid 16 bytes 128-bit universally unique identifier
json Variable Stored as text, parsed on access
jsonb Variable Binary JSON, faster queries, ~10–20% larger
inet 7 or 19 bytes IPv4 or IPv6 address
cidr 7 or 19 bytes IPv4 or IPv6 network
macaddr 6 bytes MAC address
point 16 bytes 2D point (x, y)
box 32 bytes Rectangular box
array 24 bytes + data Array header + element storage
hstore Variable Key-value pairs
tsvector Variable Full-text search document
tsquery Variable Full-text search query

Common Identifiers

Type Size Notes
UNIX timestamp 4 bytes Seconds since 1970 (until 2038)
UNIX timestamp (64-bit) 8 bytes Milliseconds, no 2038 problem
UUID 16 bytes 128-bit, globally unique
ULID 16 bytes UUID + timestamp, sortable
Snowflake ID 8 bytes Twitter-style, sortable
MongoDB ObjectId 12 bytes Timestamp + machine + process + counter
Auto-increment int 4 bytes ~2.1B max IDs
Auto-increment bigint 8 bytes ~9.2 quintillion max IDs

Key formulas:

UUID collision probability (Birthday Problem):
P(collision) ≈ 1 - e^(-n² / (2 × 2¹²²)) ≈ n² ÷ 2¹²³
For 1 billion UUIDs: P ≈ 10⁻¹⁹ (effectively zero)

ID space exhaustion:
Time to exhaust = Max IDs ÷ IDs per second
32-bit at 1000/s = 2.1B ÷ 1000 = 24 days
64-bit at 1M/s = 9.2 quintillion ÷ 1M = 292,000 years

Snowflake ID structure (64-bit):
- 1 bit: sign (always 0)
- 41 bits: timestamp (69 years from epoch)
- 10 bits: machine ID (1024 machines)
- 12 bits: sequence (4096 IDs/ms/machine)
Max: 4096 × 1000 × 1024 = 4.2 billion IDs/second globally
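The layout above is just bit shifts and masks. A sketch (epoch 0 for simplicity; real generators subtract a custom epoch):

```python
def make_snowflake(timestamp_ms, machine_id, sequence, epoch_ms=0):
    """Pack 41-bit timestamp | 10-bit machine | 12-bit sequence into 64 bits."""
    assert 0 <= machine_id < 1024 and 0 <= sequence < 4096
    return ((timestamp_ms - epoch_ms) << 22) | (machine_id << 12) | sequence

def parse_snowflake(snowflake, epoch_ms=0):
    return (
        (snowflake >> 22) + epoch_ms,  # timestamp (ms)
        (snowflake >> 12) & 0x3FF,     # machine ID
        snowflake & 0xFFF,             # sequence
    )

sid = make_snowflake(1_700_000_000_000, machine_id=7, sequence=42)
print(parse_snowflake(sid))  # (1700000000000, 7, 42)
```

Because the timestamp occupies the high bits, IDs sort by creation time.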

PostgreSQL Row Overhead

Component Size Notes
Tuple header 23 bytes Per row
NULL bitmap 1 byte per 8 columns Tracks NULL values
Alignment padding 0–7 bytes 8-byte alignment
TOAST pointer 18 bytes For large values stored externally

Key formulas:

Row size estimation:
Row size = 23 bytes (header) + NULL bitmap + Column data + Padding

Table size estimation:
Table size = Number of rows × Average row size × Bloat factor (1.2-1.5)

Index size estimation:
B-tree index ≈ Number of rows × (Key size + 8 bytes) × 2-3× overhead
Example: 10M rows, 8-byte key = 10M × 16 × 2.5 = 400 MB

TOAST threshold: Values > 2KB are compressed/stored externally

Pages and blocks:
PostgreSQL page size = 8 KB
Rows per page ≈ 8192 ÷ Row size (minus page header ~24 bytes)
Example: 100-byte rows → ~80 rows per page

Fill factor (default 90% for tables):
Usable space per page = 8192 × 0.9 = 7373 bytes
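The row-size rules in Python (a coarse approximation; real per-column alignment is more subtle than this):

```python
import math

PAGE_BYTES = 8192
TUPLE_HEADER = 23

def row_bytes(data_bytes, n_columns):
    """Header + NULL bitmap + column data, rounded up to 8-byte alignment."""
    null_bitmap = math.ceil(n_columns / 8)
    raw = TUPLE_HEADER + null_bitmap + data_bytes
    return math.ceil(raw / 8) * 8

def table_gb(rows, avg_row_bytes, bloat=1.3):
    return rows * avg_row_bytes * bloat / 1_000_000_000

print(row_bytes(70, 6))                     # 96 bytes per row
print(PAGE_BYTES // row_bytes(70, 6))       # ~85 rows per page (before fill factor)
print(round(table_gb(10_000_000, 100), 2))  # 1.3 GB for 10M 100-byte rows
```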