Capacity Estimation Basics
Why Rough Math Beats Guessing
Capacity estimation is not about perfect forecasting. It is about avoiding blind design. Even basic back-of-the-envelope estimates help you choose sane defaults for caching, databases, queues, and scaling strategy.
Start With Workload Shape
- Average RPS and peak RPS
- Burst duration (seconds vs minutes)
- Read/write ratio
- Payload sizes (request and response)
- Hot keys (popular users/items)
Translate Traffic Into Resource Demand
Each request consumes resources: CPU, memory, DB queries, and network. If you know “work per request,” you can estimate bottlenecks.
Example:
- Peak traffic: 5,000 RPS
- Per request: 1 DB query + 1 cache query
- Peak DB QPS ≈ 5,000
- Peak cache QPS ≈ 5,000
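The fan-out arithmetic in the example above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function name `downstream_qps` and the dependency labels are hypothetical, and it assumes every request makes the same number of calls to each dependency.

```python
def downstream_qps(peak_rps, calls_per_request):
    """Multiply peak request rate by per-request call counts.

    calls_per_request maps a dependency name (hypothetical labels such as
    "db" or "cache") to the number of calls one request makes to it.
    """
    return {dep: peak_rps * calls for dep, calls in calls_per_request.items()}


# Numbers from the example: 5,000 RPS, 1 DB query + 1 cache query per request.
demand = downstream_qps(5000, {"db": 1, "cache": 1})
print(demand)  # {'db': 5000, 'cache': 5000}
```

If a request makes several queries, or only some requests miss the cache, the per-request counts can be fractional averages (for example 0.2 DB queries per request at an assumed 80% cache hit rate).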
Database Connection Reality
Databases are often limited by concurrent connections. If each request holds a DB connection for 50ms, a single connection can serve at most about 20 requests per second (1 / 0.05s), no matter how fast the rest of the system is.
If avg DB time = 50ms:
- Ops per connection ≈ 1 / 0.05 = 20 ops/sec
- To support 5,000 DB ops/sec: required concurrent connections ≈ 5,000 / 20 = 250
- Then add headroom (e.g., 30%): 250 × 1.3 ≈ 325
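The same connection math can be written as a small helper so the inputs are easy to vary. This is a rough sketch under the assumptions used above: every operation holds one connection for the average DB time, and headroom is a flat multiplier. The name `required_connections` and its parameters are illustrative only.

```python
import math


def required_connections(peak_ops_per_sec, avg_db_time_ms, headroom=0.30):
    """Estimate concurrent DB connections needed at peak.

    Each connection completes about 1000 / avg_db_time_ms operations per
    second, so the base requirement is peak ops divided by that rate.
    """
    ops_per_connection = 1000 / avg_db_time_ms
    base = peak_ops_per_sec / ops_per_connection
    # Round before ceil so float noise cannot push an exact result up by one.
    return math.ceil(round(base * (1 + headroom), 6))


print(required_connections(5000, 50, headroom=0.0))  # 250
print(required_connections(5000, 50))                # 325
```

Because the estimate scales linearly with DB time, doubling the average query time to 100ms doubles the connection count. That is often a stronger argument for query tuning than for a bigger pool.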
Storage Growth
Estimate data growth early to avoid painful migrations. You need to know: daily ingest, retention, and indexing overhead.
If uploads = 50GB/day and retention = 365 days:
- Raw storage ≈ 50GB × 365 = 18.25TB/year (before replication and overhead)
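A short sketch of the storage estimate, extended with the replication and indexing factors that the raw figure leaves out. The function name, the parameter names, and the sample replication factor and index overhead are assumptions chosen for illustration; substitute your own measurements. It uses decimal units (1TB = 1,000GB).

```python
def retained_storage_tb(daily_ingest_gb, retention_days,
                        replication_factor=1, index_overhead=0.0):
    """Steady-state storage in TB once the retention window is full.

    index_overhead is a fraction of raw data size (0.2 means indexes add 20%).
    """
    raw_gb = daily_ingest_gb * retention_days
    total_gb = raw_gb * replication_factor * (1 + index_overhead)
    return total_gb / 1000


# Raw figure from the example: 50GB/day retained for 365 days.
print(retained_storage_tb(50, 365))  # 18.25

# Hypothetical: 3 replicas and 20% index overhead on the same ingest.
print(round(retained_storage_tb(50, 365, replication_factor=3,
                                index_overhead=0.2), 2))
```

Under those assumed factors the provisioned footprint is several times the raw number, which is why the raw estimate should be treated as a floor.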
Headroom and Safety Margins
Production systems need headroom. At 90–100% saturation there is no spare capacity to absorb bursts, so queues build up and tail latency spikes. Target sustainable utilization (often 50–70% for critical bottlenecks) and scale before you hit the cliff.
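One way to turn a utilization target into a provisioning number is to divide peak demand by the target. The sketch below assumes capacity and demand are measured in the same unit (for example ops/sec); `provisioned_capacity` is a hypothetical helper name.

```python
def provisioned_capacity(peak_demand, target_utilization):
    """Capacity needed so that peak demand lands at the target utilization."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return peak_demand / target_utilization


# 5,000 ops/sec at peak, sized for the 50% and 70% ends of the range.
for target in (0.5, 0.7):
    print(target, round(provisioned_capacity(5000, target)))
```

Note that adding 30% headroom on top of demand is a gentler rule than a 50–70% target: it corresponds to running at roughly 77% utilization at peak (1 / 1.3).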
Production-First Takeaway
Capacity estimation converts requirements into constraints: QPS, concurrency, storage, and bandwidth. You do not need exact numbers, just enough to choose the right bottleneck strategy and avoid accidental over-engineering or under-provisioning.