Autoscaling: HPA, VPA, Cluster Autoscaler
On this page
Autoscaling Components
- HPA: scales replicas based on metrics (CPU/custom).
- VPA: adjusts requests (and sometimes limits) for containers.
- Cluster Autoscaler: adds/removes nodes based on pending pods.
Operational Guardrails
- Set maxReplicas and sane stabilization windows.
- Watch for metric gaps (metrics-server/Prometheus outages).
- Prevent scaling storms via rate limits and step policies.
Inspect HPA
kubectl -n <ns> get hpa kubectl -n <ns> describe hpa <name> | sed -n '1,220p' kubectl -n <ns> get events --sort-by=.lastTimestamp | tail -25
Detect Capacity Issues
kubectl get pods -A --field-selector=status.phase=Pending kubectl -n <ns> describe pod <pod> | grep -n 'Events' -A60 kubectl get nodes
Failure Modes
- HPA stuck: metrics missing or targets misconfigured.
- Scale up → Pending: cluster has no capacity; CA not working.
- Scale down too fast: flapping causes errors and cold caches.