INFRA-DEVOPS Contents

Autoscaling: HPA, VPA, Cluster Autoscaler

Operate HPA/VPA/Cluster Autoscaler with clear guardrails to prevent scaling storms and capacity surprises.

On this page

Autoscaling Components

  • HPA: scales replicas based on metrics (CPU/custom).
  • VPA: adjusts requests (and sometimes limits) for containers.
  • Cluster Autoscaler: adds/removes nodes based on pending pods.

Operational Guardrails

  • Set maxReplicas and sane stabilization windows.
  • Watch for metric gaps (metrics-server/Prometheus outages).
  • Prevent scaling storms via rate limits and step policies.

Inspect HPA

kubectl -n <ns> get hpa
kubectl -n <ns> describe hpa <name> | sed -n '1,220p'
kubectl -n <ns> get events --sort-by=.lastTimestamp | tail -25

Detect Capacity Issues

kubectl get pods -A --field-selector=status.phase=Pending
kubectl -n <ns> describe pod <pod> | grep -n 'Events' -A60
kubectl get nodes

Failure Modes

  • HPA stuck: metrics missing or targets misconfigured.
  • Scale up → Pending: cluster has no capacity; CA not working.
  • Scale down too fast: flapping causes errors and cold caches.