Cluster Autoscaler

Add or remove cluster nodes based on pending pods and utilization.

The Cluster Autoscaler adjusts the number of nodes in your cluster. It adds nodes when pods can't be scheduled (pending) and removes underutilized nodes whose pods can be rescheduled elsewhere.

How It Works

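The Cluster Autoscaler runs its control loop every 10 seconds by default (--scan-interval). For scale-up: it detects Pending pods that the scheduler has marked unschedulable, evaluates which node group can satisfy their requirements, and raises that group's desired size through the cloud provider API, which provisions new VMs. For scale-down: it looks for nodes whose requested CPU and memory (the sum of pod requests, not live usage) have stayed below 50% of allocatable capacity for 10+ minutes, confirms their pods can be rescheduled elsewhere, and then safely drains and removes them. Both the 50% threshold and the 10-minute window are defaults and can be tuned.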
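The scale-down test described above can be sketched as a small shell function. This is an illustration only, not the autoscaler's actual code: the function name and its inputs are hypothetical, and it looks at CPU requests alone, whereas the real autoscaler also weighs memory and whether each pod can move.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether a node qualifies for scale-down.
# Args: requested CPU (millicores), allocatable CPU (millicores),
#       minutes the node has spent under the threshold.
THRESHOLD_PERCENT=50   # mirrors scale-down-utilization-threshold=0.5
UNNEEDED_MINUTES=10    # mirrors scale-down-unneeded-time=10m

is_scale_down_candidate() {
  local requested_m=$1 allocatable_m=$2 minutes_below=$3
  # Integer math for: requested / allocatable < threshold / 100
  if [ $((requested_m * 100)) -lt $((allocatable_m * THRESHOLD_PERCENT)) ] &&
     [ "$minutes_below" -ge "$UNNEEDED_MINUTES" ]; then
    echo "candidate"
  else
    echo "keep"
  fi
}

is_scale_down_candidate 3000 8000 12   # 37.5% requested for 12 min
is_scale_down_candidate 3000 8000 4    # under threshold, but not for long enough
is_scale_down_candidate 5000 8000 30   # 62.5% requested
```

Only the first call prints "candidate"; the other two print "keep" because one of the two conditions fails.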

When more than one node group could fit the pending pods, it uses an expander strategy to choose which one to scale: least-waste (recommended), most-pods, random (the default), or priority.
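The priority expander reads its ordering from a ConfigMap named cluster-autoscaler-priority-expander in the autoscaler's namespace. A minimal sketch follows, assuming a hypothetical "spot-workers" node group alongside the "standard-workers" group used in the EKS steps; the regular expressions and priority numbers are illustrative, and higher numbers are preferred.

```bash
#!/usr/bin/env bash
# Write a priority-expander ConfigMap to a local file for review.
# "spot-workers" is a hypothetical node group name; adjust both patterns
# to match your own node group (Auto Scaling group) names.
cat > priority-expander.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    50:
      - .*spot-workers.*
    10:
      - .*standard-workers.*
EOF
echo "wrote priority-expander.yaml"
```

After reviewing the file, it could be applied with kubectl apply -f priority-expander.yaml. It only takes effect when the autoscaler runs with the expander set to priority.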

When to Use

  • Pod autoscalers (HPA/KEDA) create more pods than the existing nodes can fit
  • You want automatic infrastructure elasticity
  • Significant load variance (daily cycles, seasonal spikes)
  • Cost optimization by removing idle nodes

When NOT to Use

  • Bare metal clusters (no dynamic node provisioning)
  • Latency-sensitive workloads that cannot wait the 2-5 min a new node takes to provision and become Ready
  • Fixed-capacity requirements or compliance constraints

Real-World Example

Black Friday Traffic Spike

An e-commerce platform runs on 20 nodes normally. On Black Friday, HPA scales the frontend from 40 to 600 pods. The Cluster Autoscaler detects hundreds of Pending pods, provisions 65 additional c5.2xlarge EC2 instances over 12 minutes (20 → 85 nodes). By Monday, traffic normalizes and nodes are gradually drained back to 20.
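A rough capacity estimate for a spike like this can be sketched in shell. The per-pod CPU request (1000m) and the allocatable CPU per c5.2xlarge (about 7910m of its 8 vCPUs) are assumptions for illustration, not figures from the example, and the function name is hypothetical. The result is a ballpark for the frontend pods alone, so it is not expected to reproduce the 85-node figure exactly.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate nodes needed for a number of identical pods.
# Args: pod count, CPU request per pod (millicores),
#       allocatable CPU per node (millicores).
nodes_needed() {
  local pods=$1 pod_request_m=$2 node_allocatable_m=$3
  local per_node=$((node_allocatable_m / pod_request_m))   # pods that fit on one node
  if [ "$per_node" -lt 1 ]; then
    echo "pod does not fit on this node type" >&2
    return 1
  fi
  echo $(( (pods + per_node - 1) / per_node ))             # ceiling division
}

# Assumed values: 600 pods, 1000m request each, 7910m allocatable per node.
nodes_needed 600 1000 7910   # prints 86
```

With these assumed numbers, 7 pods fit per node, so 600 pods need 86 nodes. Real sizing also depends on memory requests, DaemonSets, and the other workloads sharing the cluster.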

Step-by-Step Implementation (AWS EKS)

1. Configure node group

bash
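# --subnets and --node-role are required by create-nodegroup.
# SUBNET_A, SUBNET_B, and NODE_ROLE_ARN are placeholders; set them for your account.
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name standard-workers \
  --subnets "$SUBNET_A" "$SUBNET_B" \
  --node-role "$NODE_ROLE_ARN" \
  --instance-types c5.2xlarge \
  --scaling-config minSize=2,maxSize=100,desiredSize=5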

2. Deploy via Helm

bash
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1 \
  --set extraArgs.expander=least-waste \
  --set extraArgs.scale-down-unneeded-time=10m \
  --set extraArgs.scale-down-utilization-threshold=0.5
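The same settings can be kept in a values file, which is also a convenient place to attach an IAM role to the autoscaler's service account. The sketch below assumes the cluster uses IAM Roles for Service Accounts (IRSA); the role ARN is a placeholder, and the file name ca-values.yaml is arbitrary.

```bash
#!/usr/bin/env bash
# Placeholder ARN: replace with a role that carries the autoscaler's IAM policy.
CA_ROLE_ARN="${CA_ROLE_ARN:-arn:aws:iam::111122223333:role/cluster-autoscaler}"

# Write Helm values equivalent to the --set flags above, plus the IRSA annotation.
cat > ca-values.yaml <<EOF
autoDiscovery:
  clusterName: my-cluster
awsRegion: us-east-1
extraArgs:
  expander: least-waste
  scale-down-unneeded-time: 10m
  scale-down-utilization-threshold: 0.5
rbac:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: ${CA_ROLE_ARN}
EOF
echo "wrote ca-values.yaml"
```

The file could then be passed to Helm with helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler --namespace kube-system -f ca-values.yaml in place of the individual --set flags.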

3. Verify

bash
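# The Helm chart labels its pods with app.kubernetes.io/instance=<release name>
kubectl logs -n kube-system -l app.kubernetes.io/instance=cluster-autoscaler --tail=50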
kubectl get nodes --watch
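Beyond logs, a few standard kubectl queries can help confirm the autoscaler is making decisions. The function names below are hypothetical wrappers, and the sketch assumes the autoscaler runs in kube-system with its status ConfigMap enabled, which is the default.

```bash
#!/usr/bin/env bash
# Hypothetical helper functions wrapping standard kubectl queries.

# Print the autoscaler's own status report (health and activity per node group).
ca_status() {
  kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
}

# List pods that are still Pending in any namespace.
pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# Show scale-up decisions, which the autoscaler records as events on pods.
scale_up_events() {
  kubectl get events --all-namespaces --field-selector=reason=TriggeredScaleUp
}
```

Calling ca_status after a deployment, or pending_pods and scale_up_events during a load test, shows whether pending pods are being noticed and acted on.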

Common Pitfalls

Pitfall | Symptom | Fix
Node provisioning too slow | Pods Pending for 3-5+ min | Use Karpenter for faster provisioning; or maintain warm spare nodes
Nodes not scaling down | Idle nodes remain | Check for pods with local storage or restrictive PDBs
IAM / permissions errors | AccessDenied in CA logs | Ensure service account has autoscaling:* and ec2:Describe* permissions
PDB blocking scale-down | CA refuses to evict pods | Ensure PDBs allow at least 1 pod eviction
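For the PDB pitfall, a budget that always permits one eviction lets the autoscaler drain a node one pod at a time. A minimal sketch, assuming a hypothetical workload labeled app: frontend; the PDB name and file name are illustrative.

```bash
#!/usr/bin/env bash
# Write a PodDisruptionBudget that allows one pod to be evicted at a time.
# The "app: frontend" selector and the name "frontend-pdb" are hypothetical.
cat > frontend-pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: frontend
EOF
echo "wrote frontend-pdb.yaml"
```

After review, the file could be applied with kubectl apply -f frontend-pdb.yaml. For pods that block scale-down because of local storage, the pod template annotation cluster-autoscaler.kubernetes.io/safe-to-evict: "true" tells the autoscaler that eviction is acceptable; use it only where losing that local data is safe.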