Cluster Autoscaler
Add or remove cluster nodes based on pending pods and utilization.
The Cluster Autoscaler adjusts the number of nodes in your cluster. It adds nodes when pods can't be scheduled (pending) and removes underutilized nodes whose pods can be rescheduled elsewhere.
How It Works
The Cluster Autoscaler scans the cluster every 10 seconds by default. For scale-up, it detects pods that are Pending because no existing node can fit them, evaluates which node group can satisfy their requirements, and calls the cloud provider API to provision new VMs. For scale-down, it looks for nodes whose requested CPU and memory stay below 50% of allocatable capacity for 10 minutes or more, then safely drains and removes them.
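The scale-down rule above can be sketched in a few lines of shell. This is a simplified illustration, not the autoscaler's own code: it models utilization as requested CPU divided by allocatable CPU (in millicores), ignores memory, and skips the check that evicted pods fit elsewhere. The function names and node figures are hypothetical.

```shell
# Simplified sketch of the scale-down check; not the autoscaler's actual logic.
# Assumption: utilization = requested / allocatable CPU, in millicores.
THRESHOLD_PERCENT=50   # mirrors scale-down-utilization-threshold=0.5
UNNEEDED_MINUTES=10    # mirrors scale-down-unneeded-time=10m

# utilization_percent REQUESTED_MILLICORES ALLOCATABLE_MILLICORES
utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

# is_scale_down_candidate REQUESTED ALLOCATABLE MINUTES_BELOW_THRESHOLD
# Prints "yes" when the node is under the threshold and has stayed there long enough.
is_scale_down_candidate() {
  util=$(utilization_percent "$1" "$2")
  if [ "$util" -lt "$THRESHOLD_PERCENT" ] && [ "$3" -ge "$UNNEEDED_MINUTES" ]; then
    echo yes
  else
    echo no
  fi
}

# Hypothetical node with 7910m allocatable CPU.
is_scale_down_candidate 2500 7910 12   # prints "yes" (31% utilization, 12 min)
is_scale_down_candidate 5000 7910 12   # prints "no"  (63% utilization)
is_scale_down_candidate 2500 7910 4    # prints "no"  (under threshold for only 4 min)
```

Both conditions must hold at once, which is why a briefly idle node is not removed.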
It uses an expander strategy to choose which node group to scale: least-waste (recommended), most-pods, random, or priority.
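To give a feel for what least-waste optimizes, here is a rough sketch that picks the node group leaving the smallest share of newly added CPU idle. It is an approximation under stated assumptions: the real expander simulates scheduling pod by pod and also considers memory, whereas this version only compares a total pending CPU request against per-node capacity. The group names and capacities are hypothetical.

```shell
# Rough sketch of the idea behind least-waste; not the autoscaler's implementation.
# pick_least_waste PENDING_MILLICORES "name:cpu_millicores_per_node" ...
pick_least_waste() {
  pending=$1
  shift
  best_name=""
  best_waste=101
  for group in "$@"; do
    name=${group%%:*}
    cap=${group##*:}
    nodes=$(( (pending + cap - 1) / cap ))        # nodes needed (ceiling division)
    total=$(( nodes * cap ))                      # CPU added by those nodes
    waste=$(( (total - pending) * 100 / total ))  # percent of added CPU left idle
    if [ "$waste" -lt "$best_waste" ]; then
      best_name=$name
      best_waste=$waste
    fi
  done
  echo "$best_name"
}

# 7800m of pending CPU requests across three hypothetical node groups.
pick_least_waste 7800 small:1930 medium:3920 large:7910   # prints "medium"
```

Here two medium nodes add 7840m for 7800m of demand, so almost nothing is left idle. Five small nodes or one large node would leave more unused.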
When to Use
- Pod autoscalers (HPA/KEDA) create more pods than nodes can handle
- You want automatic infrastructure elasticity
- Significant load variance (daily cycles, seasonal spikes)
- Cost optimization by removing idle nodes
When NOT to Use
- Bare metal clusters (no dynamic node provisioning)
- Latency-sensitive workloads that cannot wait 2-5 minutes for a new node to be provisioned
- Fixed-capacity requirements or compliance constraints
Real-World Example
Black Friday Traffic Spike
An e-commerce platform runs on 20 nodes normally. On Black Friday, HPA scales the frontend from 40 to 600 pods. The Cluster Autoscaler detects hundreds of Pending pods, provisions 65 additional c5.2xlarge EC2 instances over 12 minutes (20 → 85 nodes). By Monday, traffic normalizes and nodes are gradually drained back to 20.
Step-by-Step Implementation (AWS EKS)
1. Configure node group
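# The subnet IDs and node role ARN below are placeholders; replace them with your own values.
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name standard-workers \
  --instance-types c5.2xlarge \
  --subnets subnet-EXAMPLE1 subnet-EXAMPLE2 \
  --node-role arn:aws:iam::111122223333:role/eksNodeRole \
  --scaling-config minSize=2,maxSize=100,desiredSize=5
2. Deploy via Helm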
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1 \
  --set extraArgs.expander=least-waste \
  --set extraArgs.scale-down-unneeded-time=10m \
  --set extraArgs.scale-down-utilization-threshold=0.5
3. Verify
kubectl logs -n kube-system -l app.kubernetes.io/instance=cluster-autoscaler --tail=50
kubectl get nodes --watch
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Node provisioning too slow | Pods Pending for 3-5+ min | Use Karpenter for faster provisioning, or maintain warm spare nodes |
| Nodes not scaling down | Idle nodes remain | Check for pods with local storage or restrictive PDBs |
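| IAM / permissions errors | AccessDenied in CA logs | Ensure the autoscaler's service account maps to an IAM role that allows autoscaling:Describe*, autoscaling:SetDesiredCapacity, autoscaling:TerminateInstanceInAutoScalingGroup, and ec2:Describe* |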
| PDB blocking scale-down | CA refuses to evict pods | Ensure each PDB allows at least one disruption (for example, maxUnavailable: 1) |
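For the PDB pitfall in the table above, a budget that permits one eviction at a time might look like the sketch below. The name frontend-pdb and the app: frontend label are hypothetical; the selector must match your own workload's pod labels. The script only writes the manifest to a local file, and the kubectl commands are left commented out because they need cluster access.

```shell
# Write a hypothetical PodDisruptionBudget manifest to the current directory.
cat > frontend-pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb        # hypothetical name
spec:
  maxUnavailable: 1         # allow one voluntary eviction at a time
  selector:
    matchLabels:
      app: frontend         # hypothetical label; must match the workload's pods
EOF

# With cluster access, apply it and check that ALLOWED DISRUPTIONS is at least 1:
# kubectl apply -f frontend-pdb.yaml
# kubectl get pdb frontend-pdb
```

A budget whose ALLOWED DISRUPTIONS stays at 0 (for example, minAvailable equal to the replica count) prevents the autoscaler from draining any node that runs those pods.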