Cluster Autoscaler

Add or remove cluster nodes based on pending pods and utilization.

The Cluster Autoscaler adjusts the number of nodes in your cluster. It adds nodes when pods can't be scheduled (pending) and removes underutilized nodes whose pods can be rescheduled elsewhere.

How It Works

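The Cluster Autoscaler runs its control loop every 10 seconds by default (the --scan-interval flag). For scale-up, it looks for Pending pods that the scheduler has marked unschedulable, evaluates which node group could fit them, and calls the cloud provider API to provision new VMs in that group. For scale-down, it looks for nodes whose requested CPU and memory (the sum of pod requests, not live usage) have stayed below 50% of allocatable capacity for 10+ minutes, then safely drains and removes them if their pods can be rescheduled elsewhere.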
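The scale-down check can be sketched in a few lines of bash. This is a simplified illustration, not the autoscaler's actual code: the function names are hypothetical, utilization is taken as requested resources divided by allocatable, and the example figures (8000 millicores and 16384 MiB, roughly the raw size of a c5.2xlarge) are assumptions.

```bash
#!/usr/bin/env bash
# Simplified sketch of the scale-down utilization check (hypothetical helpers).

# Percentage of allocatable capacity that pods have requested, rounded down.
utilization_pct() {
  local requested=$1 allocatable=$2
  echo $(( requested * 100 / allocatable ))
}

# Succeeds (exit 0) when both CPU and memory requests are below the threshold.
# Args: cpu_requested_m cpu_allocatable_m mem_requested_mib mem_allocatable_mib [threshold_pct]
is_scale_down_candidate() {
  local threshold=${5:-50}
  local cpu_pct mem_pct
  cpu_pct=$(utilization_pct "$1" "$2")
  mem_pct=$(utilization_pct "$3" "$4")
  (( cpu_pct < threshold && mem_pct < threshold ))
}

# Example: 3000m of 8000m CPU and 4096 of 16384 MiB requested.
if is_scale_down_candidate 3000 8000 4096 16384; then
  echo "candidate for scale-down"
else
  echo "kept"
fi
```

A node that passes this check is only removed after it has stayed below the threshold for the full unneeded time (10 minutes here) and its pods can move to other nodes.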

When more than one node group could fit the pending pods, an expander strategy decides which group to scale: least-waste (recommended), most-pods, random, or priority.
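To make the least-waste idea concrete, here is a minimal bash sketch that picks the node group leaving the least idle CPU after scale-up. It is a deliberate simplification under stated assumptions: it considers CPU only, breaks ties by taking the first group listed, and the function name and the "name:millicores" input format are hypothetical rather than anything the autoscaler exposes.

```bash
#!/usr/bin/env bash
# Simplified least-waste selection (CPU only, hypothetical input format).
# Usage: pick_least_waste PENDING_MILLICORES "group:node_millicores" ...
pick_least_waste() {
  local pending=$1; shift
  local best="" best_waste=-1 entry name cap nodes waste
  for entry in "$@"; do
    name=${entry%%:*}                        # text before the first colon
    cap=${entry##*:}                         # text after the last colon
    nodes=$(( (pending + cap - 1) / cap ))   # ceiling division: nodes needed
    waste=$(( nodes * cap - pending ))       # idle CPU left after scale-up
    if (( best_waste < 0 || waste < best_waste )); then
      best=$name
      best_waste=$waste
    fi
  done
  echo "$best"
}

# Example: 6000m of pending CPU requests, two candidate groups.
pick_least_waste 6000 small:4000 large:16000
```

In the example call, the small group needs two nodes and leaves 2000m idle, while the large group needs one node and leaves 10000m idle, so the small group is chosen.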

When to Use

Pod autoscalers (HPA/KEDA) can create more pods than the existing nodes can hold
You want automatic infrastructure elasticity
Significant load variance (daily cycles, seasonal spikes)
Cost optimization by removing idle nodes

When NOT to Use

Bare metal clusters (no dynamic node provisioning)
Latency-sensitive workloads that cannot wait 2-5 minutes for a new node to provision
Fixed-capacity requirements or compliance constraints

Real-World Example

Black Friday Traffic Spike

An e-commerce platform runs on 20 nodes normally. On Black Friday, HPA scales the frontend from 40 to 600 pods. The Cluster Autoscaler detects hundreds of Pending pods, provisions 65 additional c5.2xlarge EC2 instances over 12 minutes (20 → 85 nodes). By Monday, traffic normalizes and nodes are gradually drained back to 20.

Step-by-Step Implementation (AWS EKS)

1. Configure node group

bash

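# NODE_ROLE_ARN and SUBNET_IDS are placeholders: set them to your node IAM role ARN
# and a space-separated list of subnet IDs before running.
# SUBNET_IDS is left unquoted on purpose so each ID becomes its own argument.
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name standard-workers \
  --instance-types c5.2xlarge \
  --node-role "$NODE_ROLE_ARN" \
  --subnets $SUBNET_IDS \
  --scaling-config minSize=2,maxSize=100,desiredSize=5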
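Node group creation takes a few minutes, so it can help to wait for it before installing the autoscaler. The helper below is a small sketch (the function name is hypothetical) that blocks until the node group is ACTIVE and then prints its scaling limits, using the AWS CLI's eks wait and describe-nodegroup subcommands.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wait until the node group is ACTIVE, then show its min/max/desired sizes.
wait_for_nodegroup() {
  local cluster=$1 nodegroup=$2
  aws eks wait nodegroup-active \
    --cluster-name "$cluster" --nodegroup-name "$nodegroup"
  aws eks describe-nodegroup \
    --cluster-name "$cluster" --nodegroup-name "$nodegroup" \
    --query 'nodegroup.scalingConfig' --output table
}

# Usage: wait_for_nodegroup my-cluster standard-workers
```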

2. Deploy via Helm

bash

helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1 \
  --set extraArgs.expander=least-waste \
  --set extraArgs.scale-down-unneeded-time=10m \
  --set extraArgs.scale-down-utilization-threshold=0.5
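If you would rather keep these settings in version control than on the command line, the same values can be written to a file. This sketch mirrors the --set flags above one for one; the file name ca-values.yaml is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Write a Helm values file equivalent to the --set flags in the install command.
cat > ca-values.yaml <<'EOF'
autoDiscovery:
  clusterName: my-cluster
awsRegion: us-east-1
extraArgs:
  expander: least-waste
  scale-down-unneeded-time: 10m
  scale-down-utilization-threshold: 0.5
EOF
```

The file can then be passed to Helm with -f ca-values.yaml in place of the individual --set flags.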

3. Verify

bash

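# The Helm chart labels its pods with the release name (cluster-autoscaler here).
kubectl logs -n kube-system -l app.kubernetes.io/instance=cluster-autoscaler --tail=50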
kubectl get nodes --watch
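Beyond logs, a few read-only checks show whether the autoscaler is reacting to pending pods. The helper names below are hypothetical, and the sketch assumes the autoscaler runs in kube-system with its default status ConfigMap name (cluster-autoscaler-status) and records TriggeredScaleUp events on the pods it scales for.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for checking autoscaler activity.

# List pods that are still waiting for a node.
pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# Show the status ConfigMap the autoscaler maintains (default name assumed).
autoscaler_status() {
  kubectl -n kube-system describe configmap cluster-autoscaler-status
}

# Show scale-up events recorded on pods.
scale_up_events() {
  kubectl get events --all-namespaces --field-selector reason=TriggeredScaleUp
}

# Usage: pending_pods; autoscaler_status; scale_up_events
```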

Common Pitfalls

Pitfall	Symptom	Fix
Node provisioning too slow	Pods Pending for 3-5+ min	Use Karpenter for faster provisioning; or maintain warm spare nodes
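Nodes not scaling down	Idle nodes remain	Check for pods that block eviction: local storage, no owning controller, restrictive PDBs, or the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation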
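IAM / permissions errors	AccessDenied in CA logs	Ensure the IAM role bound to the autoscaler's service account allows the Auto Scaling actions it needs (such as autoscaling:DescribeAutoScalingGroups, autoscaling:SetDesiredCapacity, autoscaling:TerminateInstanceInAutoScalingGroup) plus the ec2:Describe* lookups; a blanket autoscaling:* works but grants more than necessary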
PDB blocking scale-down	CA refuses to evict pods	Ensure each PDB permits at least one disruption (for example maxUnavailable: 1, or minAvailable below the replica count)
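For the IAM pitfall, a narrower policy than autoscaling:* might look like the sketch below. The action list is an assumption based on what the autoscaler commonly needs on AWS, so check it against the documentation for the version you deploy; the file name is arbitrary, and the wildcard resources could be tightened further.

```bash
#!/usr/bin/env bash
# Write a least-privilege style IAM policy document for the autoscaler (sketch; verify the actions).
cat > ca-iam-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "autoscaling:DescribeTags",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplateVersions",
        "eks:DescribeNodegroup"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```

The first statement covers read-only discovery; the second holds the two actions that actually change capacity, which makes it the natural place to restrict resources to the cluster's own Auto Scaling groups.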