Autoscaling Documentation

Everything you need to understand, choose, and implement Kubernetes autoscaling.

Horizontal Pod Autoscaler (HPA)

Adds or removes pod replicas based on CPU, memory, or custom metrics.
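
To make the replica calculation concrete, here is a minimal Python sketch of the scaling rule described in the Kubernetes documentation for the Horizontal Pod Autoscaler (HPA): desired replicas = ceil(current replicas * current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The function name, parameter names, and the replica bounds used here are illustrative assumptions, not part of any Kubernetes API.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA replica formula (names and bounds are hypothetical)."""
    ratio = current_metric / target_metric
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minimum and maximum replica counts.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas at 90% average CPU with a 60% target scale out to 6.
print(desired_replicas(4, 90, 60))
# 4 replicas at 30% average CPU with a 60% target scale in to 2.
print(desired_replicas(4, 30, 60))
```

The sketch ignores stabilization windows, pod readiness, and missing metrics, all of which the real controller also takes into account.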

Vertical Pod Autoscaler (VPA)

Automatically adjusts CPU and memory requests for containers based on usage patterns.
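
The idea of deriving a request from observed usage, as the Vertical Pod Autoscaler (VPA) does, can be illustrated with a deliberately simplified Python sketch. It takes a high percentile of usage samples and adds a safety margin. This is not the actual VPA recommender algorithm; the function name, the 90th percentile, and the 15% margin are assumptions chosen for illustration only.

```python
import math


def recommend_request(usage_samples, percentile=90, margin_percent=15):
    """Simplified request recommendation (hypothetical, not VPA's real algorithm)."""
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples)
    # Nearest-rank percentile: the smallest sample that covers `percentile`% of the data.
    rank = math.ceil(percentile * len(ordered) / 100)
    value = ordered[max(rank, 1) - 1]
    # Add headroom on top of the observed percentile.
    return math.ceil(value * (100 + margin_percent) / 100)


# CPU usage in millicores; the single 400m spike is above the 90th percentile.
samples = [100, 110, 120, 130, 140, 150, 160, 170, 180, 400]
print(recommend_request(samples))  # 207
```

Using a percentile rather than the maximum keeps one short spike from inflating the request for the whole workload.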

Cluster Autoscaler (CA)

Adds or removes nodes in the cluster when pods can't be scheduled or nodes are underutilized.
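
The two triggers for the Cluster Autoscaler (CA), unschedulable pods and underutilized nodes, can be sketched as a small decision function in Python. The `Node` type, the `plan` function, and the 0.5 utilization threshold are assumptions for illustration. The real Cluster Autoscaler also checks whether a node's pods can be rescheduled elsewhere and waits before removing a node.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    requested_cpu: float    # sum of CPU requests of pods on the node
    allocatable_cpu: float  # CPU the node can offer to pods


def plan(unschedulable_pods: int, nodes: List[Node],
         threshold: float = 0.5) -> Tuple[str, List[str]]:
    """Hypothetical sketch of the scale-up / scale-down decision."""
    # Pending pods that cannot be scheduled take priority: add capacity.
    if unschedulable_pods > 0:
        return ("scale_up", [])
    # Otherwise, nodes whose requested share is below the threshold are removal candidates.
    idle = [n.name for n in nodes if n.requested_cpu / n.allocatable_cpu < threshold]
    if idle:
        return ("scale_down_candidates", idle)
    return ("no_change", [])


nodes = [Node("node-a", 1.0, 4.0), Node("node-b", 3.0, 4.0)]
print(plan(3, nodes))  # ('scale_up', [])
print(plan(0, nodes))  # ('scale_down_candidates', ['node-a'])
```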

KEDA (Kubernetes Event-driven Autoscaling)

Scales workloads in response to external event sources such as queues, streams, and cron schedules. Supports scale-to-zero.
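
For a queue-backed workload, the event-driven behavior of KEDA, including scale-to-zero, can be approximated by the Python sketch below: no replicas when the queue is empty, otherwise enough replicas to keep each one at or below a target backlog. The function and parameter names are hypothetical, and the sketch does not reproduce KEDA's actual scaler logic, polling interval, or cooldown period.

```python
import math


def replicas_for_queue(queue_length, messages_per_replica, max_replicas):
    """Hypothetical queue-length-based replica count with scale-to-zero."""
    # An empty queue means no work, so the workload can be scaled to zero.
    if queue_length <= 0:
        return 0
    # Otherwise size the deployment so each replica handles at most the target backlog.
    return min(max_replicas, math.ceil(queue_length / messages_per_replica))


print(replicas_for_queue(0, 5, 10))     # 0
print(replicas_for_queue(12, 5, 10))    # 3
print(replicas_for_queue(1000, 5, 10))  # 10
```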

Comparison Matrix

Feature	HPA	VPA	CA	KEDA
What it scales	Pod replicas	Pod resource requests	Cluster nodes	Pod replicas (event-driven)
Direction	Horizontal	Vertical	Horizontal (infra)	Horizontal
Scale to zero	No	No	Yes (nodes)	Yes
Built into Kubernetes	Yes	No	No	No
Complexity	Low	Medium	Medium	Medium
Reaction time	15-60s	Minutes	2-10 min	10-30s
Best for	Web APIs	Databases	Infra elasticity	Queue consumers