HPA metrics evolution and API versions
Horizontal Pod Autoscaler (HPA) metrics capabilities and API definitions have evolved significantly across Kubernetes versions. The API version used dictates which resource metrics are available for autoscaling triggers, with support for memory metrics being a key differentiator in later versions.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
API Version Evolution
The Kubernetes HPA API has updated rapidly, resulting in distinct versions that offer different feature sets.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
- `autoscaling/v1`: The original stable version. It scales on CPU utilization only.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
- `autoscaling/v2beta2`: Supports memory metrics in addition to CPU.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] It also allows custom metrics and external metrics to trigger autoscaling.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
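Because of these differences, consult the current API documentation before writing a manifest, as online resources may still reference older versions such as `autoscaling/v1` or `autoscaling/v2beta1`.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] `autoscaling/v2` is not one of those older versions: it is the stable successor to `autoscaling/v2beta2` (generally available since Kubernetes 1.23) and keeps support for memory, custom, and external metrics.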
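To make the difference between the two API versions concrete, here is a minimal sketch that writes both manifest shapes as Python dictionaries. The field names follow the Kubernetes API (`targetCPUUtilizationPercentage` in `autoscaling/v1`, a `metrics` list in `autoscaling/v2beta2`). The resource names (`web`, `web-hpa`), the replica bounds, the target percentages, and the helper `resource_metric_names` are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical target workload; the name "web" is an assumption.
SCALE_TARGET = {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"}

# autoscaling/v1: a single CPU utilization target, nothing else.
hpa_v1 = {
    "apiVersion": "autoscaling/v1",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web-hpa"},
    "spec": {
        "scaleTargetRef": SCALE_TARGET,
        "minReplicas": 2,
        "maxReplicas": 10,
        "targetCPUUtilizationPercentage": 60,
    },
}

# autoscaling/v2beta2: a list of metrics, here CPU and memory.
hpa_v2beta2 = {
    "apiVersion": "autoscaling/v2beta2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web-hpa"},
    "spec": {
        "scaleTargetRef": SCALE_TARGET,
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 60},
                },
            },
            {
                "type": "Resource",
                "resource": {
                    "name": "memory",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            },
        ],
    },
}


def resource_metric_names(manifest: dict) -> list:
    """Return the resource metrics a manifest scales on (illustrative helper)."""
    spec = manifest["spec"]
    if manifest["apiVersion"] == "autoscaling/v1":
        # v1 can only express a CPU target.
        return ["cpu"] if "targetCPUUtilizationPercentage" in spec else []
    return [
        m["resource"]["name"]
        for m in spec.get("metrics", [])
        if m.get("type") == "Resource"
    ]


if __name__ == "__main__":
    for manifest in (hpa_v1, hpa_v2beta2):
        print(manifest["apiVersion"], resource_metric_names(manifest))
    # Dump one manifest as JSON, which kubectl also accepts.
    print(json.dumps(hpa_v2beta2, indent=2))
```

Custom and external metrics would appear as further entries in the same `metrics` list, which is why that capability is tied to the newer API shape.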
Scaling Mechanics
HPA operates as a pod-level autoscaler, adjusting the replica count of a deployment based on observed metrics against a target value.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
- Scale-up: When the `currentMetricValue` exceeds the `desiredMetricValue`, the controller increases the replicas. The calculation follows the formula: `desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]`.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
- Scale-down: When usage drops below the target threshold, the controller reduces the replica count.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
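The replica formula can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only, not the controller's implementation: the function name `desired_replicas` and the sample values are assumptions, and real clusters additionally clamp the result to the configured minimum and maximum replica counts.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     desired_metric: float) -> int:
    """desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]"""
    if desired_metric <= 0:
        raise ValueError("desired_metric must be positive")
    return math.ceil(current_replicas * (current_metric / desired_metric))


if __name__ == "__main__":
    # Scale-up: 4 replicas at 90% utilization against a 60% target.
    print(desired_replicas(4, 90, 60))  # 6
    # Scale-down: 4 replicas at 30% utilization against a 60% target.
    print(desired_replicas(4, 30, 60))  # 2
```

Because of the `ceil`, any ratio slightly above 1 rounds up to a whole extra replica, so the formula leans toward over-provisioning rather than under-provisioning.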
To prevent flapping, the controller waits for a stabilization period (typically 3 to 5 minutes) after a scaling event before it acts on metrics again.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] Any replica count set directly in the Deployment is overridden by the HPA's calculations.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
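The anti-flapping behavior described above can be modeled as a simple cooldown gate. This is a deliberately simplified sketch and not how the Kubernetes controller is actually written: the class `CooldownGate`, its method names, and the 300-second default (5 minutes, the upper end of the range mentioned above) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownGate:
    """Blocks further scaling until a stabilization period has elapsed."""
    window_seconds: float = 300.0          # assumed value: 5 minutes
    last_scale_at: Optional[float] = None  # timestamp of the last scaling event

    def allow(self, now: float) -> bool:
        """Return True if a new scaling action may happen at time `now`."""
        if self.last_scale_at is None:
            return True
        return now - self.last_scale_at >= self.window_seconds

    def record(self, now: float) -> None:
        """Remember that a scaling action happened at time `now`."""
        self.last_scale_at = now


if __name__ == "__main__":
    gate = CooldownGate()
    gate.record(0.0)          # a scaling event at t = 0 s
    print(gate.allow(120.0))  # False: still inside the window
    print(gate.allow(300.0))  # True: the window has elapsed
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic and easy to test.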
Related Concepts
- Kubernetes Autoscaling
- Metrics Server
- [[VPA]]