# HPA and VPA incompatibility
The HPA and VPA incompatibility refers to the operational conflict that arises when attempting to simultaneously use Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA) on the same workload.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
## Mechanism of Conflict
The incompatibility stems from the fundamental way each autoscaler manages resource definitions:
- VPA Behavior: The VPA automatically adjusts a Pod's `resources.requests` and `limits` (CPU and memory) based on historical usage data. To apply the new values, the VPA must evict the existing Pod and create a new one.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
- HPA Behavior: The HPA adjusts the number of `replicas` based on resource utilization metrics. Its scaling calculation relies on the ratio of current usage to the defined `requests`.[^1]
When both operate on standard metrics like CPU, a feedback loop occurs. If the VPA increases the resource requests, the utilization percentage reported to the HPA drops. Consequently, the HPA interprets this as low demand and scales the number of replicas down, potentially causing system instability.
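The feedback loop can be sketched numerically with the HPA formula from the footnote. This is an illustrative calculation only; the function name and the concrete CPU figures are assumptions, not values from the source:

```python
import math

def desired_replicas(current_replicas, current_usage, target_utilization, requests):
    """HPA formula: desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)].
    Here the metric is CPU utilization, i.e. usage relative to resources.requests."""
    current_utilization = current_usage / requests  # e.g. 0.8 means 80%
    return math.ceil(current_replicas * (current_utilization / target_utilization))

# Before the VPA intervenes: 4 replicas, each Pod uses 400m CPU against a
# 500m request, and the HPA targets 60% utilization -> HPA scales up.
print(desired_replicas(4, current_usage=0.4, target_utilization=0.6, requests=0.5))  # 6

# The VPA doubles the request to 1000m. Actual usage is unchanged, but the
# reported utilization halves, so the HPA now scales the workload *down*.
print(desired_replicas(4, current_usage=0.4, target_utilization=0.6, requests=1.0))  # 3
```

Demand never changed between the two calls; only the VPA's rewrite of `requests` did, which is exactly the conflict described above.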
## Workarounds
While direct usage on standard metrics is discouraged, there are specific configurations that allow them to coexist:
- Custom Metrics: The conflict can be avoided if the HPA is configured to use custom or external metrics as the trigger for autoscaling, rather than the standard CPU or memory metrics that the VPA modifies.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
- Multidim Pod Autoscaler (MPA): A more advanced option, currently available as a beta feature on platforms like [[GCP GKE]], is the Multidim Pod Autoscaler. The MPA supports multidimensional scaling, such as using the HPA for CPU scaling while the VPA simultaneously manages memory.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
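A minimal sketch of the custom-metrics workaround: an `autoscaling/v2` HPA that scales on an external metric instead of CPU or memory, leaving the VPA free to manage `requests` and `limits`. The workload name and the metric name (`queue_messages_ready`) are illustrative assumptions; a metrics adapter exposing the external metric is presumed to be installed:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa            # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker              # illustrative target Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: queue_messages_ready   # assumed external metric, e.g. queue depth
        target:
          type: AverageValue
          averageValue: "30"           # scale so each replica handles ~30 messages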
## Related Concepts
- [[Multidim Pod Autoscaler]]
- Cluster Autoscaler
- Kubernetes
## Sources
[^1]: This relationship is mathematically defined as `desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]`, where `desiredMetricValue` corresponds to the target set on the request.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]