HPA and VPA incompatibility

HPA and VPA incompatibility is the operational conflict that arises when the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA) are applied to the same workload and act on the same resource metrics.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]

Mechanism of Conflict

The incompatibility stems from the fundamental way each autoscaler manages resource definitions:

  • VPA Behavior: The VPA automatically adjusts a Pod's resources.requests and resources.limits (CPU and memory) based on historical usage data. To apply the new values, the VPA evicts the existing Pod so that a replacement is created with the updated configuration.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
  • HPA Behavior: The HPA adjusts the number of replicas based on resource utilization metrics. The scaling calculation relies on the ratio of current usage to the defined requests.[1]

When both act on the same standard metric, such as CPU, the two controllers interfere with each other. If the VPA raises the resource requests while actual usage stays the same, the utilization percentage reported to the HPA drops. The HPA interprets this as low demand and reduces the number of replicas, which can destabilize the workload.
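The interaction described above can be sketched numerically with the HPA replica formula from footnote 1. This is a minimal illustration, not the controller's actual implementation: the helper names and the workload figures (4 replicas, 400m CPU usage per Pod, an 80% target) are hypothetical, and real HPA behaviour also includes a tolerance band and stabilization windows that are ignored here.

```python
import math


def utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization as the HPA sees it: usage as a percentage of the request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]"""
    return math.ceil(current_replicas * (current_metric / target_metric))


# Hypothetical workload: actual load does not change between the two cases.
replicas = 4
usage_per_pod = 400.0      # millicores actually consumed per Pod
target_utilization = 80.0  # HPA target, in percent of the request

# Before the VPA acts: request is 500m, so utilization sits exactly on target.
before = desired_replicas(
    replicas, utilization_percent(usage_per_pod, 500.0), target_utilization
)

# After the VPA doubles the request to 1000m: same usage, half the utilization.
after = desired_replicas(
    replicas, utilization_percent(usage_per_pod, 1000.0), target_utilization
)

print(f"replicas before VPA change: {before}")
print(f"replicas after VPA change:  {after}")
```

With these assumed numbers the HPA would halve the replica count even though the load on the service is unchanged, which is the instability the section warns about.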

Workarounds

Running both autoscalers against the same standard metrics is discouraged, but some configurations allow them to coexist:

  • Custom Metrics: The conflict can be avoided if the HPA is triggered by custom or external metrics rather than the standard CPU or memory metrics, whose utilization values shift whenever the VPA changes the requests.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
  • Multidimensional Pod Autoscaler (MPA): A more advanced option, currently available as a beta feature on platforms like [[GCP GKE]], is the Multidimensional Pod Autoscaler. MPA supports scaling along more than one dimension at once, for example scaling horizontally on CPU while scaling vertically on memory.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
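As a sketch of the custom-metrics workaround, the snippet below builds an `autoscaling/v2` HorizontalPodAutoscaler manifest whose only trigger is a per-Pod custom metric, so the HPA never reads CPU or memory utilization. The function name, the object names (`web-hpa`, `web`) and the metric `http_requests_per_second` are hypothetical, and the example assumes a custom metrics adapter is already installed in the cluster to serve that metric.

```python
import json


def build_custom_metric_hpa(
    name: str,
    target_deployment: str,
    metric_name: str,
    target_average: str,
    min_replicas: int = 2,
    max_replicas: int = 10,
) -> dict:
    """Return an autoscaling/v2 HPA manifest driven only by a per-Pod custom metric."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # No "Resource" entries here: the HPA ignores CPU and memory,
            # leaving those dimensions to the VPA.
            "metrics": [
                {
                    "type": "Pods",
                    "pods": {
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": target_average,
                        },
                    },
                }
            ],
        },
    }


# Hypothetical names and target value, chosen only for illustration.
hpa = build_custom_metric_hpa("web-hpa", "web", "http_requests_per_second", "100")
print(json.dumps(hpa, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML. Because the target is an average value per Pod rather than a percentage of requests, a VPA change to requests does not alter what the HPA measures.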

Sources


  1. The HPA computes desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]. For a utilization target, currentMetricValue is the usage expressed as a percentage of the Pod's resource requests, and desiredMetricValue is the configured target percentage.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]