Kubernetes HPA and VPA incompatibility
The Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA) are two mechanisms in Kubernetes designed to handle workload scaling automatically.^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]
They serve distinct purposes: HPA adjusts the number of Pod replicas (horizontal scaling), while VPA adjusts CPU and memory requests and limits (vertical scaling). The two conflict when both act on the same resource metric.^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]
The Conflict
VPA operates by monitoring resource usage and recommending optimal requests and limits. To apply these updates, the VPA Updater evicts the existing Pods, allowing the deployment to create new Pods with the updated configuration.^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]
HPA, in turn, calculates the required number of replicas from current resource utilization relative to the Pods' defined resource requests.^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]
This creates a conflict loop:
1. VPA lowers the resource requests (scale down) to right-size the Pod.
2. HPA observes a higher utilization percentage (because the request denominator is now smaller).
3. HPA reacts by increasing the number of Pod replicas (scale up) to handle the perceived high load.
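4. With the load spread over more replicas, per-Pod usage falls, which can prompt VPA to lower requests again.
The reverse also holds: if VPA raises requests, the measured utilization percentage drops and HPA may remove more replicas than the actual load warrants.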
Because of this interference, the source material states explicitly that VPA and HPA cannot be combined on the same standard metrics (CPU or memory).^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]
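The arithmetic behind this interference can be sketched in a few lines of Python. This is a simplified model, not the HPA controller's implementation: it assumes the commonly documented formula desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), ignores tolerance, stabilization windows, and min/max replica bounds, and uses made-up numbers (4 replicas, 200m of CPU used per Pod, a 50% utilization target). The function names are illustrative.

```python
import math


def utilization_percent(usage_millicores, request_millicores):
    """Utilization as HPA sees it: usage relative to the Pod's request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas, usage_millicores,
                     request_millicores, target_percent):
    """Simplified HPA formula: ceil(replicas * current / target)."""
    current = utilization_percent(usage_millicores, request_millicores)
    return math.ceil(current_replicas * current / target_percent)


# Assumed scenario: actual usage stays at 200m per Pod the whole time.
# Only the CPU request changes, as if VPA had rewritten it.
scenarios = [
    ("original request (500m)", 500),
    ("VPA lowers request (250m)", 250),
    ("VPA raises request (1000m)", 1000),
]

for label, request in scenarios:
    replicas = desired_replicas(
        current_replicas=4,
        usage_millicores=200,
        request_millicores=request,
        target_percent=50,
    )
    print(f"{label}: utilization "
          f"{utilization_percent(200, request):.0f}% -> {replicas} replicas")
```

With usage held constant, the model asks for 4, 7, and 2 replicas respectively: changing only the request denominator moves the replica count even though the actual load never changed.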
Exceptions and Workarounds
There are specific configurations where conflict can be avoided:
- Custom Metrics: HPA can be configured to use custom or external metrics as a trigger for scaling, rather than CPU or memory.^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md] If HPA relies on a custom metric (e.g., requests per second), it does not conflict with VPA managing CPU or memory requests.
- Multidimensional Pod Autoscaler (MPA): Available on platforms such as Google Kubernetes Engine (GKE), MPA combines both methods by separating the metrics: the system scales horizontally based on CPU and vertically based on memory.^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]
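To make the custom-metrics workaround concrete, the sketch below builds an HPA and a VPA manifest as Python dictionaries and checks whether they act on the same resource. The field layout follows the `autoscaling/v2` HorizontalPodAutoscaler and `autoscaling.k8s.io/v1` VerticalPodAutoscaler schemas as I understand them; the Deployment name `web`, the metric name `http_requests_per_second`, and the helper functions are hypothetical, and the check only inspects `Resource`-type HPA metrics.

```python
def hpa_manifest(metrics):
    """Build a minimal HPA manifest targeting a hypothetical 'web' Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": "web-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1", "kind": "Deployment", "name": "web",
            },
            "minReplicas": 2,
            "maxReplicas": 10,
            "metrics": metrics,
        },
    }


# HPA driven by a custom per-Pod metric (hypothetical metric name).
hpa_custom = hpa_manifest([{
    "type": "Pods",
    "pods": {
        "metric": {"name": "http_requests_per_second"},
        "target": {"type": "AverageValue", "averageValue": "100"},
    },
}])

# HPA driven by CPU utilization: the configuration that clashes with VPA.
hpa_cpu = hpa_manifest([{
    "type": "Resource",
    "resource": {
        "name": "cpu",
        "target": {"type": "Utilization", "averageUtilization": 50},
    },
}])

vpa = {
    "apiVersion": "autoscaling.k8s.io/v1",
    "kind": "VerticalPodAutoscaler",
    "metadata": {"name": "web-vpa"},
    "spec": {
        "targetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"},
        "updatePolicy": {"updateMode": "Auto"},
        "resourcePolicy": {"containerPolicies": [
            {"containerName": "*", "controlledResources": ["cpu", "memory"]},
        ]},
    },
}


def overlapping_resources(hpa, vpa):
    """Return the resources that HPA scales on and VPA also rewrites."""
    hpa_resources = {
        m["resource"]["name"]
        for m in hpa["spec"]["metrics"] if m["type"] == "Resource"
    }
    vpa_resources = set()
    policies = vpa["spec"].get("resourcePolicy", {}).get("containerPolicies", [])
    for policy in policies:
        # Assumption: VPA controls both cpu and memory when the field is omitted.
        vpa_resources.update(policy.get("controlledResources", ["cpu", "memory"]))
    if not policies:
        vpa_resources = {"cpu", "memory"}
    return hpa_resources & vpa_resources


print(overlapping_resources(hpa_custom, vpa))  # no shared resource
print(overlapping_resources(hpa_cpu, vpa))     # both act on cpu
```

A check like this could run in CI before manifests are applied; an empty result means HPA and VPA are driven by different signals.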
Sources
^[400-devops-06-kubernetes-k8s-ithelp-day25-readme.md]