HPA and VPA Auto mode incompatibility
HPA and VPA Auto mode incompatibility is the conflict that arises when a Horizontal Pod Autoscaler (HPA) scaling on standard resource metrics (CPU or memory) targets the same workload as a Vertical Pod Autoscaler (VPA) running in Auto mode.^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md]
Mechanism of Conflict
When VPA is set to Auto mode, it actively manages the resource requests and limits of Pods by evicting and recreating them with updated values based on usage history^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md]. HPA, when configured to scale on CPU or memory utilization, measures utilization as a percentage of the requests currently defined in the Pod specification and derives the replica count from that percentage^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md].
Because VPA keeps changing these requests, the baseline of HPA's metric shifts even when the actual load is constant. When VPA raises requests to accommodate load, the same usage becomes a smaller percentage of the request, which HPA interprets as spare capacity per replica and answers by scaling down. Fewer replicas then carry more load each, which can prompt VPA to raise requests again. The two controllers end up reacting to each other rather than to the workload, which results in unstable and unpredictable cluster behavior^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md].
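A small worked example may make the interaction concrete. It uses the replica formula from the Kubernetes HPA documentation; the numbers (4 replicas, 400m CPU used per Pod, an 80% target, and requests moving from 500m to 1000m) are hypothetical and chosen only for illustration.

```latex
% HPA replica formula for a utilization target
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times
  \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil,
\qquad
\text{currentUtilization} = \frac{\text{usage per Pod}}{\text{request per Pod}}

% Before VPA acts: request = 500m, usage = 400m
\text{currentUtilization} = \frac{400}{500} = 80\%
\;\Rightarrow\;
\text{desiredReplicas} = \left\lceil 4 \times \frac{80}{80} \right\rceil = 4

% After VPA raises the request to 1000m, with usage unchanged at 400m
\text{currentUtilization} = \frac{400}{1000} = 40\%
\;\Rightarrow\;
\text{desiredReplicas} = \left\lceil 4 \times \frac{40}{80} \right\rceil = 2
```

The load did not change, yet HPA halves the replica count purely because VPA rewrote the request. Each remaining Pod then carries more load, which feeds into VPA's next recommendation.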
Recommended Resolution
To avoid this instability while still using both tools, let HPA handle scaling and set the VPA updateMode to "Off"^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md].
In Off mode, the VPA continues to monitor resource utilization and generate optimization recommendations (visible via kubectl describe vpa), but it will not automatically update the Pod specs or trigger evictions^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md]. This allows administrators to manually apply VPA recommendations to stabilize the workload without interfering with the HPA's scaling logic.
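The pairing described above could look like the following manifests. This is a minimal sketch, not a configuration taken from the source: the Deployment name `web`, the object names `web-vpa` and `web-hpa`, the replica bounds, and the 80% CPU target are all assumed values, and the VPA object requires the VPA custom resource definitions to be installed in the cluster.

```yaml
# VPA in recommendation-only mode: it computes recommendations
# but never evicts Pods or rewrites their requests.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa            # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical Deployment
  updatePolicy:
    updateMode: "Off"
---
# HPA scales the same Deployment on CPU utilization,
# measured against the (now stable) Pod requests.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2           # assumed bounds
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # assumed target
```

With these applied, running `kubectl describe vpa web-vpa` shows the current recommendations, which an administrator can copy into the Deployment's resource requests during a planned rollout.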
Related Concepts
- Horizontal Pod Autoscaler
- Vertical Pod Autoscaler
- [[Pod Eviction]]
Sources
^[400-devops__06-Kubernetes__k8s-ithelp__Day27__README.md]