Vertical Pod Autoscaler

The Vertical Pod Autoscaler (VPA) is a Kubernetes component designed to automatically adjust the CPU and memory requests and limits for containers.^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]

Architecture

VPA operates through three main components that interact to monitor, calculate, and apply resource changes^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

Recommender

The Recommender monitors resource utilization metrics and estimates each container's resource requirements^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. It reviews historical metrics data to determine appropriate values for requests and limits^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

Updater

The Updater handles the eviction of Pods that require updates^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. Since updating resource requests or limits typically requires restarting the service, the Updater evicts the affected Pods^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. If the VPA configuration is set to updateMode: Auto, the Updater acts on all recommendations generated by the Recommender^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

Admission Controller

This component intercepts Pod creation requests through a mutating admission webhook^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. After the Updater evicts an old Pod, the owning controller (for example, a Deployment) creates a replacement; the Admission Controller intercepts that request and writes the recommended requests and limits values into the new Pod before it starts^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

Installation

Unlike the Horizontal Pod Autoscaler (HPA), which is part of the built-in Kubernetes API, VPA is not included in a default cluster. It is defined through a Custom Resource Definition (CRD) and must be installed separately as an add-on^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

Installation typically involves cloning the official autoscaler repository and executing the vpa-up.sh script located in the vertical-pod-autoscaler directory^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. This script deploys the necessary CRDs, ClusterRoles, and Deployments for the recommender, updater, and admission controller^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
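The installation steps above can be sketched as a small Python wrapper around the two commands. This is a minimal sketch, not an official installer: the repository URL and the hack/ subdirectory holding vpa-up.sh reflect the upstream kubernetes/autoscaler layout as commonly documented and should be treated as assumptions to verify against the version you install. The function names install_commands and install are hypothetical, and the wrapper defaults to a dry run that only prints what it would execute.

```python
import subprocess
from pathlib import Path

# Assumption: upstream repository URL; verify before use.
REPO_URL = "https://github.com/kubernetes/autoscaler.git"


def install_commands(workdir: Path) -> list[tuple[list[str], Path]]:
    """Return (command, working directory) pairs for a VPA install, in order."""
    vpa_dir = workdir / "autoscaler" / "vertical-pod-autoscaler"
    return [
        # Step 1: clone the autoscaler repository into workdir.
        (["git", "clone", REPO_URL], workdir),
        # Step 2: run the install script from the vertical-pod-autoscaler directory.
        # Assumption: the script lives under hack/ in that directory.
        (["./hack/vpa-up.sh"], vpa_dir),
    ]


def install(workdir: Path, dry_run: bool = True) -> None:
    """Print each step and, unless dry_run is set, execute it."""
    for cmd, cwd in install_commands(workdir):
        print(f"(cd {cwd} && {' '.join(cmd)})")
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)
```

Calling install(Path.cwd()) prints the two steps without touching the cluster; passing dry_run=False runs them and requires git, kubectl, and cluster-admin access to be available.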

Configuration

VPA behavior is configured using a VerticalPodAutoscaler custom resource^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

Update Modes

The spec.updatePolicy.updateMode field determines how VPA applies changes^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]:

  • Off: VPA generates recommendations for resource configuration but does not automatically apply them^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • Initial: VPA applies resource settings only when a Pod is created (e.g., during deployment). It does not update running Pods^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • Auto: VPA automatically applies the configuration provided by the Recommender, evicting Pods if necessary^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
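  • Recreate: Similar to Auto, VPA applies settings at Pod creation and also updates running Pods, but it always does so by evicting and recreating a Pod whose requests differ significantly from the recommendation^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].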

Resource Policy

The spec.resourcePolicy section defines constraints for the autoscaling logic^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].

  • containerName: Specifies the scope of the policy; * applies to all containers in the target^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • minAllowed / maxAllowed: Sets the lower and upper boundaries for resource scaling^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • controlledResources: Specifies which resources VPA computes recommendations for and manages, typically cpu and memory^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
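To make the fields above concrete, the following sketch assembles a VerticalPodAutoscaler object as a Python dictionary and prints it as JSON, which kubectl apply -f - accepts alongside YAML. The object name web-vpa, the target Deployment web, and the specific minAllowed and maxAllowed values are hypothetical. The helper build_vpa is illustrative only, and it validates just the four update modes described here; a given VPA release may accept others.

```python
import json

# The four update modes described in this article.
VALID_UPDATE_MODES = {"Off", "Initial", "Auto", "Recreate"}


def build_vpa(name, target_deployment, update_mode="Off",
              min_allowed=None, max_allowed=None):
    """Build a VerticalPodAutoscaler manifest as a plain dict."""
    if update_mode not in VALID_UPDATE_MODES:
        raise ValueError(f"unsupported updateMode: {update_mode!r}")

    # One policy that applies to every container in the target ("*").
    container_policy = {
        "containerName": "*",
        "controlledResources": ["cpu", "memory"],
    }
    if min_allowed:
        container_policy["minAllowed"] = min_allowed
    if max_allowed:
        container_policy["maxAllowed"] = max_allowed

    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose Pods this VPA manages.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_deployment,
            },
            "updatePolicy": {"updateMode": update_mode},
            "resourcePolicy": {"containerPolicies": [container_policy]},
        },
    }


# Hypothetical example values.
manifest = build_vpa(
    "web-vpa",
    "web",
    update_mode="Off",
    min_allowed={"cpu": "100m", "memory": "128Mi"},
    max_allowed={"cpu": "1", "memory": "1Gi"},
)
print(json.dumps(manifest, indent=2))
```

Starting with update_mode="Off" is a cautious default for a sketch like this: the object only produces recommendations, so nothing is evicted until the mode is changed deliberately.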

Recommendations

Users can inspect the VPA status to see the specific recommendations proposed by the Recommender^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. The status output includes several key values^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]:

  • Target: The recommended value within the allowed range to ensure the container runs optimally^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • Lower Bound: If a Pod's request falls below this threshold, VPA will evict and replace it^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • Upper Bound: If a Pod's request exceeds this threshold, VPA will evict and replace it^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
  • Uncapped Target: The recommended value ignoring any minAllowed or maxAllowed constraints^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
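The relationship between these four values can be illustrated with a deliberately simplified model. The sketch below works on CPU values expressed as integer millicores and assumes that the Target is the Uncapped Target clamped to the minAllowed and maxAllowed range, and that a Pod becomes an eviction candidate when its request lies outside the Lower Bound and Upper Bound interval. The function names are hypothetical, and the real Updater may weigh further conditions that are not modeled here.

```python
def cap_target(uncapped, min_allowed, max_allowed):
    """Clamp an uncapped recommendation into [min_allowed, max_allowed]."""
    return max(min_allowed, min(uncapped, max_allowed))


def outside_bounds(request, lower_bound, upper_bound):
    """Return True when a Pod's current request lies outside the bounds."""
    return request < lower_bound or request > upper_bound


# Hypothetical numbers, all in CPU millicores.
uncapped_target = 1500
target = cap_target(uncapped_target, min_allowed=100, max_allowed=1000)
print(target)  # 1000: the uncapped value exceeds maxAllowed

# A Pod requesting 250m against bounds of 300m..1200m is below the lower bound.
print(outside_bounds(250, lower_bound=300, upper_bound=1200))  # True
# A Pod requesting 500m sits inside the interval and is left alone.
print(outside_bounds(500, lower_bound=300, upper_bound=1200))  # False
```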

Interaction with HPA

Combining VPA with the Horizontal Pod Autoscaler (HPA) requires caution^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. Running HPA on the standard resource metrics (CPU or memory) while VPA is in Auto mode on the same workload can cause conflicts and unpredictable scaling, because both controllers react to the same signal^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]. A common strategy is to run VPA in Off mode so that it only generates recommendations, use those recommendations to set the base requests manually, and let HPA handle horizontal scaling^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md].
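The Off-mode workflow can be sketched as a small helper that reads the recommendation out of a VPA object and turns each container's target into a requests block to copy into the Deployment by hand. The input is assumed to be the parsed JSON of a single VPA object, for example from kubectl get vpa web-vpa -o json; the object name, the container name app, and the sample values are hypothetical, and resources_from_vpa is an illustrative helper rather than part of any Kubernetes client.

```python
def resources_from_vpa(vpa: dict) -> dict:
    """Map each container name to a requests block built from the VPA target."""
    recommendations = (
        vpa.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    return {
        rec["containerName"]: {"requests": dict(rec["target"])}
        for rec in recommendations
    }


# Hypothetical VPA object, trimmed to the fields the helper reads.
sample_vpa = {
    "kind": "VerticalPodAutoscaler",
    "metadata": {"name": "web-vpa"},
    "status": {
        "recommendation": {
            "containerRecommendations": [
                {
                    "containerName": "app",
                    "lowerBound": {"cpu": "100m", "memory": "131072k"},
                    "target": {"cpu": "250m", "memory": "262144k"},
                    "upperBound": {"cpu": "1", "memory": "1048576k"},
                }
            ]
        }
    },
}

print(resources_from_vpa(sample_vpa))
# {'app': {'requests': {'cpu': '250m', 'memory': '262144k'}}}
```

A VPA that has not yet produced a recommendation has no status.recommendation field, in which case the helper returns an empty dictionary instead of raising.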

Sources

^[400-devops-06-kubernetes-k8s-ithelp-day27-readme.md]