Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler (HPA) is a Kubernetes feature that automatically adjusts the number of Pod replicas in a Deployment or ReplicationController based on observed CPU utilization.^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md]
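As a rough illustration of how observed utilization translates into a replica count, the sketch below applies the scaling rule given in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The function name desired_replicas and its argument order are hypothetical, and the sketch deliberately ignores the controller's tolerance band and stabilization behavior.

```shell
# Simplified sketch of the HPA scaling rule (not the controller's actual code):
#   desired = ceil(current_replicas * current_util / target_util)
# The result is clamped to [min, max]. Utilization values are integer percentages.
desired_replicas() {
  current_replicas=$1
  current_util=$2
  target_util=$3
  min=$4
  max=$5

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b for positive integers.
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}
```

For example, desired_replicas 2 90 50 1 10 prints 4: two replicas running at 90% against a 50% target give ceil(3.6) = 4, which lies within the 1 to 10 range.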

Prerequisites

The Horizontal Pod Autoscaler only functions correctly when CPU requests are defined,^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md] because it measures utilization as a percentage of the requested CPU. Every container in the pods must have a CPU request configured.^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md] In environments that use a service mesh such as Istio, this includes injected sidecar containers (e.g., istio-proxy): the sidecar must also declare a CPU request, or the service cannot scale.^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md]
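One way to check this prerequisite is to list the CPU request of every container, sidecars included, in the pods you intend to scale. The sketch below wraps a kubectl get pods call with a JSONPath template; the function name list_cpu_requests and the example label app=helloworld are assumptions for illustration, not names taken from the source.

```shell
# Print "<container name><TAB><cpu request>" for every container in the
# pods matching a label selector. An empty second column means that
# container has no CPU request, which would prevent CPU-based autoscaling.
list_cpu_requests() {
  selector=$1   # e.g. app=helloworld (hypothetical label)
  kubectl get pods -l "$selector" \
    -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.cpu}{"\n"}{end}{end}'
}
```

Running list_cpu_requests app=helloworld against a mesh-enabled namespace should show a line for istio-proxy alongside the application container; both need a non-empty CPU value.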

Usage

To enable autoscaling, use the kubectl autoscale command to create an HPA resource.^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md] The command typically takes the target CPU utilization percentage (--cpu-percent) along with the minimum (--min) and maximum (--max) number of replicas.^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md]
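A minimal sketch of the command with the three flags described above follows. The wrapper function create_hpa, the Deployment name helloworld, and the values 50, 1, and 10 are placeholders chosen for illustration; pick thresholds that suit your workload.

```shell
# Create an HPA for a Deployment.
# Arguments: deployment name, target CPU percent, min replicas, max replicas.
create_hpa() {
  deployment=$1
  cpu_percent=$2
  min=$3
  max=$4
  kubectl autoscale deployment "$deployment" \
    --cpu-percent="$cpu_percent" --min="$min" --max="$max"
}
```

Calling create_hpa helloworld 50 1 10 would ask Kubernetes to keep average CPU utilization near 50% of the requested CPU while running between 1 and 10 replicas.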

You can verify the status of the autoscaler and the current number of replicas using kubectl get hpa.^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md]
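The check can be sketched as below. The function name watch_hpa and the HPA name passed to it are hypothetical; the --watch flag keeps the command running and prints a new row whenever the resource changes, which is convenient while generating load.

```shell
# Show an HPA and keep streaming updates until interrupted (Ctrl-C).
# Argument: name of the HPA (by default, kubectl autoscale names it
# after the target resource).
watch_hpa() {
  name=$1
  kubectl get hpa "$name" --watch
}
```

In the output, the TARGETS column shows current versus target utilization, and the REPLICAS column shows the current replica count next to MINPODS and MAXPODS.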

Sources

^[400-devops-07-monitoring-and-observability-k8s-istio-samples-helloworld-readme.md]