
CPU request configuration for autoscaling

Configuring CPU requests is a prerequisite for the Kubernetes Horizontal Pod Autoscaler to function correctly^[400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md]. Without CPU requests, the resource utilization metrics that trigger scaling events cannot be calculated.

Mechanism

The Horizontal Pod Autoscaler (HPA) relies on metrics, specifically CPU utilization percentage, to determine when to scale the number of pod replicas up or down^[400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md]. A percentage needs a denominator, so the system must have a defined baseline of resource allocation to measure usage against.
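The scaling decision can be sketched as follows. This is a simplified illustration of the replica formula described in the Kubernetes documentation, not the controller's actual code: the function name `desired_replicas` is hypothetical, the 10% tolerance is the documented default, and min/max replica bounds, stabilization windows, and missing-metric handling are all omitted.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    tolerance: float = 0.1,
) -> int:
    """Return the replica count implied by the current and target CPU utilization.

    Simplified sketch: desired = ceil(current_replicas * current / target),
    with no change when the ratio is within the tolerance of 1.0.
    """
    ratio = current_utilization / target_utilization
    # Small deviations from the target are ignored to avoid constant rescaling.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 replicas running at 90% of their CPU request, with a 50% target.
    print(desired_replicas(4, 90, 50))
    # 4 replicas running at 20%, with the same target.
    print(desired_replicas(4, 20, 50))
```

With 4 replicas at 90% utilization and a 50% target, the sketch scales up; at 20% utilization it scales down.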

Because the autoscaler works by comparing current CPU usage against the CPU requests (not the limits), every container within a pod must have a CPU request defined^[400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md]. If any container lacks a CPU request, the autoscaler will not function as expected because the utilization percentage cannot be accurately determined^[400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md].
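A minimal sketch of why a single missing request breaks the calculation. The dictionary layout (`cpu_usage_millicores`, `cpu_request`) and both function names are hypothetical, chosen for illustration only; the sketch assumes utilization is total usage divided by total requested CPU across the pod's containers.

```python
from typing import Dict, List, Optional


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m", "0.5", or "1" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def pod_cpu_utilization(containers: List[Dict]) -> Optional[float]:
    """Return pod CPU utilization as a percentage of requested CPU.

    Returns None when any container has no CPU request, because the
    denominator of the percentage is then undefined.
    """
    total_usage = 0
    total_request = 0
    for container in containers:
        request = container.get("cpu_request")
        if request is None:
            return None  # one missing request invalidates the whole pod
        total_usage += container["cpu_usage_millicores"]
        total_request += parse_cpu_millicores(request)
    if total_request == 0:
        return None
    return 100.0 * total_usage / total_request


if __name__ == "__main__":
    complete = [
        {"name": "app", "cpu_usage_millicores": 150, "cpu_request": "200m"},
        {"name": "sidecar", "cpu_usage_millicores": 50, "cpu_request": "0.2"},
    ]
    incomplete = [
        {"name": "app", "cpu_usage_millicores": 150, "cpu_request": "200m"},
        {"name": "sidecar", "cpu_usage_millicores": 50},  # no request set
    ]
    print(pod_cpu_utilization(complete))
    print(pod_cpu_utilization(incomplete))
```

The second pod yields no utilization value at all, which is why the autoscaler has nothing to act on in that case.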

Application in Service Meshes

In environments using a service mesh such as Istio, this requirement extends to the infrastructure sidecars as well. When automatic sidecar injection is used, the injected istio-proxy containers must also include CPU requests^[400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md].

For example, in the provided sample service helloworld, the deployment containers are explicitly configured with CPU requests, and the injected Istio proxies are similarly configured to ensure the service is ready for autoscaling^[400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md].
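One way to verify this is to inspect the pod spec after injection, since the sidecar does not appear in the original deployment manifest. The sketch below assumes the pod spec has already been loaded into a Python dictionary following the standard `containers[].resources.requests.cpu` layout; the function name, image name, and request values are hypothetical and not taken from the helloworld sample.

```python
from typing import Dict, List


def containers_missing_cpu_request(pod_spec: Dict) -> List[str]:
    """Return the names of containers in a pod spec that have no CPU request."""
    missing = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        if "cpu" not in requests:
            missing.append(container["name"])
    return missing


if __name__ == "__main__":
    # Hypothetical injected pod: an application container plus the sidecar.
    injected_pod_spec = {
        "containers": [
            {
                "name": "helloworld",
                "image": "example/helloworld",  # placeholder image name
                "resources": {"requests": {"cpu": "100m"}},
            },
            {
                "name": "istio-proxy",
                "resources": {"requests": {"memory": "128Mi"}},  # CPU request absent
            },
        ]
    }
    print(containers_missing_cpu_request(injected_pod_spec))
```

An empty result means every container, including the injected proxy, contributes to the utilization baseline.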

Sources

  • 400-devops__07-Monitoring-and-Observability__k8s-istio__samples__helloworld__README.md