HPA scaling behavior configuration

HPA scaling behavior configuration refers to the behavior field in the Horizontal Pod Autoscaler (HPA) API that defines the precise scaling velocity and constraints for both scaling up and scaling down.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]

Stabilization Window

The core mechanism for preventing replica "flapping" or excessive jitter is the stabilizationWindowSeconds parameter.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md] When a metric indicates that the target should scale down, the autoscaling algorithm does not react immediately; instead, it reviews the calculated desired states from the past and uses the maximum value found within the specified window.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]

For example, a scale-down stabilization window of 300 seconds implies that the controller looks at the highest replica count requested in the last 5 minutes, effectively buffering against temporary drops in load.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]
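The look-back described above can be sketched in a few lines of Python. This is a simplified model, not the controller's actual code: the function name `stabilized_replicas`, the `(timestamp, desired_replicas)` history format, and the sample numbers are all assumptions made for illustration.

```python
def stabilized_replicas(history, current, now, window_seconds):
    """Return the replica count to use for a scale-down decision.

    history: list of (timestamp_seconds, desired_replicas) from earlier evaluations.
    current: the desired replica count computed right now.
    """
    # Keep only recommendations that fall inside the stabilization window.
    in_window = [r for ts, r in history if 0 <= now - ts <= window_seconds]
    # Scale-down uses the highest recommendation seen in the window.
    return max(in_window + [current])


# Hypothetical history: load dropped sharply over the last few minutes.
history = [(30, 10), (150, 8), (270, 3)]

print(stabilized_replicas(history, current=4, now=300, window_seconds=300))  # 10
print(stabilized_replicas(history, current=4, now=300, window_seconds=60))   # 4
```

With the 300-second window the earlier recommendation of 10 replicas still holds, so nothing is removed yet. With a 60-second window only the most recent values count, and the controller would be free to drop to 4.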

Scaling Policies

Scaling velocity is controlled by policies, which define limits on how many replicas can be added or removed over a specific period.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]

Policy Types

Policies can be defined using two distinct types:

* Percent: Changes the replica count by a percentage of the current amount (e.g., 100%).^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]
* Pods: Changes the replica count by a fixed absolute number (e.g., 4).^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]

Each policy must also specify a periodSeconds, which sets the time window (e.g., 15 seconds) for that specific rate limit.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]
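To make the two policy types concrete, the following minimal sketch computes how many replicas a single policy would allow to be added or removed within one periodSeconds window. The helper name `max_change` is hypothetical, and rounding the percentage up is an assumption of this sketch rather than a documented guarantee.

```python
import math


def max_change(policy_type: str, value: int, current_replicas: int) -> int:
    """Upper bound on replicas changed within one periodSeconds window."""
    if policy_type == "Pods":
        # Fixed absolute number of pods, independent of the current size.
        return value
    if policy_type == "Percent":
        # Percentage of the current replica count (rounded up: an assumption).
        return math.ceil(current_replicas * value / 100)
    raise ValueError(f"unknown policy type: {policy_type}")


print(max_change("Percent", 100, 10))  # 10
print(max_change("Pods", 4, 10))       # 4
```

The Percent limit grows with the workload, while the Pods limit stays constant, which is why the two are often combined.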

Selection Policy

When multiple policies are defined for the same direction (scale up or down), selectPolicy determines which one is applied.^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md] A common strategy for scaling up is to set selectPolicy to Max, which applies whichever policy permits the largest change (e.g., the larger of adding 4 pods or adding 100% of the current replicas).^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]
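The selection step can be illustrated with a short sketch. The function `allowed_increase` and the dictionary layout are hypothetical, and the percentage rounding is an assumption. Besides Max, the Kubernetes API also accepts Min (apply the most restrictive policy) and Disabled (no scaling in that direction), which the sketch includes for comparison.

```python
import math


def allowed_increase(policies, select_policy, current_replicas):
    """Replicas that may be added in one period under the given selectPolicy."""
    if select_policy == "Disabled":
        return 0  # scaling in this direction is turned off
    changes = []
    for policy in policies:
        if policy["type"] == "Pods":
            changes.append(policy["value"])
        else:  # "Percent"; rounding up is an assumption of this sketch
            changes.append(math.ceil(current_replicas * policy["value"] / 100))
    return max(changes) if select_policy == "Max" else min(changes)


policies = [{"type": "Percent", "value": 100}, {"type": "Pods", "value": 4}]

print(allowed_increase(policies, "Max", 2))   # 4  (Pods wins at small sizes)
print(allowed_increase(policies, "Max", 10))  # 10 (Percent wins at larger sizes)
print(allowed_increase(policies, "Min", 10))  # 4
```

With Max, the fixed Pods policy helps a small deployment grow quickly, and the Percent policy takes over once the replica count exceeds 4.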

Configuration Example

The following snippet demonstrates a configuration that delays scale-down with a 300-second stabilization window to prevent oscillation while allowing rapid scale-up:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]
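As a rough illustration of how the scaleUp rules in this example play out, the sketch below steps through successive 15-second periods. It assumes the metric keeps demanding the target replica count throughout, and it ignores maxReplicas, pod readiness, and the controller's own sync interval. The function name `scale_up_trajectory` and the start and target values are hypothetical.

```python
import math


def scale_up_trajectory(start, target, percent=100, pods=4):
    """Replica counts after each 15-second period until target is reached."""
    replicas = start
    steps = [replicas]
    while replicas < target:
        # selectPolicy: Max -> take the larger of the Percent and Pods limits.
        step = max(math.ceil(replicas * percent / 100), pods)
        replicas = min(replicas + step, target)
        steps.append(replicas)
    return steps


print(scale_up_trajectory(1, 100))  # [1, 5, 10, 20, 40, 80, 100]
```

The first period is governed by the Pods policy (1 to 5), after which the Percent policy doubles the replica count each period until the target is reached.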

Sources

^[400-devops-06-kubernetes-k8s-ithelp-day26-readme.md]