Kubernetes Autoscaling Behavior Policies

Kubernetes autoscaling behavior policies let administrators configure how the Horizontal Pod Autoscaler (HPA) scales the replica count up or down. They are defined in the behavior field of an HPA resource, whose scaleUp and scaleDown sections control the rate and stability of scaling changes.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]

Stabilization Windows

A key component of scaling behavior is the stabilization window, defined by stabilizationWindowSeconds.

When scaling down, this setting keeps the replica count from fluctuating too frequently. Instead of reacting immediately to a drop in load, the autoscaler considers every desired replica count it computed during the preceding window and uses the highest one.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md] This acts as a cooldown period: replicas are removed only once the reduction in load has been sustained for the whole window.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]
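
The scale-down rule above can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: the function name `stabilized_scale_down` and the `(timestamp, desired_replicas)` history format are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def stabilized_scale_down(
    recommendations: List[Tuple[float, int]],
    now: float,
    window_seconds: int,
) -> int:
    """Return the highest desired replica count seen inside the window.

    recommendations holds (timestamp_seconds, desired_replicas) pairs,
    a hypothetical stand-in for the values the autoscaler computes over time.
    """
    recent = [
        replicas
        for timestamp, replicas in recommendations
        if now - timestamp <= window_seconds
    ]
    if not recent:
        raise ValueError("no recommendations inside the window")
    return max(recent)


# Load drops steadily, so the desired replica count falls from 10 to 3.
history = [(0, 10), (100, 8), (200, 4), (290, 3)]

# With a 300-second window the old recommendation of 10 still counts.
print(stabilized_scale_down(history, now=300, window_seconds=300))  # 10

# With a 150-second window only the last two recommendations count.
print(stabilized_scale_down(history, now=300, window_seconds=150))  # 4
```

A longer window therefore delays scale-down more, because older and higher recommendations stay in view for longer.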

Scaling Policies

Scaling policies cap how many replicas can be added or removed within a given time window.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]

Policy Structure

Each policy is configured with three fields:

  • type: How the limit is expressed, either Percent (a percentage of the current replicas) or Pods (an absolute number of pods).^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]
  • value: The limit itself, interpreted according to type (e.g., 100 for 100% or 4 for 4 pods).
  • periodSeconds: The length of the time window, in seconds, over which the limit applies.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]

For example, one policy might allow a change of at most 4 pods per 15 seconds, while another allows a change of at most 100% of the current replica count over the same period.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]
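
To make the two policy types concrete, the following Python sketch computes the largest change a single policy would permit in one period. The helper `max_change` is hypothetical, and rounding the percentage result up is an assumption of this sketch; the real controller's rounding may differ in detail.

```python
import math


def max_change(policy_type: str, value: int, current_replicas: int) -> int:
    """Largest replica change one policy permits within its periodSeconds."""
    if policy_type == "Pods":
        # An absolute limit does not depend on the current replica count.
        return value
    if policy_type == "Percent":
        # Assumption: fractional results are rounded up.
        return math.ceil(current_replicas * value / 100)
    raise ValueError(f"unknown policy type: {policy_type}")


print(max_change("Pods", 4, current_replicas=6))       # 4
print(max_change("Percent", 100, current_replicas=6))  # 6
print(max_change("Percent", 10, current_replicas=72))  # 8
```

A Pods limit stays constant as the workload grows, whereas a Percent limit grows with it, which is why the two are often combined.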

Selection Policy

When multiple policies are defined for the same direction (e.g., multiple scaleUp policies), the selectPolicy determines how to apply them.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]

A common setting is Max, which applies whichever policy permits the largest change in replica count.^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]
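
The selection step can be sketched as follows. The source only describes Max; Min and Disabled are the other values the Kubernetes API accepts for selectPolicy, included here for completeness. The function `select_change` and its list-of-limits input are hypothetical names for this illustration.

```python
from typing import List


def select_change(allowed_changes: List[int], select_policy: str = "Max") -> int:
    """Pick the permitted change from the per-policy limits.

    allowed_changes holds the maximum change each policy would allow.
    """
    if select_policy == "Disabled":
        # Scaling in this direction is turned off entirely.
        return 0
    if select_policy == "Max":
        return max(allowed_changes)
    if select_policy == "Min":
        return min(allowed_changes)
    raise ValueError(f"unknown selectPolicy: {select_policy}")


# Two policies that would allow changes of 6 pods and 4 pods respectively.
limits = [6, 4]
print(select_change(limits, "Max"))       # 6
print(select_change(limits, "Min"))       # 4
print(select_change(limits, "Disabled"))  # 0
```

Max favors responsiveness, while Min favors caution by always taking the most restrictive policy.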

Configuration Example

The following YAML snippet demonstrates a configuration where scale-down uses a 300-second stabilization window and a single policy, while scale-up has no stabilization window and applies the more permissive of two policies^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
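
As a rough worked example of the scaleUp section above, the Python sketch below traces how fast the replica count could grow under sustained load. It is a simplification that assumes one scaling decision per 15-second period and ignores metrics, readiness, and minReplicas/maxReplicas; `scale_up_ceiling` and `trace_scale_up` are hypothetical helpers.

```python
import math
from typing import List


def scale_up_ceiling(current: int) -> int:
    """Highest replica count reachable in one period under the config above."""
    percent_limit = current + math.ceil(current * 100 / 100)  # type: Percent, value: 100
    pods_limit = current + 4                                  # type: Pods, value: 4
    return max(percent_limit, pods_limit)                     # selectPolicy: Max


def trace_scale_up(start: int, desired: int) -> List[int]:
    """Replica count after each period until the desired count is reached."""
    replicas = start
    steps = [replicas]
    while replicas < desired:
        replicas = min(desired, scale_up_ceiling(replicas))
        steps.append(replicas)
    return steps


print(trace_scale_up(start=1, desired=30))  # [1, 5, 10, 20, 30]
```

At low replica counts the Pods policy wins (1 to 5), and once the count exceeds 4 the Percent policy takes over and doubles it each period.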

Sources

^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]