HPA scaling behavior policies
HPA scaling behavior policies are configuration options within the Horizontal Pod Autoscaler (HPA) that govern the rate of change for scaling actions. These policies are defined under the behavior field in the HPA specification and allow administrators to fine-tune how aggressively a workload's replica count scales up or down in response to fluctuating metrics^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
Core Concepts
The primary goal of these policies is to prevent excessive instability in the number of replicas, often referred to as "flapping," where the system continuously scales up and down^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]. To achieve this, the behavior settings are divided into scaleUp and scaleDown configurations^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
Stabilization Window
The stabilization window is a critical mechanism for smoothing out scaling decisions, primarily during scale-down operations^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
When a metric indicates that a target should scale down, the autoscaling algorithm does not act immediately. Instead, it reviews the calculated desired states over a specific interval (stabilizationWindowSeconds) and uses the highest value found within that period^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]. This effectively creates a cooldown period, ensuring that a brief dip in metrics does not trigger an immediate reduction in replicas, thereby preventing the replica count from oscillating too frequently^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
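The scale-down rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual implementation: the function name `stabilized_scale_down` and the `(timestamp, desired_replicas)` pair format are hypothetical, chosen only for this example.

```python
def stabilized_scale_down(recommendations, window_seconds, now):
    """Pick the replica count to use for a scale-down decision.

    recommendations: list of (timestamp_seconds, desired_replicas) pairs
    (a hypothetical format for this sketch).
    Returns the highest desired count recorded inside the window.
    """
    # Keep only the recommendations that fall inside the stabilization window.
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    if not recent:
        raise ValueError("no recommendations inside the window")
    # Using the highest value means a brief dip cannot shrink the workload.
    return max(recent)


history = [(0, 10), (60, 4), (120, 9), (290, 3)]
print(stabilized_scale_down(history, window_seconds=300, now=300))
print(stabilized_scale_down(history, window_seconds=60, now=300))
```

With a 300-second window the early recommendation of 10 replicas still counts, so the workload does not shrink. With a 60-second window only the latest recommendation is considered and the dip to 3 replicas takes effect.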
Scaling Policies
Scaling policies define the exact limits on how many replicas can be added or removed within a given time window^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
- Period (periodSeconds): Defines the duration of the time window for the policy^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
- Policy Types:
  - Percent: Specifies the maximum percentage of replicas that can change per period^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
  - Pods: Specifies the maximum absolute number of replicas that can change per period^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
- Selection Policy (selectPolicy): When multiple policies are specified (e.g., one for percentage and one for pods), the selectPolicy determines which one to apply. For example, a selectPolicy of Max chooses the policy that allows the highest replica count change^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
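To make the interaction between policy types and selectPolicy concrete, here is a minimal Python sketch of how a per-period limit could be derived. The function `allowed_change` is hypothetical, the rounding of percentages (rounding up) is an assumption of this sketch, and periodSeconds is left out because only a single period is modelled. Besides Max, the Kubernetes API also accepts Min and Disabled for selectPolicy, which the sketch covers as well.

```python
import math


def allowed_change(current_replicas, policies, select_policy="Max"):
    """Return the largest replica change permitted within one period.

    policies: list of dicts with "type" ("Percent" or "Pods") and "value".
    """
    if select_policy == "Disabled":
        return 0  # scaling in this direction is switched off
    if select_policy not in ("Max", "Min"):
        raise ValueError(f"unknown selectPolicy: {select_policy}")
    if not policies:
        raise ValueError("at least one policy is required")

    limits = []
    for policy in policies:
        if policy["type"] == "Percent":
            # Percentage of the current replica count (rounding up is assumed).
            limits.append(math.ceil(current_replicas * policy["value"] / 100))
        elif policy["type"] == "Pods":
            # Absolute number of pods.
            limits.append(policy["value"])
        else:
            raise ValueError(f"unknown policy type: {policy['type']}")

    return max(limits) if select_policy == "Max" else min(limits)


policies = [{"type": "Percent", "value": 100}, {"type": "Pods", "value": 4}]
print(allowed_change(10, policies, "Max"))
print(allowed_change(10, policies, "Min"))
```

For 10 replicas, the Percent policy allows a change of 10 and the Pods policy a change of 4, so Max selects 10 and Min selects 4.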
Default Behavior
While stabilization windows are crucial for scaling down to prevent jitter, scaling up often needs to be immediate to handle sudden traffic spikes^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]. The default settings reflect this requirement:
- Scale Down: Uses a stabilization window to dampen changes; the Kubernetes default is 300 seconds^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
- Scale Up: Defaults to a stabilization window of 0 seconds, allowing for instant expansion if necessary^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md].
Example Configuration
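The following configuration snippet defines behavior for both directions. Scale-down waits for a 300-second stabilization window and may then remove up to 100% of the current replicas every 15 seconds. Scale-up has no stabilization window and may add whichever is larger of 100% of the current replicas or 4 pods every 15 seconds^[400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md]: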
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 300
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
      scaleUp:
        stabilizationWindowSeconds: 0
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
        selectPolicy: Max
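As a rough worked example of the scaleUp section above, the following Python sketch steps through successive 15-second periods, applying the larger of the Percent and Pods limits each time (selectPolicy: Max). The function `simulate_scale_up` is hypothetical, and treating each period as one discrete step is a simplifying assumption; the real controller evaluates the limits continuously against recent scaling events.

```python
import math


def simulate_scale_up(start, target, percent=100, pods=4, max_periods=10):
    """Return the replica count after each period until target is reached.

    Defaults mirror the example configuration: 100 percent or 4 pods
    per period, whichever is larger.
    """
    history = [start]
    current = start
    while current < target and len(history) <= max_periods:
        by_percent = math.ceil(current * percent / 100)  # Percent policy
        by_pods = pods                                   # Pods policy
        step = max(by_percent, by_pods)                  # selectPolicy: Max
        current = min(target, current + step)            # never overshoot
        history.append(current)
    return history


print(simulate_scale_up(1, 40))  # [1, 5, 10, 20, 40]
```

Starting from a single replica, the Pods policy dominates the first step (1 to 5), after which the Percent policy doubles the count each period. This is why combining the two policies with Max helps small workloads ramp up quickly.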
Related Concepts
- Horizontal Pod Autoscaler
- Kubernetes
- [[Metric Server]]
- [[Deployment]]
Sources
400-devops__06-Kubernetes__k8s-ithelp__Day26__README.md