Kubernetes resource requests and limits¶
Kubernetes resource requests and limits are mechanisms used to manage compute resources (CPU and memory) within a cluster.^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md] Because Kubernetes manages a cluster of nodes, the physical or virtual hosts that run workloads, it must track resource usage across the whole platform so that it can allocate resources to containers efficiently and ensure each container has enough to operate throughout its lifecycle.^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]
Core Concepts¶
Kubernetes provides two primary types of constraints to ensure resources are scheduled effectively and fairly while maximizing utilization^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]:
Requests¶
A request specifies the minimum resource requirements for a container^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
* It acts as the basis for scheduling decisions; a container will only be scheduled to a node if the node's allocatable resources are greater than or equal to the requested amount^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
* The value must be greater than or equal to zero and cannot exceed the node's capacity^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
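The scheduling rule can be sketched as a simple fit check. This is an illustrative model, not the scheduler's actual code: the function name `fits_on_node` and the dictionary layout are hypothetical, amounts are assumed to be already converted to base units (millicores and bytes), and the sketch assumes that requests of Pods already placed on the node are subtracted from its allocatable resources.

```python
def fits_on_node(allocatable, scheduled_requests, new_request):
    """Return True if the node can still accommodate new_request.

    allocatable and new_request map a resource name to an amount in base
    units; scheduled_requests is a list of such dicts, one per container
    already placed on the node. (Hypothetical helper for illustration.)
    """
    for resource, wanted in new_request.items():
        already_requested = sum(r.get(resource, 0) for r in scheduled_requests)
        free = allocatable.get(resource, 0) - already_requested
        if wanted > free:
            return False
    return True


# Example node: 2 cores and 4 GiB allocatable, two containers already placed.
node = {"cpu": 2000, "memory": 4 * 1024**3}
running = [
    {"cpu": 500, "memory": 1 * 1024**3},
    {"cpu": 1000, "memory": 2 * 1024**3},
]

print(fits_on_node(node, running, {"cpu": 500, "memory": 512 * 1024**2}))  # True
print(fits_on_node(node, running, {"cpu": 750, "memory": 512 * 1024**2}))  # False
```

Note that the check uses requests only; actual consumption on the node does not enter the decision in this model.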
Limits¶
A limit defines the maximum amount of resources a container can consume^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
* A limit of 0 means that no constraint is applied, so the container's usage is unbounded^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
* The limit value must be greater than or equal to the request, but it has no upper bound^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
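As a rough illustration of how requests and limits relate, the sketch below checks one container's settings. The helper `check_resources` is hypothetical, values are plain numbers in base units, and a resource with no entry in `limits` is treated as unconstrained; none of this is taken from the Kubernetes code base.

```python
def check_resources(requests, limits):
    """Return a list of rule violations for one container (empty if valid).

    requests and limits map a resource name to an amount in base units.
    A resource missing from limits is treated as having no limit.
    """
    problems = []
    for resource, requested in requests.items():
        if requested < 0:
            problems.append(f"{resource}: request must be >= 0")
        limit = limits.get(resource)
        if limit is not None and limit < requested:
            problems.append(f"{resource}: limit {limit} is below request {requested}")
    return problems


print(check_resources({"cpu": 250}, {"cpu": 500}))  # []
print(check_resources({"cpu": 500}, {"cpu": 250}))
```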
Resource Units¶
Kubernetes abstracts underlying processor architectures into compute resources exposed as basic units^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
- CPU: The unit is based on cores. One CPU unit is equivalent to:
  - 1 AWS vCPU
  - 1 GCP Core
  - 1 Azure vCore
  - 1 Intel Hyperthread (on supported hardware)^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]
- Memory: The unit is based on bytes. Values can be expressed as plain integers or with decimal suffixes (E, P, T, G, M, k) or their power-of-two equivalents (Ei, Pi, Ti, Gi, Mi, Ki)^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
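To make the units concrete, here is a minimal sketch that converts quantity strings into base units. The function names `parse_cpu` and `parse_memory` are hypothetical, and the parser is deliberately simplified. It also accepts the `m` (millicore) suffix for CPU, which Kubernetes supports but which is not described in the text above.

```python
from decimal import Decimal

# Power-of-ten and power-of-two memory suffixes.
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}


def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity such as "500m" or "1.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" or "1G" to bytes."""
    for suffix, factor in {**_BINARY, **_DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # plain number of bytes


print(parse_cpu("500m"), parse_cpu("0.5"))          # 500 500
print(parse_memory("128Mi"), parse_memory("128M"))  # 134217728 128000000
```

The last line shows why the suffix matters: `128Mi` and `128M` differ by roughly 5%.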
Quality of Service (QoS) Classes¶
Based on the request and limit settings, Kubernetes assigns a specific Quality of Service (QoS) class to each Pod.^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md] This classification dictates the priority and eviction behavior of the Pod when system resources are constrained^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
Guaranteed¶
A Pod is classified as Guaranteed if every container within it meets the following conditions^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]:
* request.memory == limit.memory
* request.cpu == limit.cpu
Behavior: These Pods have the highest priority. They are typically not killed or throttled unless they exceed their resource limits and no lower-priority Pods can be evicted^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
Burstable¶
A Pod is classified as Burstable if^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]:
* It is not a Guaranteed Pod.
* At least one container in the Pod has a memory or CPU request set.
Behavior: These Pods have a minimum resource guarantee but can use more resources if available. In the absence of BestEffort Pods, if system capacity is insufficient, Burstable Pods are the first to be killed^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
BestEffort¶
A Pod is classified as BestEffort if no containers within it have any request or limit values set^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
Behavior: These Pods have the lowest priority. When system memory is insufficient, they are the first candidates for eviction^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
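The three classification rules can be summarized in a short sketch. The function `qos_class` and its input layout are hypothetical. It compares quantity strings literally (so `"1"` and `"1000m"` would not be treated as equal) and does not model any defaulting the API server may apply, such as filling an unset request from the limit.

```python
def qos_class(containers):
    """Classify a Pod from its containers' resource settings.

    containers is a list of dicts, each with optional "requests" and
    "limits" dicts keyed by "cpu" and "memory". (Illustrative model only.)
    """
    requests = [c.get("requests", {}) for c in containers]
    limits = [c.get("limits", {}) for c in containers]

    # BestEffort: no container sets any request or limit.
    if not any(requests) and not any(limits):
        return "BestEffort"

    # Guaranteed: every container sets CPU and memory with request == limit.
    guaranteed = all(
        res in req and res in lim and req[res] == lim[res]
        for req, lim in zip(requests, limits)
        for res in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


pod = [{"requests": {"cpu": "500m", "memory": "128Mi"},
        "limits": {"cpu": "500m", "memory": "128Mi"}}]
print(qos_class(pod))                             # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}}]))  # Burstable
print(qos_class([{}]))                            # BestEffort
```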
Default Constraints and Policies¶
By default, containers in Kubernetes run with unlimited resources^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md]. To enforce constraints within a specific scope, administrators can use LimitRange policies attached to a [[Namespaces|Namespace]]^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md].
LimitRange policies can^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md]:
* Enforce minimum and maximum resource usage per Pod or Container.
* Constrain the ratio of request to limit.
* Set default request/limit values for the Namespace, which are automatically injected into containers that do not define their own^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md].
When a Pod is created without explicit resource settings in a Namespace with a LimitRange, the cluster applies the default values. If a Pod explicitly requests resources that violate the defined min/max constraints, creation is forbidden^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md].
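The defaulting and rejection behavior can be modeled roughly as follows. This is a simplified sketch, not the real admission logic: the function `admit` is hypothetical, values are plain numbers in base units, and the keys `default`, `defaultRequest`, `min`, and `max` mirror the field names of a LimitRange item.

```python
def admit(container, limit_range):
    """Apply LimitRange defaults, then enforce min/max constraints.

    Returns the effective requests and limits, or raises ValueError when
    the container violates a constraint. (Illustrative model only.)
    """
    # Explicit container settings override the namespace defaults.
    requests = {**limit_range.get("defaultRequest", {}), **container.get("requests", {})}
    limits = {**limit_range.get("default", {}), **container.get("limits", {})}

    for resource, floor in limit_range.get("min", {}).items():
        if resource in requests and requests[resource] < floor:
            raise ValueError(f"forbidden: {resource} request {requests[resource]} is below min {floor}")
    for resource, ceiling in limit_range.get("max", {}).items():
        if resource in limits and limits[resource] > ceiling:
            raise ValueError(f"forbidden: {resource} limit {limits[resource]} exceeds max {ceiling}")
    return {"requests": requests, "limits": limits}


# CPU amounts in millicores.
limit_range = {
    "default": {"cpu": 500},
    "defaultRequest": {"cpu": 250},
    "min": {"cpu": 100},
    "max": {"cpu": 1000},
}

print(admit({}, limit_range))  # {'requests': {'cpu': 250}, 'limits': {'cpu': 500}}
```

Calling `admit({"limits": {"cpu": 2000}}, limit_range)` raises `ValueError` in this model, which corresponds to the forbidden creation described above.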
Related Concepts¶
- [[Namespaces]]
- [[LimitRange]]
- Kubernetes
Sources¶
- 400-devops-06-kubernetes-k8s-ithelp-day21-readme.md
- 400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md