
Kubernetes resource requests and limits

Kubernetes resource requests and limits are the mechanisms used to manage compute resources (CPU and memory) within a cluster.^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md] Kubernetes is a cluster management platform whose nodes (physical or virtual machines) host the workloads. It therefore has to track resource usage across the whole platform so that it can allocate resources to containers efficiently and ensure they have enough to operate throughout their lifecycle.^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]

Core Concepts

Kubernetes provides two primary types of constraints to ensure resources are scheduled effectively and fairly while maximizing utilization^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]:

Requests

A request specifies the minimum amount of a resource that a container needs^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

  • It is the basis for scheduling decisions: a Pod is only scheduled to a node whose allocatable resources, after subtracting the requests of Pods already placed there, are greater than or equal to the requested amount^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
  • The value must be greater than or equal to zero and cannot exceed the node's capacity^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
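
The scheduling rule can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which only CPU and memory requests matter; the real scheduler also weighs other factors such as affinity and taints. The `Resources` type and `pod_fits` function are hypothetical names introduced for illustration, not part of any Kubernetes API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resources:
    """CPU in millicores and memory in bytes (hypothetical helper type)."""
    cpu_millicores: int
    memory_bytes: int


def pod_fits(allocatable: Resources, placed: List[Resources], pod: Resources) -> bool:
    """Return True if the node can still cover the Pod's requests.

    Only requests are compared; actual usage and limits play no part here.
    """
    free_cpu = allocatable.cpu_millicores - sum(r.cpu_millicores for r in placed)
    free_mem = allocatable.memory_bytes - sum(r.memory_bytes for r in placed)
    return pod.cpu_millicores <= free_cpu and pod.memory_bytes <= free_mem


GIB = 1024 ** 3
node = Resources(cpu_millicores=4000, memory_bytes=8 * GIB)
# Requests of Pods already placed on the node: 3500m CPU and 6 GiB in total.
placed = [Resources(1500, 2 * GIB), Resources(2000, 4 * GIB)]

print(pod_fits(node, placed, Resources(500, 1 * GIB)))  # exactly 500m CPU is still free
print(pod_fits(node, placed, Resources(600, 1 * GIB)))  # asks for more CPU than is free
```

The check uses requests rather than measured usage, which is why a node can reject a Pod even while its CPUs are mostly idle.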

Limits

A limit defines the maximum amount of a resource that a container can consume^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

  • A limit of 0, or a limit that is left unset, means no constraint is applied; usage is then bounded only by what the node has available^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
  • The limit must be greater than or equal to the request, but the limit itself has no fixed upper bound^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
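
The rules for requests and limits can be summarized as a small validation sketch. It assumes values are already plain numbers in a common unit and models "no limit" as `None`; `check_request_and_limit` is a hypothetical helper, not a Kubernetes function.

```python
from typing import List, Optional


def check_request_and_limit(request: float, limit: Optional[float]) -> List[str]:
    """Return a list of rule violations (empty when the pair is valid).

    limit=None stands for "no limit set", i.e. no constraint is applied.
    """
    problems = []
    if request < 0:
        problems.append("request must be >= 0")
    if limit is not None and limit < request:
        problems.append("limit must be >= request")
    return problems


print(check_request_and_limit(0.25, 0.5))   # valid pair
print(check_request_and_limit(0.5, 0.25))   # limit below request
print(check_request_and_limit(0.5, None))   # no limit set
```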

Resource Units

Kubernetes abstracts underlying processor architectures into compute resources exposed as basic units^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

  • CPU: The unit is based on cores. One CPU unit is equivalent to:
    • 1 AWS vCPU
    • 1 GCP Core
    • 1 Azure vCore
    • 1 Intel Hyperthread (on supported hardware)^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]
  • Memory: The unit is bytes. Values can be written as plain integers, with decimal suffixes (E, P, T, G, M, k), or with their power-of-two equivalents (Ei, Pi, Ti, Gi, Mi, Ki)^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
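
As an illustration of how these units translate into numbers, the sketch below converts a small subset of the quantity notation: CPU values such as `0.5` or `500m` (millicores, one thousandth of a CPU unit) and memory values with decimal or power-of-two suffixes. The function names are hypothetical, and the sketch does not attempt to cover the full Kubernetes quantity grammar (for example, exponent notation is not handled).

```python
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "500m", "2" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi", "1G" or "1024" to bytes."""
    # Two-letter binary suffixes are checked before one-letter decimal ones.
    for suffix, factor in {**BINARY, **DECIMAL}.items():
        if quantity.endswith(suffix):
            return round(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


print(parse_cpu_millicores("500m"), parse_cpu_millicores("0.5"))
print(parse_memory_bytes("128Mi"), parse_memory_bytes("1G"))
```

Note that `128M` and `128Mi` differ: the first is 128,000,000 bytes, the second 134,217,728 bytes.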

Quality of Service (QoS) Classes

Based on the request and limit settings, Kubernetes assigns a specific Quality of Service (QoS) class to each Pod.^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md] This classification dictates the priority and eviction behavior of the Pod when system resources are constrained^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

Guaranteed

A Pod is classified as Guaranteed if every container within it sets both a request and a limit for CPU and memory, and the two values match^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]:

  • request.memory == limit.memory
  • request.cpu == limit.cpu

Behavior: These Pods have the highest priority. They are not killed or throttled unless they exceed their own resource limits; beyond that, they are only at risk when the node runs short of resources and no lower-priority Pods remain to be evicted^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

Burstable

A Pod is classified as Burstable if^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md]:

  • It does not meet the criteria for Guaranteed.
  • At least one container in the Pod has a memory or CPU request or limit set.

Behavior: These Pods have a minimum resource guarantee but can use more when the node has spare capacity. When the node runs short of resources and no BestEffort Pods remain, Burstable Pods are the next to be killed^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

BestEffort

A Pod is classified as BestEffort if no containers within it have any request or limit values set^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].

Behavior: These Pods have the lowest priority. When system memory is insufficient, they are the first candidates for eviction^[400-devops-06-kubernetes-k8s-ithelp-day21-readme.md].
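
The three classification rules can be condensed into one function. This is a minimal sketch, not the kubelet's implementation: each container is modeled as a plain dict with optional `requests` and `limits` maps, values are compared exactly as written (so `"1"` and `"1000m"` would not be treated as equal), and an omitted request is assumed to default to the limit. The name `qos_class` is hypothetical.

```python
from typing import Dict, List

Container = Dict[str, Dict[str, str]]


def qos_class(containers: List[Container]) -> str:
    """Classify a Pod as Guaranteed, Burstable or BestEffort."""
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Assumption: a request that is left out takes the limit's value.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


full = {"requests": {"cpu": "500m", "memory": "128Mi"},
        "limits": {"cpu": "500m", "memory": "128Mi"}}

print(qos_class([full]))                          # every value set and matching
print(qos_class([{"requests": {"cpu": "250m"}}])) # only a CPU request
print(qos_class([{}]))                            # nothing set at all
print(qos_class([full, {}]))                      # one container without settings
```

The last call shows that the class is a property of the whole Pod: a single container without settings is enough to move the Pod out of Guaranteed.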

Default Constraints and Policies

By default, containers in Kubernetes run without any resource constraints^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md]. To enforce constraints within a specific scope, administrators can attach LimitRange policies to a [[Namespaces|Namespace]]^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md].

LimitRange policies can^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md]:

  • Enforce minimum and maximum resource usage per Pod or Container.
  • Constrain the ratio of request to limit.
  • Set default request/limit values for the Namespace, which are automatically injected into containers that do not define their own.

When a Pod is created without explicit resource settings in a Namespace with a LimitRange, the cluster applies the default values. If a Pod explicitly requests resources that violate the defined min/max constraints, creation is forbidden^[400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md].
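
The defaulting and validation behavior can be sketched as follows. This is a simplified model under stated assumptions, not the admission controller's code: amounts are plain integers (millicores or bytes), the minimum is checked only against requests and the maximum only against limits, and a container that sets a limit without a request is assumed to receive a request equal to that limit. The keys `default`, `defaultRequest`, `min`, `max` and `maxLimitRequestRatio` mirror LimitRange field names; `apply_limit_range` itself is hypothetical.

```python
from typing import Dict

Amounts = Dict[str, int]  # resource name -> millicores (cpu) or bytes (memory)


def apply_limit_range(container: Dict[str, Amounts],
                      limit_range: Dict[str, Amounts]) -> Dict[str, Amounts]:
    """Return the container's effective settings, or raise ValueError."""
    requests = dict(container.get("requests", {}))
    limits = dict(container.get("limits", {}))

    # Assumption: a limit without a request implies request == limit.
    for resource, value in limits.items():
        requests.setdefault(resource, value)

    # Inject the Namespace defaults only where the container set nothing.
    for resource, value in limit_range.get("default", {}).items():
        limits.setdefault(resource, value)
    for resource, value in limit_range.get("defaultRequest", {}).items():
        requests.setdefault(resource, value)

    for resource, minimum in limit_range.get("min", {}).items():
        if resource in requests and requests[resource] < minimum:
            raise ValueError(f"{resource} request is below the minimum {minimum}")
    for resource, maximum in limit_range.get("max", {}).items():
        if resource in limits and limits[resource] > maximum:
            raise ValueError(f"{resource} limit is above the maximum {maximum}")
    for resource, ratio in limit_range.get("maxLimitRequestRatio", {}).items():
        if resource in requests and resource in limits and requests[resource] > 0:
            if limits[resource] / requests[resource] > ratio:
                raise ValueError(f"{resource} limit/request ratio exceeds {ratio}")

    return {"requests": requests, "limits": limits}


policy = {"default": {"cpu": 500}, "defaultRequest": {"cpu": 250},
          "min": {"cpu": 100}, "max": {"cpu": 1000}}

print(apply_limit_range({}, policy))                        # defaults injected
print(apply_limit_range({"limits": {"cpu": 800}}, policy))  # request follows the limit
```

Calling `apply_limit_range({"requests": {"cpu": 50}}, policy)` raises `ValueError`, mirroring the forbidden creation described above.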

Sources

  • 400-devops-06-kubernetes-k8s-ithelp-day21-readme.md
  • 400-devops__06-Kubernetes__k8s-ithelp__Day23__README.md