Kubernetes Request and Limit

In Kubernetes, Requests and Limits are the mechanisms used to manage compute resources such as CPU and memory. Because a cluster shares resources across many nodes, it must track usage and allocate resources efficiently so that containers run reliably while utilization stays high^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].

Definitions

Request

The Request is the minimum amount of a resource guaranteed to a container.^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md] The scheduler uses it as a placement constraint: a Pod is scheduled onto a node only if the node's remaining allocatable resources can cover the Pod's request values^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
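The placement rule above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the scheduler's real implementation; the `Resources` type and the `fits` function are hypothetical names introduced here, and CPU is expressed in millicores for convenience.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Hypothetical container for CPU (millicores) and memory (bytes)."""
    cpu_millicores: int
    memory_bytes: int


def fits(node_allocatable: Resources,
         already_requested: Resources,
         pod_request: Resources) -> bool:
    """Return True if the Pod's requests fit in what the node has left.

    Only requests are compared; actual usage and limits play no role here.
    """
    free_cpu = node_allocatable.cpu_millicores - already_requested.cpu_millicores
    free_mem = node_allocatable.memory_bytes - already_requested.memory_bytes
    return (pod_request.cpu_millicores <= free_cpu
            and pod_request.memory_bytes <= free_mem)


GIB = 1024 ** 3
node = Resources(cpu_millicores=4000, memory_bytes=8 * GIB)
used = Resources(cpu_millicores=3500, memory_bytes=2 * GIB)
print(fits(node, used, Resources(500, 1 * GIB)))   # request fits exactly
print(fits(node, used, Resources(501, 1 * GIB)))   # one millicore too many
```

Note that the check looks at the sum of requests already placed on the node, not at what those containers are currently consuming.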

Limit

The Limit is the maximum amount of a resource a container is allowed to use^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]. If a limit is set to 0 (or omitted), the container has no upper bound and can consume resources up to the node's full capacity^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].

Relationship

For a configuration to be valid, the two values must satisfy the following constraints^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]:

  • Request: Must be greater than or equal to 0 and less than or equal to the Node's allocatable capacity.
    • Formula: 0 <= request <= Node Allocatable
  • Limit: Must be greater than or equal to the request.
    • Formula: request <= limit <= Infinity

When the limit is set higher than the request, the container is guaranteed the requested amount but can burst up to the limit if resources are available^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
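As a rough sketch, the two formulas can be checked like this. The `validate` function is a hypothetical helper written for illustration and does not mirror the API server's actual validation; it assumes an omitted limit is passed as `None` and treated as infinity.

```python
import math
from typing import List, Optional


def validate(request: float,
             limit: Optional[float],
             node_allocatable: float) -> List[str]:
    """Collect violations of 0 <= request <= allocatable and request <= limit."""
    errors = []
    # Assumption: None stands for "no limit set", i.e. an unbounded limit.
    effective_limit = math.inf if limit is None else limit

    if request < 0:
        errors.append("request must be >= 0")
    if request > node_allocatable:
        errors.append("request exceeds node allocatable")
    if effective_limit < request:
        errors.append("limit must be >= request")
    return errors


print(validate(0.5, 1.0, 4.0))   # burstable range: request below limit
print(validate(2.0, 1.0, 4.0))   # limit below request
print(validate(1.0, None, 4.0))  # no limit set
```

An empty list means both constraints hold; any message in the list names the constraint that was broken.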

Resource Types

Kubernetes abstracts hardware into standard units for scheduling^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]:

  • CPU: The unit is "cores". One CPU unit is equivalent to:
    • 1 AWS vCPU
    • 1 GCP Core
    • 1 Azure vCore
    • 1 Intel Hyperthread (on supporting hardware)
  • Memory: The unit is bytes. Values can be expressed as plain integers or with a suffix: the decimal suffixes E, P, T, G, M, k represent powers of 1000, and the binary suffixes Ei, Pi, Ti, Gi, Mi, Ki represent powers of 1024^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
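To make the units concrete, here is a minimal sketch that converts quantity strings to plain numbers. It assumes the usual Kubernetes conventions: decimal suffixes (k, M, G, T, P, E) are powers of 1000, binary suffixes (Ki, Mi, Gi, Ti, Pi, Ei) are powers of 1024, and a CPU value ending in `m` is in millicores. The function names are hypothetical, and this is a simplified parser, not the full quantity grammar.

```python
from decimal import Decimal

# Binary suffixes: powers of 1024.
_BINARY = {s: 1024 ** (i + 1)
           for i, s in enumerate(["Ki", "Mi", "Gi", "Ti", "Pi", "Ei"])}
# Decimal suffixes: powers of 1000 (note the lowercase "k").
_DECIMAL = {s: 1000 ** (i + 1)
            for i, s in enumerate(["k", "M", "G", "T", "P", "E"])}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '128Mi' or '1G' to bytes."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-len(suffix)]) * factor)
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-len(suffix)]) * factor)
    return int(Decimal(quantity))  # plain number of bytes


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '500m', '0.5' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


print(memory_to_bytes("128Mi"))    # 134217728
print(memory_to_bytes("1G"))       # 1000000000
print(cpu_to_millicores("0.5"))    # 500
```

The difference between `1G` and `1Gi` is roughly 7%, which is why mixing the two suffix families in one manifest is a common source of confusion.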

Quality of Service (QoS) Classes

Based on the Request and Limit settings, Kubernetes assigns a Quality of Service (QoS) class to each Pod. The class does not influence scheduling; it determines how the Pod is treated, and in what order it is evicted, when a node runs short of resources^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].

Guaranteed

A Pod is classified as Guaranteed if every container within it sets both a request and a limit for CPU and for memory, and the two values are equal^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]:

  • request.cpu == limit.cpu
  • request.memory == limit.memory

These Pods have the highest priority and are the last to be evicted during resource exhaustion, unless they exceed their limits^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].

Burstable

A Pod is classified as Burstable if^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]:

  • It is not Guaranteed.
  • At least one container in the Pod has a request or limit set (for either CPU or memory).

These Pods have a guaranteed minimum resource but can use more if available. When the system is under memory pressure and no BestEffort Pods exist, Burstable Pods are the first to be targeted for eviction^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].

BestEffort

A Pod is classified as BestEffort if no container in the Pod has any CPU or memory request or limit set^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].

These Pods have the lowest priority. In scenarios where system resources are insufficient, they are the first to be killed or evicted to free up capacity^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
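The three classification rules can be combined into one function. The following is a minimal sketch under stated assumptions: containers are modeled as plain dictionaries with optional `requests` and `limits` maps (a hypothetical shape chosen for illustration), quantities are compared as raw strings without normalization, and a request that is omitted while a limit is present is assumed to default to that limit.

```python
from typing import Dict, List

# Hypothetical shape: {"requests": {"cpu": "500m"}, "limits": {"cpu": "1"}}
Container = Dict[str, Dict[str, str]]


def qos_class(containers: List[Container]) -> str:
    """Classify a Pod as Guaranteed, Burstable or BestEffort."""
    any_set = False      # does any container set a CPU/memory request or limit?
    guaranteed = True    # does every container have request == limit for both?

    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            if resource in requests or resource in limits:
                any_set = True
            limit = limits.get(resource)
            # Assumption: an omitted request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False

    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


equal = {"requests": {"cpu": "500m", "memory": "128Mi"},
         "limits": {"cpu": "500m", "memory": "128Mi"}}
print(qos_class([equal]))                           # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}}]))   # Burstable
print(qos_class([{}]))                              # BestEffort
```

Because the comparison is on raw strings, equivalent values written differently (for example `0.5` and `500m`) would be treated as unequal here; a fuller version would normalize quantities first.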

Sources

  • 400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md