Kubernetes QoS Model

The Kubernetes QoS (Quality of Service) Model is a classification system used to determine the priority and stability of Pods based on their resource requests and limits.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md] These classes are primarily used by the kubelet to decide which Pods to evict when a node is under resource pressure (Memory or Disk Pressure).^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]

Kubernetes classifies Pods into three QoS classes: Guaranteed, Burstable, and BestEffort.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md] These classes directly influence the eviction order: BestEffort Pods are the first to be terminated, followed by Burstable, and finally Guaranteed.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]

QoS Classes

Guaranteed

A Pod is classified as Guaranteed if every container within it meets the following conditions^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]:

  1. Both requests and limits are set for CPU and memory (a request that is left unset defaults to the limit).
  2. For each resource, the requests value is equal to the limits value.

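These Pods receive the highest priority and are the last candidates for eviction, typically only when the node is in a critical Memory Pressure state. A container whose memory usage exceeds its defined limit is OOM-killed rather than evicted, and CPU usage above the limit is throttled.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]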

Example Configuration^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]:

resources:
  limits:
    memory: "200Mi"
    cpu: "700m"
  requests:
    memory: "200Mi"
    cpu: "700m"

CPU Optimization (cpuset)

When a Pod is Guaranteed and its CPU request/limit is an integer number of cores, the kubelet can bind its containers to exclusive CPU cores using cpuset, provided the CPU Manager static policy is enabled on the node.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md] This avoids migration between CPUs and contention with other workloads, which can significantly improve performance for latency-sensitive applications.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]
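The integer-CPU condition can be checked mechanically. The following is a minimal sketch, not the kubelet's implementation: the helper names `cpu_millicores` and `eligible_for_exclusive_cpus` are hypothetical, the input is assumed to be the `resources` dict of a single-container Pod, and memory quantities are assumed to be written identically in requests and limits.

```python
def cpu_millicores(quantity):
    """Parse a Kubernetes CPU quantity such as "2", "0.5" or "700m" into millicores."""
    q = str(quantity)
    if q.endswith("m"):
        return int(q[:-1])
    return round(float(q) * 1000)


def eligible_for_exclusive_cpus(resources):
    """Return True if a single-container Pod is Guaranteed with a whole number of CPUs."""
    limits = resources.get("limits", {})
    # An unset request defaults to the limit of the same resource.
    requests = {**limits, **resources.get("requests", {})}
    if "cpu" not in limits or "memory" not in limits:
        return False
    if requests["memory"] != limits["memory"]:
        return False
    cpu_req = cpu_millicores(requests["cpu"])
    cpu_lim = cpu_millicores(limits["cpu"])
    # Exclusive cores need request == limit and a whole number of cores (>= 1).
    return cpu_req == cpu_lim and cpu_req >= 1000 and cpu_req % 1000 == 0


print(eligible_for_exclusive_cpus({"limits": {"cpu": "2", "memory": "1Gi"}}))       # True
print(eligible_for_exclusive_cpus({"limits": {"cpu": "700m", "memory": "200Mi"}}))  # False
```

The second call uses the Guaranteed example above: the Pod is Guaranteed, but 700m is a fractional CPU, so it would run in the shared CPU pool rather than on dedicated cores.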

Burstable

A Pod is classified as Burstable if^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]:

  1. It does not meet the criteria for Guaranteed.
  2. At least one container in the Pod has a request or limit set.

These Pods have a lower priority than Guaranteed Pods. They are typically evicted only after BestEffort Pods, and mainly when their resource usage exceeds their requests; the kubelet ranks eviction candidates by comparing each Pod's usage with its requests while the node remains short of capacity.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]
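The usage-versus-requests comparison can be illustrated with a small ranking function. This is a simplified sketch under stated assumptions, not the kubelet's code: the `PodUsage` type and `eviction_rank` function are hypothetical, and the three sort criteria follow the order described in the Kubernetes node-pressure eviction documentation (usage above requests first, then Pod priority, then size of the overshoot).

```python
from dataclasses import dataclass


@dataclass
class PodUsage:
    name: str
    priority: int       # Pod priority value; 0 when no PriorityClass is set
    request_bytes: int  # sum of container memory requests (0 for BestEffort)
    usage_bytes: int    # current memory usage


def eviction_rank(pods):
    """Sort pods from first-evicted to last-evicted under memory pressure."""
    def key(p):
        over = p.usage_bytes > p.request_bytes
        return (
            0 if over else 1,                    # pods above their request go first
            p.priority,                          # then lower priority first
            -(p.usage_bytes - p.request_bytes),  # then largest overshoot first
        )
    return sorted(pods, key=key)


Mi = 1024 * 1024
pods = [
    PodUsage("guaranteed-db", 0, 200 * Mi, 150 * Mi),
    PodUsage("burstable-web", 0, 100 * Mi, 180 * Mi),
    PodUsage("besteffort-batch", 0, 0, 120 * Mi),
]
print([p.name for p in eviction_rank(pods)])
# ['besteffort-batch', 'burstable-web', 'guaranteed-db']
```

A BestEffort Pod has no requests, so any usage counts as overshoot, which is why it tends to sort ahead of a Burstable Pod. A Guaranteed Pod cannot use more memory than its request, since the request equals the limit, so it sorts last.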

Example Configuration^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]:

resources:
  limits:
    memory: "200Mi"
  requests:
    memory: "100Mi"

BestEffort

A Pod is classified as BestEffort if^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]:

  • No container in the Pod has requests or limits set.
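Taken together, the three definitions amount to a small decision procedure. The sketch below is illustrative only: `qos_class` is a hypothetical helper, its input is assumed to be a list of per-container `resources` dicts, and quantities are compared as plain strings (so "1" and "1000m" would not be treated as equal, and explicit zero values are not special-cased).

```python
QOS_RESOURCES = ("cpu", "memory")  # only these two resources count toward QoS


def qos_class(containers):
    """Classify a Pod from its containers' `resources` dicts."""
    any_set = False
    guaranteed = True
    for res in containers:
        limits = {k: v for k, v in res.get("limits", {}).items() if k in QOS_RESOURCES}
        requests = {k: v for k, v in res.get("requests", {}).items() if k in QOS_RESOURCES}
        # An unset request defaults to the limit of the same resource.
        requests = {**limits, **requests}
        if requests:
            any_set = True
        for name in QOS_RESOURCES:
            if name not in limits or requests[name] != limits[name]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"limits": {"memory": "200Mi", "cpu": "700m"},
                  "requests": {"memory": "200Mi", "cpu": "700m"}}]))  # Guaranteed
print(qos_class([{"limits": {"memory": "200Mi"},
                  "requests": {"memory": "100Mi"}}]))                 # Burstable
print(qos_class([{}]))                                                # BestEffort
```

On a running cluster, the class actually assigned to a Pod is reported in its `status.qosClass` field, which is the authoritative value.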

These Pods receive the lowest priority. They are the first candidates for eviction when the node faces resource scarcity.^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]

Example Configuration^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]:

# No resources section defined

Related Concepts

  • [[Kubernetes Scheduling]]: The kube-scheduler uses the requests value to filter nodes, while limits are enforced on the node at runtime (CPU throttling or OOM killing).
  • Kubernetes Resource Model: Defines requests (scheduling) vs limits (isolation/throttling).
  • [[Cgroups]]: The underlying Linux kernel feature used to enforce these resource limits.

Sources

^[400-devops-06-kubernetes-k8s-paas-kubernetes.md]