Kubernetes compute resource units
Kubernetes compute resource units are the standardized units used to measure and allocate processing power and memory within a cluster^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]. To manage these resources effectively, Kubernetes abstracts the underlying hardware architecture and exposes resources as basic units that containers can request and be limited to^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
Core Resource Types
Kubernetes primarily manages two types of compute resources:
- CPU: Represented in units based on cores^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]. One CPU unit is equivalent to:
  - 1 AWS vCPU
  - 1 GCP Core
  - 1 Azure vCore
  - 1 Hyperthread on a supported Intel processor^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]
- Memory: Represented in units based on bytes^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]. Memory can be expressed as a plain integer or with a suffix indicating the unit: the decimal suffixes E, P, T, G, M, k, or their power-of-two counterparts Ei, Pi, Ti, Gi, Mi, Ki^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
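As a rough illustration of how the memory suffixes map to bytes, the sketch below converts a quantity string into an integer byte count. The helper name `parse_memory` is hypothetical, and the code assumes the standard Kubernetes suffix set (decimal `k`, `M`, `G`, `T`, `P`, `E` and power-of-two `Ki`, `Mi`, `Gi`, `Ti`, `Pi`, `Ei`); it is not the parser Kubernetes itself uses.

```python
from decimal import Decimal

# Power-of-two suffixes (assumed standard Kubernetes set).
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
           "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
# Decimal suffixes; note the lowercase "k".
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9,
            "T": 10**12, "P": 10**15, "E": 10**18}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '128Mi' or '1G' to bytes.

    Hypothetical helper for illustration only.
    """
    quantity = quantity.strip()
    # Two-character binary suffixes are checked before one-character ones.
    for table in (_BINARY, _DECIMAL):
        for suffix, factor in table.items():
            if quantity.endswith(suffix):
                number = quantity[: -len(suffix)]
                return int(Decimal(number) * factor)
    # No suffix: the value is already a plain number of bytes.
    return int(Decimal(quantity))


print(parse_memory("128Mi"))  # 134217728
print(parse_memory("1G"))     # 1000000000
```

The two results differ in base: `128Mi` is 128 × 2^20 bytes, while `1G` is 10^9 bytes, which is why mixing the two suffix families in one manifest is easy to get wrong.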
Resource Management: Requests and Limits
To allocate these resources, Kubernetes utilizes requests and limits to balance priority and fairness^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
Request
- Represents the minimum amount of resources required by a container^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- Is the value the scheduler relies on when placing Pods; a Pod is only scheduled to a node if the node's allocatable resources are greater than or equal to the request^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- Formula: `0 <= request <= Node Allocatable`^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]
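The scheduling condition above can be sketched as a simple comparison. The function name `fits_on_node` and the `already_requested` parameter are hypothetical; the latter reflects the assumption that requests of Pods already placed on the node count against its allocatable capacity. Values are plain integers in a single unit (for example bytes, or thousandths of a CPU).

```python
def fits_on_node(pod_request: int,
                 node_allocatable: int,
                 already_requested: int = 0) -> bool:
    """Return True if the request fits into the node's remaining capacity.

    Illustrative sketch only, not the real scheduler logic.
    """
    if pod_request < 0:
        # The formula requires 0 <= request.
        raise ValueError("request must be non-negative")
    # Assumption: requests of Pods already on the node are subtracted first.
    return already_requested + pod_request <= node_allocatable


# A node with 2000 units allocatable, 1600 of which are already requested.
print(fits_on_node(500, 2000, already_requested=1600))  # False
print(fits_on_node(400, 2000, already_requested=1600))  # True
```

Note that the check uses requests only; actual usage on the node plays no part in this comparison.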
Limit
- Represents the maximum amount of resources a container can use^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- Setting a limit to 0 indicates no cap is placed on resource usage^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- Formula: `request <= limit <= Infinity`^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]
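Taken together, the two formulas describe a valid range for each resource. The sketch below checks a single request/limit pair against them. The function name `check_request_and_limit` is hypothetical, and treating a limit of `None` or `0` as "no cap" follows the description above rather than any confirmed API behavior.

```python
import math
from typing import List, Optional


def check_request_and_limit(request: float,
                            limit: Optional[float],
                            node_allocatable: float) -> List[str]:
    """Return a list of violated constraints (empty if the pair is valid).

    Illustrative sketch only.
    """
    # Assumption from the text above: an unset or zero limit means no cap.
    effective_limit = math.inf if not limit else limit
    problems = []
    if not 0 <= request <= node_allocatable:
        problems.append("need 0 <= request <= Node Allocatable")
    if not request <= effective_limit:
        problems.append("need request <= limit")
    return problems


print(check_request_and_limit(500, 1000, 4000))  # []
print(check_request_and_limit(500, 250, 4000))   # ['need request <= limit']
```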
Impact on Quality of Service (QoS)
The configuration of requests and limits determines the Pod's Quality of Service (QoS) class, which dictates its priority and eviction behavior when resources are scarce^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
Guaranteed
- Assigned when `request.memory == limit.memory` AND `request.cpu == limit.cpu` for every container in the Pod^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- These Pods have the highest priority and are generally not killed or throttled unless they exceed their limits, or resources run short and no lower-priority Pods can be evicted^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
Burstable
- Assigned if the Pod is not `Guaranteed` but at least one container has set a `request` for memory or CPU^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- These Pods have a minimum resource guarantee but can use more resources if available^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- If no `BestEffort` Pods exist and system capacity is insufficient, these are the first to be killed^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
BestEffort
- Assigned when all containers in a Pod have no `request` or `limit` set^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- These have the lowest priority^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
- They are the first targets for eviction when system memory is insufficient^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md].
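The three class rules above can be sketched as one small function. The name `qos_class` and the dictionary layout (each container as a dict with optional `requests` and `limits` maps) are hypothetical. Two simplifications are assumed: quantities are compared as raw strings, so `"1"` and `"1000m"` would count as different, and a missing request is taken to default to the corresponding limit.

```python
from typing import Dict, List

Container = Dict[str, Dict[str, str]]


def qos_class(containers: List[Container]) -> str:
    """Classify a Pod as Guaranteed, Burstable, or BestEffort.

    Illustrative sketch only; compares quantities as raw strings.
    """
    anything_set = False
    guaranteed = True
    for container in containers:
        requests = dict(container.get("requests", {}))
        limits = container.get("limits", {})
        if requests or limits:
            anything_set = True
        for resource in ("cpu", "memory"):
            # Assumption: a missing request defaults to the limit.
            requests.setdefault(resource, limits.get(resource))
            if limits.get(resource) is None or requests[resource] != limits[resource]:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


pod = [{"requests": {"cpu": "500m", "memory": "128Mi"},
        "limits": {"cpu": "500m", "memory": "128Mi"}}]
print(qos_class(pod))                            # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}}]))  # Burstable
print(qos_class([{}]))                           # BestEffort
```

Because every container must satisfy the equality, adding a single sidecar without limits is enough to move an otherwise `Guaranteed` Pod into `Burstable`.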
Related Concepts
- [[SOP]]: Creating Standard Operating Procedures for setting resource limits to prevent resource exhaustion.
- [[流程化筆記]] (process-oriented notes): Documenting the workflow for calculating appropriate requests and limits for new deployments.
Sources
^[400-devops__06-Kubernetes__k8s-ithelp__Day21__README.md]