Kubernetes custom resources for autoscaling

In Kubernetes, many autoscaling capabilities are implemented through Custom Resources (CRs) and Custom Resource Definitions (CRDs) rather than as part of the core system.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] This modular architecture allows specialized features to be added or updated rapidly, though it often requires manual installation of components like the Metrics Server or Vertical Pod Autoscaler (VPA), as they are not always included in default distributions.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]

Vertical Pod Autoscaler (VPA)

The Vertical Pod Autoscaler (VPA) is a custom resource that automates the sizing of CPU and memory for a workload's containers.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] It analyzes usage history to recommend, and optionally apply, suitable requests and limits, which reduces the need for manual tuning.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]

To apply changes, VPA typically evicts the affected Pods so that they are recreated with the updated requests.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] Because VPA modifies resource requests directly, it generally should not be combined with the Horizontal Pod Autoscaler (HPA) on the same resource metrics (CPU or memory): the two controllers would react to the same signal and work against each other. The combination is workable when the HPA scales on custom metrics instead, or when a multidimensional scaling strategy is used.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
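As a minimal sketch of what such a resource looks like, the Python below builds a VerticalPodAutoscaler manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. It assumes the VPA components and their CRD are already installed in the cluster. The helper name `build_vpa_manifest` and the names `web-vpa` and `web` are hypothetical, chosen only for illustration.

```python
import json

# Update modes defined by the VPA API; "Off" only produces recommendations.
VALID_UPDATE_MODES = {"Off", "Initial", "Recreate", "Auto"}


def build_vpa_manifest(name, deployment, update_mode="Auto"):
    """Return a VerticalPodAutoscaler manifest targeting a Deployment.

    `name` and `deployment` are illustrative placeholders supplied by the caller.
    """
    if update_mode not in VALID_UPDATE_MODES:
        raise ValueError(f"unsupported updateMode: {update_mode!r}")
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose Pods the VPA observes and resizes.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "updatePolicy": {"updateMode": update_mode},
        },
    }


if __name__ == "__main__":
    # Hypothetical names; pipe the output to `kubectl apply -f -` to try it.
    print(json.dumps(build_vpa_manifest("web-vpa", "web"), indent=2))
```

Starting with `update_mode="Off"` is a cautious way to inspect recommendations before letting VPA replace any Pods.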

Metrics Server

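The Metrics Server is a foundational cluster add-on that serves as the primary source of resource usage data, such as CPU and memory utilization.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] Strictly speaking, it is not a custom resource: it is a separately installed component that exposes the Metrics API (`metrics.k8s.io`) through the API aggregation layer. Autoscalers depend on this data to make scaling decisions; without the Metrics Server installed, CPU- and memory-based scaling by the HPA and recommendations from the VPA have no resource metrics to work from.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]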
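To make the shape of that data concrete, the following sketch totals CPU and memory usage from a `PodMetrics` object of the kind returned by the Metrics API (for example via `kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods`). The sample payload is invented for illustration, and the quantity parsers handle only a subset of Kubernetes quantity suffixes, so treat this as a simplification rather than a complete parser.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes (assumption: enough for typical
# Metrics API output; other suffixes are not handled here).
CPU_SUFFIXES = {"n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3")}
MEMORY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as '250m' or '5000000n' to millicores."""
    for suffix, factor in CPU_SUFFIXES.items():
        if quantity.endswith(suffix):
            return float(Decimal(quantity[: -len(suffix)]) * factor * 1000)
    return float(Decimal(quantity) * 1000)  # plain cores, e.g. '2'


def memory_to_bytes(quantity):
    """Convert a memory quantity such as '128Mi' to bytes."""
    for suffix, factor in MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # plain bytes


def pod_usage(pod_metrics):
    """Sum usage over all containers: (millicores, bytes)."""
    containers = pod_metrics["containers"]
    cpu = sum(cpu_to_millicores(c["usage"]["cpu"]) for c in containers)
    memory = sum(memory_to_bytes(c["usage"]["memory"]) for c in containers)
    return cpu, memory


# Invented sample in the shape of a metrics.k8s.io/v1beta1 PodMetrics object.
SAMPLE_POD_METRICS = {
    "kind": "PodMetrics",
    "apiVersion": "metrics.k8s.io/v1beta1",
    "metadata": {"name": "web-0", "namespace": "default"},
    "window": "30s",
    "containers": [
        {"name": "app", "usage": {"cpu": "250m", "memory": "128Mi"}},
        {"name": "sidecar", "usage": {"cpu": "5000000n", "memory": "20480Ki"}},
    ],
}

if __name__ == "__main__":
    print(pod_usage(SAMPLE_POD_METRICS))
```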

Multidim Pod Autoscaler (MPA)

Multidim Pod Autoscaler (MPA) is an advanced autoscaling capability, currently available primarily on platforms like GCP GKE, that combines horizontal and vertical strategies.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md] It typically scales the replica count on CPU, as the Horizontal Pod Autoscaler does, while adjusting memory requests, as the VPA does, so that both dimensions are managed at the same time.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]
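The split between the two dimensions can be illustrated with a small conceptual sketch. It is not GKE's implementation: the replica count uses the ratio-based formula documented for the Horizontal Pod Autoscaler (without its tolerance and stabilization logic), and the memory request is an assumed "observed peak plus headroom" rule standing in for a real vertical recommender. All function names and parameters are hypothetical.

```python
import math


def desired_replicas(current_replicas, cpu_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Horizontal dimension: scale replicas by the ratio of actual to target CPU."""
    raw = math.ceil(current_replicas * cpu_utilization / target_utilization)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


def recommended_memory_request(samples_bytes, headroom=0.15):
    """Vertical dimension (assumed rule): observed peak plus a safety margin."""
    if not samples_bytes:
        raise ValueError("at least one memory sample is required")
    return int(max(samples_bytes) * (1 + headroom))


def multidim_decision(current_replicas, cpu_utilization, target_utilization,
                      memory_samples_bytes, min_replicas=1, max_replicas=10):
    """Combine both dimensions: CPU drives replica count, memory drives Pod size."""
    return {
        "replicas": desired_replicas(current_replicas, cpu_utilization,
                                     target_utilization, min_replicas, max_replicas),
        "memory_request_bytes": recommended_memory_request(memory_samples_bytes),
    }


if __name__ == "__main__":
    # Illustrative numbers only: 4 replicas at 90% CPU against a 60% target.
    print(multidim_decision(4, 90, 60, [200 * 1024**2, 240 * 1024**2]))
```

Keeping each resource under exactly one controller is what avoids the conflict that arises when HPA and VPA both react to the same metric.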

This feature highlights the rapid evolution of Kubernetes autoscaling APIs. Because the open-source community had not yet standardized multidimensional scaling at the time of writing, cloud providers implemented it as a custom feature.^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]

Sources

^[400-devops__06-Kubernetes__k8s-ithelp__Day25__README.md]