Prometheus monitoring architecture
Prometheus is an open-source monitoring and alerting system inspired by Borgmon, the monitoring system Google built for Borg.^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md] It uses a pull-based model to collect time-series data: it scrapes Metrics from targets over HTTP and stores them in a Time Series Database (TSDB) for time-based retrieval^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Core Architecture
The Prometheus architecture centers on several key components that handle data collection, storage, alerting, and visualization^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Data Collection and Retrieval
The core workflow relies on a Pull mechanism, where the Prometheus server actively scrapes Metrics from configured targets at regular intervals^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Prometheus Server: The central component responsible for scraping Metrics, storing them in the TSDB, and processing queries^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Service Discovery: Allows Prometheus to automatically detect and monitor targets, such as Kubernetes Pods or nodes, reducing manual configuration^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
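A single scrape in the pull model can be sketched as follows. This is an illustrative sketch only, not how the Prometheus server is implemented: `parse_exposition` and `scrape` are hypothetical helper names, and the parser handles only a simplified subset of the text exposition format (no sample timestamps, no escaping rules).

```python
import time
import urllib.request


def parse_exposition(text):
    """Parse a simplified subset of the text exposition format.

    Returns a list of (series, value) tuples, where series is the metric
    name plus its label set exactly as written, e.g. 'up{job="node"}'.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        # In this subset the value is the last space-separated field.
        series, _, value = line.rpartition(" ")
        samples.append((series.strip(), float(value)))
    return samples


def scrape(url, timeout=5.0):
    """Pull once from a target and stamp every sample with the scrape time."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    now = time.time()
    return [(series, now, value) for series, value in parse_exposition(body)]
```

Calling `scrape` on a fixed interval for every discovered target, and appending the results to a store keyed by series, approximates the collection loop described above.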
Metrics Sources
Prometheus aggregates data from various sources through Exporters, which expose Metrics on HTTP endpoints for the server to scrape^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Node Exporter: An agent deployed on hosts to expose hardware and OS-level Metrics, such as CPU and memory usage^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Kubernetes Components: Metrics are scraped directly from the `/metrics` APIs of core components like `kube-apiserver` and `kubelet`, providing insights into work queues (e.g., Controller queue length), QPS, and latency^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Kubernetes Objects: The Metrics Server provides core Metrics (Pod, Node, container stats) and acts as the primary successor to Heapster in the Kubernetes ecosystem^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
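The exporter side of this contract is small: serve plain-text Metrics on an HTTP endpoint. The sketch below uses only the Python standard library rather than an official client library. The metric names, the values in `METRICS`, and the default port are made up for illustration; a real exporter such as Node Exporter reads its values from the operating system.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process values; a real exporter would read them from the OS.
METRICS = {"demo_cpu_usage_ratio": 0.37, "demo_memory_used_bytes": 1024.0}


def render_metrics(metrics):
    """Render a name -> value mapping as one 'name value' line per metric."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Running `serve()` and pointing a scrape job at that port would let the server pull these values like any other target.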
Pushgateway
For scenarios where targets cannot be scraped (e.g., short-lived jobs), the Pushgateway serves as an intermediary. These targets actively push their Metrics to the Pushgateway, which Prometheus then scrapes^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
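A short-lived job might push its results as sketched below. The Pushgateway accepts Metrics in the text format on the path `/metrics/job/<job name>`; the helper name `build_push_request`, the host `pushgateway.example`, and the metric name are hypothetical. The sketch also assumes the job name contains no `/`, since such names need special encoding that is not handled here.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, metrics):
    """Build (but do not send) a PUT that replaces all metrics for a job."""
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Usage (hypothetical host): send the request just before the job exits.
# req = build_push_request("http://pushgateway.example:9091", "nightly_backup",
#                          {"backup_duration_seconds": 12.5})
# urllib.request.urlopen(req, timeout=5)
```

Building the request separately from sending it keeps the job's exit path easy to test without a running gateway.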
Alerting and Visualization
- Alertmanager: A standalone component that receives alerts from the Prometheus server. It handles deduplication, grouping, and routing notifications to configured receivers^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Grafana: A visualization tool used to create flexible, configurable dashboards for the Metrics stored in Prometheus^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
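The deduplication and grouping steps can be illustrated with a toy model. This is not Alertmanager's actual implementation: `group_alerts` is a hypothetical function, alerts are reduced to plain label dictionaries, and timing, silencing, and inhibition are ignored.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Deduplicate alerts by full label set, then bucket them by group_by labels."""
    seen = set()
    groups = defaultdict(list)
    for labels in alerts:
        fingerprint = tuple(sorted(labels.items()))
        if fingerprint in seen:  # identical label set: a duplicate, drop it
            continue
        seen.add(fingerprint)
        # Alerts sharing the same values for the group_by labels land together.
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)
```

Each resulting group would then be sent to its receiver as a single notification instead of one message per alert.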
Monitoring Methodology
When designing monitoring Metrics, it is recommended to follow industry-standard methodologies^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]:
- USE Method: Focuses on resources.
  - Utilization: The average time the resource is busy servicing work.
  - Saturation: The amount of extra work the resource cannot yet service, often visible as queue length.
  - Errors: The count of error events.
- RED Method: Focuses on services (Rate, Errors, Duration).
  - Rate: Requests per second.
  - Errors: Failed requests per second.
  - Duration: Request response time (latency).
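As a worked illustration of the RED method, the three signals can be computed from raw request records for one time window. The function name `red_summary`, the record shape, and the choice of a nearest-rank 95th percentile for Duration are assumptions made for this sketch; in practice these values are usually derived from counters and histograms by queries rather than from raw records.

```python
def red_summary(requests, window_seconds):
    """Compute Rate, Errors, and Duration for requests seen in one window.

    Each request is a (duration_seconds, is_error) tuple.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total = len(requests)
    errors = sum(1 for _, is_error in requests if is_error)
    durations = sorted(d for d, _ in requests)
    # Nearest-rank 95th percentile using integer math: rank = ceil(0.95 * total).
    rank = (95 * total + 99) // 100
    p95 = durations[rank - 1] if total else 0.0
    return {
        "rate": total / window_seconds,      # requests per second
        "errors": errors / window_seconds,   # failed requests per second
        "duration_p95": p95,                 # seconds
    }
```

A percentile is used for Duration instead of a mean because a few slow requests can hide behind a healthy average.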
Related Concepts
- [[Logging architecture]]
- [[Kubernetes ecosystem]]