Prometheus monitoring architecture¶

Prometheus is an open-source monitoring and alerting system inspired by Google's Borgmon.^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md] It uses a pull-based model to collect time-series data: it scrapes Metrics from targets over HTTP and stores them in a time series database (TSDB) for time-based retrieval^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

Core Architecture¶

The Prometheus architecture centers on several key components that handle data collection, storage, alerting, and visualization^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

Data Collection and Retrieval¶

The core workflow relies on a Pull mechanism, where the Prometheus server actively scrapes Metrics from configured targets at regular intervals^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

Prometheus Server: The central component responsible for scraping Metrics, storing them in the TSDB, and processing queries^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Service Discovery: Allows Prometheus to automatically detect and monitor targets, such as Kubernetes Pods or nodes, reducing manual configuration^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
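
The pull workflow can be illustrated with a small sketch. It is not how the Prometheus server is implemented; it only assumes a target that serves the text exposition format over HTTP. The names `parse_exposition`, `InMemoryTSDB`, and `scrape_once` are hypothetical, and the parser covers only a simplified subset of the format (no timestamps, no commas, spaces, or escaped quotes inside label values).

```python
import time
import urllib.request
from collections import defaultdict


def parse_exposition(text):
    """Parse `name value` and `name{k="v",...} value` lines (simplified subset)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, value = line.rsplit(None, 1)
        name, _, label_part = series.partition("{")
        labels = {}
        for pair in label_part.rstrip("}").split(","):
            if pair:
                key, _, val = pair.partition("=")
                labels[key.strip()] = val.strip().strip('"')
        samples.append((name, tuple(sorted(labels.items())), float(value)))
    return samples


class InMemoryTSDB:
    """Toy stand-in for a TSDB: one list of (timestamp, value) per series."""

    def __init__(self):
        self.series = defaultdict(list)

    def append(self, samples, timestamp):
        for name, labels, value in samples:
            self.series[(name, labels)].append((timestamp, value))

    def query_range(self, name, start, end):
        """Return the points of every series called `name` within [start, end]."""
        return {
            labels: [(t, v) for t, v in points if start <= t <= end]
            for (n, labels), points in self.series.items()
            if n == name
        }


def scrape_once(url, db):
    """Pull one scrape from `url` (hypothetical target) and store the samples."""
    with urllib.request.urlopen(url, timeout=5) as response:
        text = response.read().decode("utf-8")
    db.append(parse_exposition(text), time.time())
```

Calling `scrape_once` in a loop with a fixed sleep would approximate the regular scrape interval described above.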

Metrics Sources¶

Prometheus aggregates data from various sources through Exporters, which expose Metrics on HTTP endpoints for the server to scrape^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

Node Exporter: An agent deployed on hosts to expose hardware and OS-level Metrics, such as CPU and memory usage^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Kubernetes Components: Metrics are scraped directly from the /metrics endpoints of core components such as kube-apiserver and kubelet, providing insight into work queues (e.g., Controller queue length), QPS, and latency^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Kubernetes Objects: The Metrics Server provides core resource Metrics (CPU and memory usage for Pods, Nodes, and containers) and is the primary successor to Heapster in the Kubernetes ecosystem^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
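
To make the exporter idea concrete, here is a minimal sketch of a process that exposes Metrics on an HTTP endpoint in the text exposition format. It uses only the Python standard library instead of an official client library, and the metric names (`demo_cpu_count`, `demo_process_cpu_seconds_total`), the helper names, and the port are assumptions chosen for illustration.

```python
import os
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(metrics):
    """Render {name: (help, type, value)} as text exposition format."""
    lines = []
    for name, (help_text, metric_type, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect():
    """Gather a few illustrative values at scrape time (hypothetical metrics)."""
    return {
        "demo_cpu_count": ("Number of CPUs seen by the process.", "gauge",
                           os.cpu_count() or 0),
        "demo_process_cpu_seconds_total": ("CPU time used by this process.",
                                           "counter", time.process_time()),
    }


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics on the given (assumed) port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Running `serve()` and pointing a scrape job at `http://<host>:8000/metrics` would let a Prometheus server collect these values; real exporters such as Node Exporter expose far more Metrics than this sketch.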

Pushgateway¶

For scenarios where targets cannot be scraped (e.g., short-lived jobs), the Pushgateway serves as an intermediary. These targets actively push their Metrics to the Pushgateway, which Prometheus then scrapes^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
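
A short-lived job might push its Metrics along the following lines. This sketch relies on the Pushgateway HTTP API, which accepts text-format Metrics at `/metrics/job/<job>` followed by optional `/<label>/<value>` grouping pairs; the gateway address, job name, and metric name are placeholders, and `build_push_request` is a hypothetical helper.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway, job, body, grouping=None):
    """Build (but do not send) a PUT request that pushes `body` for `job`.

    PUT replaces all Metrics in the grouping key; POST would replace only
    Metrics with the same names.
    """
    path = "/metrics/job/" + quote(job, safe="")
    for label, value in (grouping or {}).items():
        path += f"/{quote(label, safe='')}/{quote(value, safe='')}"
    return urllib.request.Request(
        gateway.rstrip("/") + path,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(request):
    """Send the request; raises urllib.error.HTTPError on a non-2xx response."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

For example, a batch job could call `push(build_push_request("http://pushgateway.example:9091", "nightly_backup", "backup_last_success_timestamp_seconds 1700000000\n"))` just before exiting, and Prometheus would pick the value up on its next scrape of the Pushgateway.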

Alerting and Visualization¶

Alertmanager: A standalone component that receives alerts from the Prometheus server. It handles deduplication, grouping, and routing notifications to configured receivers^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Grafana: A visualization tool used to create flexible, configurable dashboards for the Metrics stored in Prometheus^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
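
The deduplication and grouping steps can be pictured with a small conceptual sketch. It does not reflect Alertmanager's actual implementation; it only assumes that each alert carries a set of labels and that alerts are grouped by a chosen subset of label names. The function name `group_alerts` and the sample label names are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Deduplicate alerts by their full label set, then group them.

    `alerts` is a list of dicts with a "labels" mapping; `group_by` is the
    list of label names that defines a notification group.
    """
    groups = defaultdict(dict)
    for alert in alerts:
        labels = alert["labels"]
        group_key = tuple((name, labels.get(name, "")) for name in group_by)
        fingerprint = tuple(sorted(labels.items()))
        # Alerts with identical labels overwrite each other (deduplication).
        groups[group_key][fingerprint] = alert
    return {key: list(members.values()) for key, members in groups.items()}
```

Each resulting group would then be routed as one notification to a receiver, so a burst of related alerts produces a single message instead of many.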

Monitoring Methodology¶

When designing monitoring Metrics, it is recommended to follow industry-standard methodologies^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]:

USE Method: Focuses on Resources.
- Utilization: The average share of time the resource is busy doing work (e.g., CPU busy percentage).
- Saturation: The amount of extra work the resource cannot service yet and must queue (e.g., queue length).
- Errors: The count of errors.
RED Method: Focuses on Services (Rate, Errors, Duration).
- Rate: Requests per second.
- Errors: Failed requests per second.
- Duration: Request response time (latency).
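
As a worked illustration of both methods, the sketch below derives the RED signals from a list of request records and the USE signals from raw resource counters. The record fields (`status`, `duration`), the function names, and the choice to count HTTP 5xx responses as errors are assumptions made for this example.

```python
def red_summary(requests, window_seconds):
    """Compute Rate, Errors, and Duration for requests seen in one window."""
    total = len(requests)
    failed = sum(1 for r in requests if r["status"] >= 500)
    durations = [r["duration"] for r in requests]
    return {
        "rate": total / window_seconds,            # requests per second
        "errors": failed / window_seconds,         # failed requests per second
        "duration_avg": sum(durations) / total if total else None,
        "duration_max": max(durations) if durations else None,
    }


def use_summary(busy_seconds, window_seconds, queue_length, error_count):
    """Compute Utilization, Saturation, and Errors for one resource."""
    return {
        "utilization": busy_seconds / window_seconds,  # fraction of time busy
        "saturation": queue_length,                    # work waiting in queue
        "errors": error_count,
    }
```

In a real deployment these values would usually come from counters and histograms queried in Prometheus instead of being computed from raw records, but the arithmetic is the same.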

[[Logging architecture]]
[[Kubernetes ecosystem]]

Sources¶

400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md