Prometheus monitoring architecture

Prometheus is an open-source monitoring and alerting system modeled on Google's Borgmon monitoring system.^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md] It uses a pull-based model to collect time-series data: it scrapes Metrics from targets over HTTP and stores them in a time series database (TSDB) for time-based retrieval.^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]

Core Architecture

The Prometheus architecture centers on several key components that handle data collection, storage, alerting, and visualization^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

Data Collection and Retrieval

The core workflow relies on a Pull mechanism, where the Prometheus server actively scrapes Metrics from configured targets at regular intervals^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

  • Prometheus Server: The central component responsible for scraping Metrics, storing them in the TSDB, and processing queries^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
  • Service Discovery: Allows Prometheus to automatically detect and monitor targets, such as Kubernetes Pods or nodes, reducing manual configuration^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
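The pull workflow above can be sketched in a few lines of Python. This is a minimal illustration, not how the Prometheus server is implemented: `parse_exposition`, `http_fetch`, and `scrape_once` are hypothetical names, the parser handles only simple sample lines of the text exposition format (it does not unescape label values and ignores timestamps), and the static `targets` list stands in for whatever service discovery would return.

```python
import re
import urllib.request
from typing import Callable, Dict, Iterable, List, Tuple

# A sample line looks like: metric_name{label="value",...} 12.5
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_exposition(text: str) -> List[Sample]:
    """Parse simple sample lines; comment lines (# HELP, # TYPE) are skipped."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # not a line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def http_fetch(target: str, timeout: float = 5.0) -> str:
    """Fetch the /metrics page of a host:port target over HTTP."""
    url = f"http://{target}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def scrape_once(targets: Iterable[str],
                fetch: Callable[[str], str] = http_fetch) -> List[Sample]:
    """Pull every target once and tag each sample with an `instance` label."""
    results: List[Sample] = []
    for target in targets:
        try:
            body = fetch(target)
        except OSError:
            # An unreachable target is recorded as up=0 instead of failing the run.
            results.append(("up", {"instance": target}, 0.0))
            continue
        results.append(("up", {"instance": target}, 1.0))
        for name, labels, value in parse_exposition(body):
            results.append((name, {**labels, "instance": target}, value))
    return results
```

A scheduler would call `scrape_once` at each scrape interval and append the returned samples, together with the scrape timestamp, to storage. Passing `fetch` as a parameter keeps the loop testable without a network.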

Metrics Sources

Prometheus aggregates data from various sources through Exporters, which expose Metrics on HTTP endpoints for the server to scrape^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].

  • Node Exporter: An agent deployed on hosts to expose hardware and OS-level Metrics, such as CPU and memory usage^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
  • Kubernetes Components: Metrics are directly scraped from the /metrics APIs of core components like kube-apiserver and kubelet, providing insights into work queues (e.g., Controller queue length), QPS, and latency^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
  • Kubernetes Objects: The Metrics Server provides core Metrics (Pod, Node, container stats) and acts as the primary successor to Heapster in the Kubernetes ecosystem^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
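To make the exporter idea concrete, here is a minimal sketch of a process that exposes a couple of host-level values on a `/metrics` endpoint in the text exposition format. It is an illustration only and not Node Exporter's code: the metric names `demo_cpu_count` and `demo_load1`, the helper names, and port 8000 are all assumptions chosen for the example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, Dict[str, str], float]


def _escape(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(samples: Iterable[Sample]) -> str:
    """Render samples as text exposition lines, one sample per line."""
    lines: List[str] = []
    for name, labels, value in samples:
        if labels:
            body = ",".join(f'{k}="{_escape(v)}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{body}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect_host_samples() -> List[Sample]:
    """Collect a few OS-level values (hypothetical metric names)."""
    samples: List[Sample] = [("demo_cpu_count", {}, float(os.cpu_count() or 0))]
    try:
        load1, _, _ = os.getloadavg()  # not available on every platform
        samples.append(("demo_load1", {}, load1))
    except (AttributeError, OSError):
        pass
    return samples


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect_host_samples()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve /metrics until interrupted (port chosen arbitrarily for the sketch)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint, which a Prometheus server could then scrape like any other target. Values are collected at request time, so each scrape sees the current state of the host.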

Pushgateway

For scenarios where targets cannot be scraped (e.g., short-lived jobs), the Pushgateway serves as an intermediary. These targets actively push their Metrics to the Pushgateway, which Prometheus then scrapes^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
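As a rough sketch of the push side, a short-lived job can send its Metrics to the Pushgateway with a single HTTP request to a `/metrics/job/<job>` path, optionally followed by label name and value pairs that form the grouping key. The helper name `build_push_request`, the gateway address, and the metric name used below are assumptions for illustration.

```python
import urllib.request
from typing import Dict, Optional
from urllib.parse import quote


def build_push_request(gateway: str, job: str, body: str,
                       grouping: Optional[Dict[str, str]] = None
                       ) -> urllib.request.Request:
    """Build (but do not send) a request that pushes `body` for one job."""
    path = "/metrics/job/" + quote(job, safe="")
    for label, value in (grouping or {}).items():
        path += f"/{quote(label, safe='')}/{quote(value, safe='')}"
    return urllib.request.Request(
        url=gateway.rstrip("/") + path,
        data=body.encode("utf-8"),
        # PUT replaces every metric previously pushed under this grouping key.
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
```

A batch job would finish with something like `urllib.request.urlopen(build_push_request("http://pushgateway.example:9091", "nightly_backup", "backup_last_success_timestamp_seconds 1700000000\n"))`. Prometheus then scrapes the Pushgateway like any other target, so the pushed values outlive the job that produced them.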

Alerting and Visualization

  • Alertmanager: A standalone component that receives alerts from the Prometheus server. It handles deduplication, grouping, and routing notifications to configured receivers^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
  • Grafana: A visualization tool used to create flexible, configurable dashboards for the Metrics stored in Prometheus^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
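The deduplication, grouping, and routing steps can be illustrated with a small in-memory model. This is a conceptual sketch under simplifying assumptions, not Alertmanager's implementation: the function names, the flat list of routes with exact-match labels, and the receiver names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Alert = Dict[str, Dict[str, str]]     # {"labels": {...}}
Route = Tuple[Dict[str, str], str]    # (labels that must match, receiver name)


def fingerprint(labels: Dict[str, str]) -> Tuple[Tuple[str, str], ...]:
    """Two alerts with identical label sets are treated as the same alert."""
    return tuple(sorted(labels.items()))


def pick_receiver(labels: Dict[str, str], routes: List[Route],
                  default_receiver: str) -> str:
    """Return the receiver of the first route whose labels all match."""
    for matchers, receiver in routes:
        if all(labels.get(k) == v for k, v in matchers.items()):
            return receiver
    return default_receiver


def plan_notifications(alerts: List[Alert], routes: List[Route],
                       default_receiver: str, group_by: List[str]):
    """Map (receiver, group key) to the deduplicated alerts in that group."""
    plan = defaultdict(dict)
    for alert in alerts:
        labels = alert["labels"]
        receiver = pick_receiver(labels, routes, default_receiver)
        group_key = tuple(labels.get(name, "") for name in group_by)
        # Keying by fingerprint collapses repeated copies of the same alert.
        plan[(receiver, group_key)][fingerprint(labels)] = alert
    return {key: list(by_fp.values()) for key, by_fp in plan.items()}
```

Each entry of the returned plan corresponds to one notification: a receiver gets a single message per group instead of one message per firing alert.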

Monitoring Methodology

When designing monitoring Metrics, it is recommended to follow industry-standard methodologies^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]:

  • USE Method: Focuses on resources.
    • Utilization: The average fraction of time a resource is busy servicing work.
    • Saturation: The amount of extra work the resource cannot service yet, typically visible as queue length.
    • Errors: The count of error events.
  • RED Method: Focuses on services (Rate, Errors, Duration).
    • Rate: Requests per second.
    • Errors: Errors per second.
    • Duration: Request response time (latency).
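As a worked example, the RED and USE signals above can be computed from raw observations over a fixed window. This is a sketch under stated assumptions: `RequestRecord`, `red_summary`, and `use_summary` are hypothetical names, and the 95th percentile uses the simple nearest-rank method, not the histogram-based estimate a monitoring system would normally use.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    duration_s: float  # how long the request took
    ok: bool           # False if the request failed


def red_summary(requests: List[RequestRecord], window_s: float) -> Dict[str, float]:
    """Rate, Errors, and Duration (p95) for the requests seen in one window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    count = len(requests)
    failed = sum(1 for r in requests if not r.ok)
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank percentile: the value at position ceil(0.95 * n), 1-based.
    p95 = durations[math.ceil(0.95 * count) - 1] if count else 0.0
    return {
        "rate": count / window_s,      # requests per second
        "errors": failed / window_s,   # failed requests per second
        "duration_p95": p95,           # seconds
    }


def use_summary(busy_s: float, window_s: float,
                queue_lengths: List[int], error_count: int) -> Dict[str, float]:
    """Utilization, Saturation, and Errors for one resource in one window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    saturation = sum(queue_lengths) / len(queue_lengths) if queue_lengths else 0.0
    return {
        "utilization": busy_s / window_s,  # fraction of the window spent busy
        "saturation": saturation,          # average sampled queue length
        "errors": error_count,
    }
```

For instance, four requests in a two-second window with one failure give a rate of 2.0 requests per second and 0.5 errors per second, and a resource that was busy for 30 of 60 seconds has a utilization of 0.5.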

Related

  • [[Logging architecture]]
  • [[Kubernetes ecosystem]]

Sources