ELK stack architecture¶
The ELK stack is a centralized logging architecture built from distributed components. It is designed to handle the high volume and wide dispersal of logs generated by containerized environments such as Kubernetes.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]
Overview¶
In cloud-native environments, where containers are constantly created, destroyed, migrated, or scaled, it is impractical to log into individual servers to troubleshoot issues.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] The ELK stack addresses this by providing a system to collect, transmit, store, analyze, and visualize log data efficiently.
The core stack consists of four main components (often referred to as ELK or ELKB):^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]
- Elasticsearch: A distributed search and analytics engine that stores and indexes the log data.
- Logstash: A data processing pipeline that ingests data from multiple sources, transforms it, and sends it to a "stash" (usually Elasticsearch).
- Kibana: A visualization layer that provides a Web UI for analyzing and viewing data stored in Elasticsearch.
- Filebeat: A lightweight log shipper that runs on edge nodes to collect logs and forward them.
Architecture Workflow¶
The data flow in this architecture typically follows a Publish-Subscribe pattern to ensure reliability and decoupling:
- Collection: A lightweight Filebeat agent runs close to the application (on the host, or in Kubernetes often as a sidecar container in the same Pod) to collect log streams and output them to a message queue.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]
- Buffering: Logs are published to Kafka, a high-throughput distributed message queue.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] Kafka serves as a buffer and organizes the incoming data into topics.
- Processing: Logstash consumes logs from Kafka asynchronously, filters and transforms them, and forwards the result to Elasticsearch.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]
- Storage: Processed logs are stored in Elasticsearch.
- Visualization: Users interact with Kibana to search, analyze, and visualize the log data.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]
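The five stages above can be sketched end to end with in-memory stand-ins. This is only an illustration of the data flow, not a real deployment: `kafka_topic`, `elasticsearch`, `collect`, `process`, and `search` are hypothetical names, and the `k8s-<env>` index naming is an assumption made for the example.

```python
import json
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins for the real services.
kafka_topic = deque()               # Buffering: plays the role of a Kafka topic
elasticsearch = defaultdict(list)   # Storage: index name -> list of documents


def collect(raw_lines):
    """Collection: the shipper publishes each raw log line to the topic."""
    for line in raw_lines:
        kafka_topic.append(line)


def process(env):
    """Processing: drain the topic, parse each line, and store the result."""
    while kafka_topic:
        line = kafka_topic.popleft()
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            doc = None
        if not isinstance(doc, dict):
            doc = {"message": line}  # keep non-JSON lines as plain text
        doc["env"] = env
        elasticsearch[f"k8s-{env}"].append(doc)


def search(index, keyword):
    """Visualization: a keyword lookup, similar in spirit to a Kibana search."""
    return [d for d in elasticsearch[index] if keyword in d.get("message", "")]


collect(['{"message": "NullPointerException in OrderService"}', "plain text line"])
process("test")
hits = search("k8s-test", "Exception")
print(hits)
```

Because the producer and the consumer only share the queue, either side can be replaced or restarted without the other knowing, which is the decoupling the Publish-Subscribe pattern is meant to provide.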
Component Details¶
Filebeat¶
Filebeat is a lightweight, resource-efficient log collector installed on servers.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] In Kubernetes deployments, it is often deployed as a sidecar container within the same Pod as the business application. It monitors specific log paths (e.g., /logm/*.log) and pushes data to Kafka topics.
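As a rough illustration of what a log shipper does when it watches a path pattern, the sketch below reads only the lines appended since the previous call. It is not Filebeat's implementation: `read_new_lines` and the `offsets` dictionary are hypothetical, and the demo uses a temporary directory rather than the path mentioned above.

```python
import glob
import os
import tempfile


def read_new_lines(pattern, offsets):
    """Return (path, line) pairs appended since the last call.

    `offsets` maps each file path to the byte position already shipped,
    a simplified stand-in for the state a real shipper persists.
    """
    new_lines = []
    for path in sorted(glob.glob(pattern)):
        position = offsets.get(path, 0)
        with open(path, "rb") as fh:
            fh.seek(position)
            for raw in fh:
                if not raw.endswith(b"\n"):
                    break  # partial line: wait until the writer finishes it
                position += len(raw)
                text = raw.decode("utf-8", "replace").rstrip("\r\n")
                new_lines.append((path, text))
        offsets[path] = position
    return new_lines


# Demo against a temporary directory (an assumption for the example).
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")
pattern = os.path.join(log_dir, "*.log")
offsets = {}

with open(log_path, "w") as fh:
    fh.write("first\nsecond\n")
first_batch = read_new_lines(pattern, offsets)

with open(log_path, "a") as fh:
    fh.write("third\n")
second_batch = read_new_lines(pattern, offsets)
```

Tracking a per-file position is what lets the second call return only the newly appended line instead of resending the whole file.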
Kafka¶
Kafka is used as the middleware that decouples log collection from log processing.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] Because Kafka retains published messages until its configured retention limit, log data is not lost while downstream services (like Logstash) are temporarily unavailable, and bursts of high-velocity data are absorbed rather than pushed straight onto the processing layer.
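The buffering behaviour can be modelled with a toy append-only topic in which each consumer group keeps its own read offset. This is a simplified sketch, not the Kafka client API: the `Topic` class, its methods, and the group names are hypothetical.

```python
class Topic:
    """Append-only message log; each consumer group tracks its own offset."""

    def __init__(self):
        self.messages = []
        self.committed = {}  # consumer group -> next offset to read

    def publish(self, message):
        self.messages.append(message)

    def poll(self, group, max_messages=100):
        """Return the next batch for `group` and advance its offset."""
        start = self.committed.get(group, 0)
        batch = self.messages[start:start + max_messages]
        self.committed[group] = start + len(batch)
        return batch


topic = Topic()

# The producer keeps publishing while the consumer is offline.
for line in ["log 1", "log 2", "log 3"]:
    topic.publish(line)

# When the consumer comes back, it resumes where it left off.
backlog = topic.poll("logstash-test")
topic.publish("log 4")
latest = topic.poll("logstash-test")

# A second group reads the same topic independently from the beginning.
other_group = topic.poll("logstash-prod")
```

Messages stay in the log after being read, so a slow or restarted consumer catches up from its own offset instead of forcing the producer to wait or resend.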
Logstash¶
Logstash acts as the data processing pipeline.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] It consumes data from Kafka, applies filters (such as parsing JSON), normalizes data, and indexes it into Elasticsearch. Different Logstash instances can be configured for different environments (e.g., logstash-test vs. logstash-prod).
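The filter step can be pictured as a function that turns one raw line into a normalized document. The sketch below is written in Python rather than Logstash's own configuration syntax, and the `transform` function, the `env` and `level` fields, and the normalization rules are assumptions chosen for illustration. The failure tag mirrors the one Logstash's json filter adds by default.

```python
import json
from datetime import datetime, timezone


def transform(raw, env):
    """Parse one raw log line and normalize it into a document."""
    try:
        event = json.loads(raw)
        if not isinstance(event, dict):
            raise ValueError("not a JSON object")
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        # Keep the original text so nothing is silently dropped.
        event = {"message": raw, "tags": ["_jsonparsefailure"]}

    # Normalization: every document gets a timestamp and an environment.
    event.setdefault("@timestamp", datetime.now(timezone.utc).isoformat())
    event["env"] = env
    if "level" in event:
        event["level"] = str(event["level"]).upper()
    return event


parsed = transform('{"message": "started", "level": "info"}', "test")
fallback = transform("not json at all", "prod")
```

Running one pipeline per environment, as described above, then amounts to calling the same transformation with a different `env` value and a different output index.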
Elasticsearch¶
Elasticsearch is the backend storage and search engine.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] It stores logs in indices named by environment and date (e.g., k8s-test-YYYY.MM.DD) and provides the APIs that Kibana uses to query the data.
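To make the daily index naming concrete, the sketch below derives an index name from a document's timestamp and builds a newline-delimited body in the shape the Elasticsearch `_bulk` API accepts (an action line followed by the document). The `index_name` and `bulk_body` helpers and the sample document are hypothetical, and nothing is actually sent to a cluster.

```python
import json
from datetime import datetime


def index_name(env, timestamp):
    """Build a daily index name such as k8s-test-2024.03.05."""
    return f"k8s-{env}-{timestamp:%Y.%m.%d}"


def bulk_body(env, docs):
    """Build an NDJSON request body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        ts = datetime.fromisoformat(doc["@timestamp"])
        lines.append(json.dumps({"index": {"_index": index_name(env, ts)}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk format ends with a newline


docs = [{"@timestamp": "2024-03-05T10:15:00+00:00", "message": "started"}]
body = bulk_body("test", docs)
print(body)
```

One index per day keeps each index bounded in size and makes it straightforward to expire old logs by deleting whole indices.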
Kibana¶
Kibana is the user interface.^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md] It allows developers and operators to view logs, filter by time or environment, and search for specific error messages (e.g., "exception") without needing server access.
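A search like the one described above corresponds roughly to an Elasticsearch query that combines a text match with environment and time filters. The sketch below only builds such a request body. The field names `message`, `env`, and `@timestamp` are assumptions about how the documents are mapped, and `build_search` is a hypothetical helper rather than anything Kibana exposes.

```python
import json


def build_search(keyword, env, start, end, size=50):
    """Build a query body: full-text match plus environment and time filters."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"message": keyword}}],
                "filter": [
                    # Assumes `env` is indexed as an exact-value (keyword) field.
                    {"term": {"env": env}},
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ],
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],
        "size": size,
    }


# Look for "exception" in the test environment over the last 15 minutes.
query = build_search("exception", "test", "now-15m", "now")
print(json.dumps(query, indent=2))
```

The filters narrow the result set without affecting relevance scoring, while the `match` clause performs the actual text search.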