ELK stack architecture for Kubernetes

The ELK Stack is a centralized architecture used for the collection, transmission, storage, analysis, and visualization of logs within containerized environments^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. As applications in Kubernetes are frequently created, destroyed, migrated, and scaled, traditional methods of logging into individual machines to check logs become infeasible^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. The ELK stack addresses this by providing a unified system to handle the massive volume of log data distributed across the cluster^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].

Core Components

The acronym ELK refers to the primary open-source tools used in the stack, though modern implementations often include a fourth component (Beats) and an intermediate message queue^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].

E — Elasticsearch

A distributed search engine that indexes, stores, and analyzes log data^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. In this stack it is the storage and query backend; the logs themselves are gathered by FileBeat and Logstash.

L — Logstash

A tool for collecting, analyzing, and filtering logs^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. It acts as a pipeline that ingests data from multiple sources, transforms it, and sends it to a stash (like Elasticsearch).

K — Kibana

A user-friendly web interface for Logstash and Elasticsearch^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. It lets users search, aggregate, and analyze important log data visually.

FileBeat

A lightweight log collector and processor^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. Due to its low resource usage, it is recommended for deployment on various servers to gather logs and transmit them to Logstash (or Kafka), replacing heavier Logstash forwarders^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].

Kafka

A high-throughput distributed publish-subscribe messaging system, designed to handle the full action-stream data of a large website^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. Here it serves as a buffer that decouples log producers (FileBeat) from log consumers (Logstash), so a burst of log traffic does not have to be processed at the rate it arrives.
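The buffering role can be illustrated with a small in-memory sketch. It is not a Kafka client: the `topic` deque and the `produce` and `consume` helpers are hypothetical stand-ins, used only to show that the producer and the consumer run at independent rates.

```python
from collections import deque

# In-memory stand-in for a Kafka topic (hypothetical; a real setup uses a Kafka client).
topic = deque()

def produce(events):
    """Producer side (FileBeat's role): append events without waiting for a consumer."""
    topic.extend(events)

def consume(max_batch):
    """Consumer side (Logstash's role): take up to max_batch events at its own pace."""
    batch = []
    while topic and len(batch) < max_batch:
        batch.append(topic.popleft())
    return batch

# A burst of five events arrives at once; the consumer drains it in smaller batches.
produce([f"log line {i}" for i in range(5)])
first = consume(max_batch=2)
second = consume(max_batch=2)
print(first, second, len(topic))
```

After two batches of two, one event is still waiting in the buffer, which is the point of the decoupling: nothing is lost while the consumer catches up.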

Kubernetes Architecture Design

In a Kubernetes environment, the architecture is designed to handle dynamic container lifecycles and high-throughput log streams^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].

Container and Collection

  • Sidecar Pattern: The architecture typically deploys FileBeat as a "Sidecar" container within the same Pod as the business application container^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  • Log Volume: An emptyDir volume is shared between the application container (which writes logs to a specific path, e.g., /opt/tomcat/logs) and the FileBeat container (which reads from that path)^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. Because both containers mount the same volume, FileBeat reads the log files directly from storage local to the Pod.
  • Topics: FileBeat collects logs and publishes them to Kafka using specific topics^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
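The sidecar layout described above can be sketched as a Pod manifest built in Python and printed as JSON (Kubernetes accepts JSON manifests as well as YAML). This is a minimal illustration, not the source's actual manifest: the image names, the Pod name, the volume name `logs`, and the sidecar mount path `/logm` are all assumptions; only `/opt/tomcat/logs` comes from the text.

```python
import json

LOG_PATH = "/opt/tomcat/logs"  # application log directory, taken from the text above
SIDECAR_LOG_PATH = "/logm"     # hypothetical mount point inside the FileBeat container

def build_pod(app_name, app_image, filebeat_image):
    """Return a Pod manifest with an app container and a FileBeat sidecar
    that both mount the same emptyDir volume."""
    shared_volume = {"name": "logs", "emptyDir": {}}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": app_name},
        "spec": {
            "volumes": [shared_volume],
            "containers": [
                {   # business application: writes its logs into the shared volume
                    "name": app_name,
                    "image": app_image,
                    "volumeMounts": [{"name": "logs", "mountPath": LOG_PATH}],
                },
                {   # sidecar: reads the same files from its own mount path
                    "name": "filebeat",
                    "image": filebeat_image,
                    "volumeMounts": [{"name": "logs", "mountPath": SIDECAR_LOG_PATH}],
                },
            ],
        },
    }

# Image names below are placeholders, not real images.
pod = build_pod("demo-app", "example/tomcat-app:latest", "example/filebeat:latest")
print(json.dumps(pod, indent=2))
```

The two mount paths may differ; what matters is that both `volumeMounts` entries name the same volume, so the sidecar sees every file the application writes.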

Data Flow

The data flow typically follows this sequence^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]:

  1. Collection: FileBeat running in the Pod collects application logs.
  2. Buffering: Logs are pushed to Kafka. Kafka topics are often organized by environment (e.g., k8s-fb-test-* for the test environment and k8s-fb-prod-* for production)^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  3. Processing: Logstash consumes data from the specified Kafka topics asynchronously^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. It performs filtering (e.g., parsing JSON messages) and uploads the processed data to Elasticsearch^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  4. Storage: Elasticsearch stores the data as structured time-series indices^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. Index patterns are usually segmented by environment (e.g., k8s-test-YYYY.MM.DD or k8s-prod-YYYY.MM.DD)^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  5. Visualization: Kibana connects to Elasticsearch to display the data^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md]. Users can create index patterns in Kibana matching the stored indices to query and visualize logs^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
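The naming conventions and the processing step in this sequence can be sketched in a few lines of Python. This is only an illustration of the routing logic, not Logstash itself: the function names are hypothetical, the application suffix that fills the `*` in the topic name is an assumption, and the fallback for lines that are not JSON is one possible choice rather than the source's behavior.

```python
import json
from datetime import date

def kafka_topic(env, app):
    """Topic name following the k8s-fb-<env>-* pattern; the app suffix is an assumption."""
    return f"k8s-fb-{env}-{app}"

def index_name(env, day):
    """Daily Elasticsearch index name in the k8s-<env>-YYYY.MM.DD form."""
    return f"k8s-{env}-{day:%Y.%m.%d}"

def process(raw_message, env, day):
    """Mimic the filter stage: parse a JSON message and pick its target index."""
    try:
        doc = json.loads(raw_message)
    except json.JSONDecodeError:
        doc = None
    if not isinstance(doc, dict):
        # Keep lines that are not JSON objects as plain text instead of dropping them.
        doc = {"message": raw_message}
    return {"_index": index_name(env, day), "_source": doc}

day = date(2024, 1, 5)  # arbitrary example date
print(kafka_topic("test", "demo-app"))
print(process('{"level": "INFO", "msg": "started"}', "test", day))
```

Keeping the environment in both the topic name and the index name means test and production logs stay separated at every stage, and a Kibana index pattern such as k8s-test-* selects only one environment.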

Infrastructure Deployment

  • Elasticsearch: Typically deployed on dedicated infrastructure (e.g., binary installation on a specific host like hdss7-12.host.com) rather than within the dynamic Kubernetes cluster itself to ensure resource stability^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  • Kafka: Deployed as a separate infrastructure service, often configured with Zookeeper for coordination^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  • Logstash: Can be deployed as a Docker container, managed by the Kubernetes cluster or by another container orchestrator, and configured through pipeline files (e.g., logstash-test.conf)^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
  • Kibana: Deployed into the Kubernetes infra namespace, exposed via an Ingress Controller (e.g., kibana.od.com)^[400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md].
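As an illustration of the Kibana exposure described above, the following sketch builds an Ingress manifest as a Python dictionary. The host kibana.od.com and the infra namespace come from the text; the `networking.k8s.io/v1` API version, the Service name `kibana`, and the use of Kibana's default port 5601 are assumptions, and an older cluster may require a different Ingress API version.

```python
import json

def kibana_ingress(host="kibana.od.com", namespace="infra",
                   service="kibana", port=5601):
    """Return an Ingress manifest routing all paths on `host` to the Kibana Service.
    The Service name and port are assumptions, not taken from the source."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": "kibana", "namespace": namespace},
        "spec": {
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {
                        "service": {"name": service, "port": {"number": port}},
                    },
                }]},
            }],
        },
    }

ingress = kibana_ingress()
print(json.dumps(ingress, indent=2))
```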

Sources

  • 400-devops__06-Kubernetes__k8s-paas__07.Promtheus监控k8s企业级应用.md