Kubernetes log storage management
Kubernetes log storage management covers the strategies and architectures used to collect, store, and manage application logs generated within a cluster^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. The system design, often referred to as cluster-level logging, aims to keep logs accessible and persistent regardless of the lifecycle of the containers, Pods, or Nodes where they originated^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
This independence means that logs can be retrieved even if a container crashes, a Pod is deleted, or a Node fails^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Common Architectures
There are three primary architectural patterns for implementing log storage management in Kubernetes, each with specific trade-offs regarding resource usage, compatibility, and complexity^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
1. Node-Level Logging Agent
In this pattern, a logging agent is deployed on every Node in the cluster^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. The agent is typically run as a DaemonSet and mounts the host's log directory (e.g., /var/log/containers) to access container log files^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Mechanism: The agent detects new log files and forwards them to a backend storage system (e.g., Elasticsearch, S3)^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Advantages: This is a resource-efficient approach as only one agent is required per Node. It is non-intrusive to applications and Pods^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Constraints: This method requires applications to write their logs to stdout and stderr^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. High-volume logging can exhaust the quota of the system's logging driver, potentially causing logs to be dropped^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. To mitigate this, administrators may need to increase the log quotas or mount Persistent Volumes into the container to handle high throughput^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
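To make the node-level pattern concrete, the following is a minimal sketch that builds a DaemonSet manifest as a Python dictionary and prints it as JSON. The name `log-agent`, the namespace `logging`, and the image `example.com/log-agent:latest` are placeholders, not real artifacts. The sketch mounts the whole of `/var/log` on the assumption that the entries under `/var/log/containers` are symlinks into other directories below `/var/log`.

```python
import json


def build_log_agent_daemonset(name="log-agent", namespace="logging",
                              image="example.com/log-agent:latest"):
    """Return a DaemonSet manifest (as a dict) for a node-level logging agent.

    The default name, namespace, and image are hypothetical placeholders.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the Pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Read-only access to the host's log directory.
                        "volumeMounts": [{
                            "name": "varlog",
                            "mountPath": "/var/log",
                            "readOnly": True,
                        }],
                    }],
                    "volumes": [{
                        "name": "varlog",
                        "hostPath": {"path": "/var/log"},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_log_agent_daemonset(), indent=2))
```

A real agent would also need configuration for its backend, resource limits, and usually tolerations so that it is scheduled on every Node; those are left out here.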
2. Sidecar with Log Redirection (stdout/stderr)
This variant is used when an application writes logs to specific files instead of standard output, but the operator still wishes to use the Node-level agent infrastructure^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Mechanism: A sidecar container runs alongside the application container within the same Pod. This sidecar reads the application's log files and redirects them to its own stdout and stderr streams^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. The Node-level agent then captures these logs as usual.
- Disadvantages: This results in duplicate log storage on the host: one copy is the original file written by the application, and the other is the JSON log file corresponding to the sidecar's output^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. This wastes significant disk space and is generally discouraged unless the application container cannot be modified^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
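The redirection step can be sketched in a few lines of Python. This is an illustration of the idea only, not how any particular sidecar image is implemented; in practice the sidecar often just runs a command such as `tail -F` on the log file. The function names are made up for this example, and the sketch does not handle file rotation or truncation.

```python
import sys


def read_new_lines(path, offset=0):
    """Return (complete_lines, new_offset) for data appended to `path` since `offset`.

    A trailing partial line (no newline yet) is left for the next call.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete line is available
    text = data[:end].decode("utf-8", errors="replace")
    lines = text.split("\n")[:-1]  # drop the empty piece after the last newline
    return lines, offset + end


def stream_once(path, offset, out=sys.stdout):
    """Copy newly appended lines to `out` (the sidecar's stdout) and return the new offset."""
    lines, offset = read_new_lines(path, offset)
    for line in lines:
        out.write(line + "\n")
    out.flush()
    return offset
```

A sidecar built this way would call `stream_once` in a loop with a short sleep, keeping the returned offset between calls. Every line it emits is then written a second time by the container runtime, which is the duplication described above.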
3. Sidecar with Direct Shipping
In this pattern, a sidecar container handles the direct transmission of logs to remote storage, bypassing the Node-level agent^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Mechanism: The sidecar container runs a logging agent (e.g., Fluentd) and sends data directly to a backend like Elasticsearch^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Advantages: It allows applications to write to fixed files without needing stdout/stderr and is considered simple to deploy^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Disadvantages: The sidecar may consume significant resources, potentially impacting the performance of the application container^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]. Additionally, because logs are not streamed to stdout, they will not be visible via kubectl logs^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
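As a rough illustration of what a shipping sidecar does with the lines it reads, the sketch below encodes a batch of log lines as a request body for the Elasticsearch Bulk API, which expects newline-delimited JSON with an action line before each document. The index name `app-logs`, the `pod` field, and the function name are hypothetical, and the sketch only builds the payload; it does not send anything.

```python
import json


def build_bulk_payload(lines, index="app-logs", pod="example-pod"):
    """Encode log lines as an Elasticsearch Bulk API body (newline-delimited JSON).

    `index` and `pod` are hypothetical names used only for illustration.
    """
    parts = []
    for line in lines:
        parts.append(json.dumps({"index": {"_index": index}}))   # action line
        parts.append(json.dumps({"message": line, "pod": pod}))  # document line
    # The Bulk API requires the body to end with a newline.
    return "\n".join(parts) + "\n" if parts else ""
```

The resulting string would be sent in a POST request to the cluster's `_bulk` endpoint with the content type `application/x-ndjson`. Production agents such as Fluentd add buffering, retries, and backpressure on top of this, which is part of why the sidecar can become resource-hungry.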
Storage Management Considerations
Regardless of the chosen architecture, persistent log files accumulate on the host machine^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
A full main disk partition can lead to system crashes, so administrators must implement a log rotation or cleanup policy^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
Strategies include:
- Regularly clearing log files from the host^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- Mounting the log directory to a dedicated high-capacity remote storage volume^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
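A cleanup policy of the first kind might look like the sketch below, which removes files older than a given age and then removes the oldest remaining files until the directory fits a size budget. The function name and parameters are illustrative assumptions, not part of any Kubernetes tooling. For container logs captured from stdout and stderr, the kubelet's own rotation settings (`containerLogMaxSize` and `containerLogMaxFiles`) are usually the first thing to check; a script like this is more relevant to files that applications write themselves.

```python
import os
import time


def prune_logs(log_dir, max_age_seconds, max_total_bytes, now=None):
    """Delete expired files, then the oldest files until the size budget is met.

    Only regular files directly inside `log_dir` are considered.
    Returns the list of removed paths, oldest first.
    """
    now = time.time() if now is None else now
    entries = []
    for entry in os.scandir(log_dir):
        if entry.is_file(follow_symlinks=False):
            st = entry.stat(follow_symlinks=False)
            entries.append((st.st_mtime, st.st_size, entry.path))
    entries.sort()  # oldest modification time first

    removed = []
    total = sum(size for _, size, _ in entries)
    for mtime, size, path in entries:
        expired = now - mtime > max_age_seconds
        if expired or total > max_total_bytes:
            os.remove(path)
            removed.append(path)
            total -= size
    return removed
```

For example, `prune_logs("/data/app-logs", 7 * 24 * 3600, 5 * 1024**3)` would keep at most a week and roughly 5 GiB of files in a hypothetical `/data/app-logs` directory. Note that removing a file an application still holds open does not free its disk space until the application closes it.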
Related Concepts
- Prometheus: While Prometheus focuses on metrics rather than logs, it is part of the broader monitoring and observability ecosystem that often runs alongside logging stacks^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- [[DaemonSet]]: The controller type typically used to deploy logging agents on every Node^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md].
- [[Sidecar Pattern]]: The multi-container pattern used to extend the functionality of a primary container for log processing.
Sources
^[400-devops__06-Kubernetes__k8s-paas__原理及源码解析__Kubernetes相关生态.md]