We are setting up a Kubernetes cluster on CoreOS to run 50+ different applications (mostly Java apps), each of which produces logs potentially in its own format.
We're looking to centralise logs from all containers with the Elastic Stack (formerly ELK stack), with some specific requirements:
- Reliability in the face of network, container or node failures.
- Exactly-once processing for each log statement. Even if a failure occurs, the solution must remember where it left off and continue dispatching logs from that point onwards once healthy again.
- Configure log grok patterns as part of the pod/replication controller definition. We'd like to avoid configuring app-specific patterns in a centralised, common component (e.g. Logstash); a sketch of what we mean follows this list.
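To make requirement #3 concrete, this is roughly what we would like to be able to write. The annotation key is made up and nothing consumes it today; it just illustrates where we want the pattern to live:

```yaml
# Hypothetical: a grok pattern declared next to the app it belongs to,
# rather than in a central Logstash config.
apiVersion: v1
kind: Pod
metadata:
  name: orders-service
  annotations:
    logging.example.com/grok-pattern: "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{JAVACLASS:logger} - %{GREEDYDATA:msg}"
spec:
  containers:
  - name: orders-service
    image: example/orders-service:1.0
```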
The solution we have come up with is to use:
- Logspout to consume Docker logs (sketched after this list), forwarding them to a local...
- rsyslog daemon (e.g. `syslog://localhost:514`), which would forward them to a...
- Logstash instance running in the cluster and exposed via a Kubernetes Service.
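In Kubernetes terms, the Logspout piece would look something like the following DaemonSet. The image tag, mounts and networking details are illustrative and not yet validated by us:

```yaml
# Rough sketch of the per-node Logspout collector as a DaemonSet.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logspout
spec:
  selector:
    matchLabels:
      app: logspout
  template:
    metadata:
      labels:
        app: logspout
    spec:
      # hostNetwork so that localhost:514 reaches the node-local rsyslog
      # (assumes rsyslog listens on the node's loopback interface).
      hostNetwork: true
      containers:
      - name: logspout
        image: gliderlabs/logspout:latest
        # Logspout takes the routing URI as its argument.
        args: ["syslog://localhost:514"]
        volumeMounts:
        - name: docker-sock
          mountPath: /var/run/docker.sock
          readOnly: true
      volumes:
      - name: docker-sock
        hostPath:
          path: /var/run/docker.sock
```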
- Logstash would filter and transform the log entries based on container ID and dispatch the result to Elasticsearch.
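On the Logstash side, we imagine something along these lines. This is a sketch only: the port, the field we key on and the grok pattern are placeholders (exact fields depend on how Logspout/rsyslog tag each message), and it is precisely this kind of per-app configuration we would rather not keep in a central place:

```
# Sketch of the central Logstash pipeline.
input {
  syslog {
    port => 5514
  }
}

filter {
  # Pick an app-specific grok pattern based on the container the message
  # came from (here assumed to be carried in the syslog "program" field).
  if [program] =~ /^orders-service/ {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{JAVACLASS:logger} - %{GREEDYDATA:msg}" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
  }
}
```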
Would this solution be reliable? It seems to cover all our requirements except #3 (keeping grok patterns alongside the apps).
Do you have any suggestions? We are open to using Filebeat, Fluentd or other components.