Parse specific application messages out of container logs with Fluentd on Kubernetes

11/21/2018

I'm trying to parse messages from multiple applications out of a single container inside a Kubernetes pod using Fluentd. Fluentd, Kibana and Elasticsearch are working well, all my logs are showing up, and I'm otherwise happy. However, I need to process a series of container logs differently. In this case, the logs I need to further parse are all in a single namespace. My container log looks something like the following:

somekeyword1 somenumber somestuff optionals
somekeyword2 somenumber somestuff optionals
somekeyword2 somenumber somestuff optionals
somekeyword1 somenumber somestuff optionals
...

I would like to process that log with a regex based on the somekeywordN at the start of each line (but I'm open to suggestions). So within Fluentd, how do I perform that level of processing on log messages for containers in a particular namespace?

I've tried this: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/issues/73, but even given that, I'm not sure how to run my regexes on a particular log.

-- Ryan
fluentd
kubernetes

2 Answers

11/22/2018

If you are using Elasticsearch version 5 or above, I would suggest using the Ingest APIs and applying Grok filtering to your log data with the Grok Processor.
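
For example, a minimal ingest pipeline with a single Grok processor might look something like the sketch below. The pipeline name parse-app-logs, the field name log, and the pattern itself are assumptions based on the log lines in the question, not something taken from your setup:

PUT _ingest/pipeline/parse-app-logs
{
  "description": "Sketch only: grok-parse keyword-prefixed application log lines",
  "processors": [
    {
      "grok": {
        "field": "log",
        "patterns": ["%{WORD:keyword} %{NOTSPACE:value} %{GREEDYDATA:rest}"]
      }
    }
  ]
}

You would then point your Fluentd Elasticsearch output at that pipeline (fluent-plugin-elasticsearch has a pipeline parameter for this).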

If you are using an earlier version of the Elastic stack, you can also use Logstash to achieve the same result.

You might also find this article helpful (disclaimer: the article was posted by me on Medium some time back).

-- fatcook
Source: StackOverflow

11/30/2018

Fluentd has a built-in regexp parser you can use: https://docs.fluentd.org/v1.0/articles/parser_regexp

Simply add the following snippet under the filter section of your config:

<parse>
  @type regexp
  expression /^(?<key1>somekeyword.) (?<key2>\S+) (?<key3>.+)$/
</parse>
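
To limit this to containers in a particular namespace, one option is to key the filter off the tag. A rough sketch, assuming the usual Kubernetes container log tags of the form kubernetes.var.log.containers.<pod>_<namespace>_<container>-<id>.log and that the raw line lives in the log field (my-namespace is a placeholder):

# Sketch only: the tag pattern and field names depend on your setup.
<filter kubernetes.var.log.containers.*_my-namespace_*.log>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type regexp
    expression /^(?<key1>somekeyword.) (?<key2>\S+) (?<key3>.+)$/
  </parse>
</filter>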

Or, if you need to match using multiple regexes, this is the plugin I use: https://github.com/repeatedly/fluent-plugin-multi-format-parser
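
A rough sketch of what that looks like (the two expressions are placeholders keyed off the somekeywordN prefixes in the question; unmatched lines fall through to the none parser):

# Sketch only: patterns are tried in order, first match wins.
<parse>
  @type multi_format
  <pattern>
    format regexp
    expression /^(?<key1>somekeyword1) (?<key2>\S+) (?<key3>.+)$/
  </pattern>
  <pattern>
    format regexp
    expression /^(?<key1>somekeyword2) (?<key2>\S+) (?<key3>.+)$/
  </pattern>
  <pattern>
    format none
  </pattern>
</parse>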

-- Karis
Source: StackOverflow