Parsing log containing JSON in Fluentd

8/27/2019

I'm using Fluentd to tail container logs in k8s. The in_tail input plugin reads the logs, and a regexp parser extracts the relevant part of each line. That part is a JSON string, which has to be parsed and sent on to the output plugin. For the parsing I use the filter_parser plugin with a json parser.

    <source>
      @type tail
      tag proxy.*
      ...
      read_from_head true
      <parse>
        @type regexp
        expression \[proxy\]\[(?<type>(record|replay))\](?<log>.*)
      </parse>
    </source>

    <filter proxy.**>
      @type parser
      key_name log
      <parse>
        @type json
      </parse>
    </filter>

Example log line:

    [proxy][record]{"test":123}

This seems to produce the following record:

    record={"type"=>"record", "log"=>"{\\\"test\\\":123}"}

When I run this config, the value captured by the log group appears to be escaped (twice?), so the json parser in the filter section fails to parse it.
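
To make the "escaped twice?" guess concrete, here is a small plain-Ruby sketch. It is not Fluentd itself, and the variable names and values are made up for illustration. It compares a string holding the clean JSON with one that has a literal backslash before each quote. Assuming the record above was printed in Ruby's inspect style, only the second kind of string would be displayed with `\\\"`, and it is also the kind that `JSON.parse` rejects.

```ruby
require 'json'

# Hypothetical values for illustration; not taken from a real Fluentd run.
clean   = '{"test":123}'     # plain JSON text, no backslashes in the string
escaped = '{\"test\":123}'   # same text with a literal backslash before each quote

# inspect adds one level of display escaping on top of the string contents.
puts clean.inspect    # prints "{\"test\":123}"
puts escaped.inspect  # prints "{\\\"test\\\":123}"

# The clean string parses as JSON.
puts JSON.parse(clean)['test']  # prints 123

# The string with literal backslashes is not valid JSON.
begin
  JSON.parse(escaped)
rescue JSON::ParserError => e
  puts "parse failed: #{e.class}"
end
```

So a display with three backslashes per quote would be consistent with the captured value already containing backslashes before the json parser sees it, rather than with escaping that happens only at print time.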

I suspect it has something to do with how the regexp parser and/or in_tail is implemented: it might be treating the log value as a string and escaping it before passing it on.

Or is my regex missing something?
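
One way to rule the regex in or out is to run the same expression in plain Ruby, since Fluentd's regexp parser is built on Ruby regular expressions. This is a minimal sketch outside Fluentd: `pattern`, `line`, and `captured` are names chosen for illustration, and the input is the example line from this question, not a line read from an actual container log file.

```ruby
require 'json'

# The expression from the <parse> section, written as a Ruby regexp literal.
pattern = /\[proxy\]\[(?<type>(record|replay))\](?<log>.*)/

# The example log line, typed in by hand (assumption: no extra escaping in the input).
line = '[proxy][record]{"test":123}'

m = pattern.match(line)
captured = m[:log]

puts m[:type]   # prints record
puts captured   # prints {"test":123}

# The captured text is valid JSON as-is.
puts JSON.parse(captured)['test']  # prints 123
```

If this behaves the same way on your Ruby, the expression by itself does not add backslashes, so it may be worth checking what the raw line in the tailed file looks like before the regex is applied.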

Does anyone know why this is happening and what the solution might be?

-- Siddhant
fluentd
kubernetes
ruby
