Fluentd Regular Expression Matching Error

8/14/2018

I am trying to parse logs from Kubernetes, for example:

2018-08-14 13:21:20.013 [INFO][67] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}

This is the configuration:

<source>
  @id calico-node.log
  @type tail
  format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
  time_format %Y-%m-%d %H:%M:%S
  path /var/log/containers/calico-node**.log
  pos_file /var/log/es-calico.pos
  tag calico-node
</source>

According to regex101.com, this pattern should match this string. However, I get an error from Fluentd when it tries to parse the line:

2018-08-14 13:21:20 +0000 [warn]: [calico-node.log] "{\"log\":\"2018-08-14 13:21:20.013 [INFO][67] health.go 150: Overall health summary=\\u0026health.HealthReport{Live:true, Ready:true}\\n\",\"stream\":\"stdout\",\"time\":\"2018-08-14T13:21:20.013908223Z\"}" error="invalid time format: value = {\"log\":\"2018-08-14 13:21:20.013, error_class = ArgumentError, error = string doesn't match"

What could be wrong? I have had similar errors with the built-in parser for Apache logs as well.
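
One way to narrow this down is to reproduce the capture outside Fluentd. The sketch below is a minimal Python check, not Fluentd itself. It assumes the line on disk is the JSON text quoted in the warning, and it rewrites the Ruby-style named groups as Python's (?P<name>...) syntax; the variable names are illustrative only.

```python
import json
import re

# The pattern from the config, with (?<name>...) rewritten as (?P<name>...)
# for Python; nothing else is changed.
PATTERN = re.compile(
    r'^(?P<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?P<severity>[^ \]]*) *\] (?P<message>.*)$'
)

# Assumption: this is the line as stored on disk, reconstructed from the
# escaped record shown in the warning.
raw_line = (
    '{"log":"2018-08-14 13:21:20.013 [INFO][67] health.go 150: '
    'Overall health summary=\\u0026health.HealthReport{Live:true, Ready:true}\\n",'
    '"stream":"stdout","time":"2018-08-14T13:21:20.013908223Z"}'
)

# Case 1: apply the pattern to the whole JSON line.
raw_match = PATTERN.match(raw_line)
raw_time = raw_match.group("time")

# Case 2: decode the JSON first and apply the pattern to the "log" field only.
inner_line = json.loads(raw_line)["log"].rstrip("\n")
inner_match = PATTERN.match(inner_line)
inner_time = inner_match.group("time")

print("time from raw JSON line:", raw_time)
print("time from log field:    ", inner_time)
```

On the raw line the time group comes out as {"log":"2018-08-14 13:21:20.013, which is the same value the warning reports. On the decoded log field it is just 2018-08-14 13:21:20.013. This suggests the pattern is being applied to the whole JSON line rather than to the log field that was tested on regex101.com.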

-- nithishr
elasticsearch
fluentd
kubernetes
regex

2 Answers

4/8/2020

I just faced a similar issue. I think @Crou's answer is correct, but maybe try %N instead. According to the documentation, the Fluentd parser does not support %3N, %6N, %9N, or %L.

https://docs.fluentd.org/configuration/parse-section

-- Chih-Cheng Huang
Source: StackOverflow

8/14/2018

From what I can see, you are missing something in the Fluentd config.

Your time_format %Y-%m-%d %H:%M:%S will not work with the timestamp 2018-08-14 13:21:20.013, as it's missing .%3N.

It should be as follows: time_format %Y-%m-%d %H:%M:%S.%3N or time_format %Y-%m-%d %H:%M:%S.%L
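
The general point, that a format string has to account for the fractional seconds, can be illustrated outside Fluentd. The Python sketch below is only an analogy: Python's %f directive stands in for the fractional part, whereas Fluentd is Ruby-based and uses different directives, so this does not show which of those Fluentd accepts. The helper name try_parse is made up for the example.

```python
from datetime import datetime

# Timestamp taken from the log line in the question.
stamp = "2018-08-14 13:21:20.013"


def try_parse(fmt):
    """Return the parsed datetime, or None if the format does not fit."""
    try:
        return datetime.strptime(stamp, fmt)
    except ValueError:
        return None


# Format without a fractional part: Python rejects the trailing ".013".
without_fraction = try_parse("%Y-%m-%d %H:%M:%S")

# Format with a fractional part (%f is Python-specific, not a Fluentd directive).
with_fraction = try_parse("%Y-%m-%d %H:%M:%S.%f")

print("without fraction:", without_fraction)
print("with fraction:   ", with_fraction)
```

In Python the first format fails and the second parses the milliseconds. Ruby's strptime does not necessarily behave the same way on trailing input, so the corrected time_format is best confirmed against Fluentd itself.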

-- Crou
Source: StackOverflow