How to extract fields like severity, time, message, etc. from Kubernetes logs using Fluentd

9/6/2019

How can I extract fields such as severity, time, stream, and message from Kubernetes logs using Fluentd? I have multiple patterns of logs coming from Kubernetes. How can I extract fields from these different kinds of patterns using Fluentd?

Sample error log - {"log":"2019-09-06 21:57:50.864 DEBUG 1 --- [ntainer#2-0-C-1] class : error message \"\n","stream":"stdout","time":"2019-09-06T21:57:50.864955417Z"}

Regex I am using - ^(?:{"log":"(?<time>\d+(?:-\d+){2}\s+\d+(?::\d+){2}\.\d+)\s"*(?<level>\S+) (?<pid>\d+) --- \[(?<thread>[\s\S]*?)\] (?<class>\S+)\s*:\s* (?<message>[\s\S]*?)(?=\g<time>|\Z).*)

I am getting a "pattern not found" error when using this regex. Below is my Fluentd config:

 <source>
   type tail
   path /var/log/containers/*.log
   pos_file /var/log/es-containers.log.pos
   tag kubernetes.*
   format /^(?:{"log":"(?<time>\d+(?:-\d+){2}\s+\d+(?::\d+){2}\.\d+)\s"*(?<level>\S+) (?<pid>\d+) --- \[(?<thread>[\s\S]*?)\] (?<class>\S+)\s*:\s* (?<message>[\s\S]*?)(?=\g<time>|\Z).*)/
   time_format %Y-%m-%dT%H:%M:%S.%NZ
   read_from_head false
 </source>
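For reference, each line in /var/log/containers/*.log is itself a JSON object (the Docker log wrapper), which is why a single regex over the whole line is fragile. A minimal Python sketch, illustrative only and using a simplified pattern rather than the one above, that parses the JSON wrapper first and then the inner log text:

```python
import json
import re

# The raw container log line from the question: JSON with log/stream/time keys.
raw = ('{"log":"2019-09-06 21:57:50.864 DEBUG 1 --- [ntainer#2-0-C-1] '
       'class : error message \\"\\n","stream":"stdout",'
       '"time":"2019-09-06T21:57:50.864955417Z"}')

wrapper = json.loads(raw)  # yields the "log", "stream", and "time" fields

# Simplified pattern for the Spring Boot-style line inside "log";
# group names mirror the ones used in the question's regex.
pattern = re.compile(
    r'^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+'
    r'(?P<level>\S+)\s+(?P<pid>\d+)\s+---\s+\[(?P<thread>[^\]]*)\]\s+'
    r'(?P<class>\S+)\s*:\s*(?P<message>.*)'
)
m = pattern.match(wrapper["log"])
fields = m.groupdict()  # time, level, pid, thread, class, message
```

The two-step parse (JSON wrapper first, application line second) is the same split that Fluentd's own parsers can perform, instead of one regex over the wrapped line.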
-- user2416
elasticsearch
fluentd
kubernetes

1 Answer

9/9/2019

Well, I have been through this. I found playing with regex pretty unhealthy :).

This is how I solved it:

  1. I made the application write its logs as JSON. In my case it is a Java application using Log4j, so it was easy to convert.
  2. The application logs then looked like {"level":"INFO","timestamp":"2019-09-06 11:33:24,987","thread":"http-nio-8090-exec-5","file":"AccountService.java","line":"51","message":"A sample info log","trace":""}
  3. Kubernetes then wraps the application's JSON log inside the log field of its own JSON object.
  4. With a bit of help from the blog, I could get all the information as attributes in Elasticsearch.
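The steps above can be sketched as a Fluentd config; this is an assumed layout, not the answerer's exact file, and the pos_file path and tag are illustrative:

```
# Parse the Docker wrapper as JSON instead of a regex: yields log, stream, time.
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  tag kubernetes.*
  format json
  time_format %Y-%m-%dT%H:%M:%S.%NZ
</source>

# Because the application itself now logs JSON, the inner "log" field can be
# parsed with a second JSON pass; reserve_data keeps stream/time alongside it.
<filter kubernetes.**>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>
```

With both passes done, fields like level, thread, and message arrive in Elasticsearch as separate attributes rather than one opaque string.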
-- arunvg
Source: StackOverflow