I'm using Fluentd with Elasticsearch for logs from Kubernetes, but I noticed that some JSON logs cannot be indexed correctly because the JSON is stored as a string.
Logs from kubectl logs look like:
{"timestamp":"2016-11-03T15:48:12.007Z","level":"INFO","thread":"cromwell-system-akka.actor.default-dispatcher-4","logger":"akka.event.slf4j.Slf4jLogger","message":"Slf4jLogger started","context":"default"}
But the logs saved to files under /var/log/containers/... have escaped quotes, which turns the nested JSON into a plain string and spoils indexing:
{"log":"{\"timestamp\":\"2016-11-03T15:45:07.976Z\",\"level\":\"INFO\",\"thread\":\"cromwell-system-akka.actor.default-dispatcher-4\",\"logger\":\"akka.event.slf4j.Slf4jLogger\",\"message\":\"Slf4jLogger started\",\"context\":\"default\"}\n","stream":"stdout","time":"2016-11-03T15:45:07.995443479Z"}
I'm trying to get the logs to look like this:
{
    "log": {
        "timestamp": "2016-11-03T15:45:07.976Z",
        "level": "INFO",
        "thread": "cromwell-system-akka.actor.default-dispatcher-4",
        "logger": "akka.event.slf4j.Slf4jLogger",
        "message": "Slf4jLogger started",
        "context": "default"
    },
    "stream": "stdout",
    "time": "2016-11-03T15:45:07.995443479Z"
}
Can you suggest how to do this?
I ran into the same issue; however, I'm using fluent-bit, the "C" version of fluentd (which is written in Ruby). Since this is an older question, I'm answering for the benefit of others who find it.
Fluent-bit v0.13 addressed this issue: you can now specify which parser to use through a Kubernetes annotation, and the parser can be configured with decoders that parse the escaped log field as JSON.
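For example, assuming the parser below is registered under the name embedded-json, a pod could request it with the fluentbit.io/parser annotation. The pod name and image here are hypothetical, for illustration only:

apiVersion: v1
kind: Pod
metadata:
  name: cromwell                      # hypothetical pod name
  annotations:
    # Ask fluent-bit's kubernetes filter to parse this pod's stdout
    # with the "embedded-json" parser defined below.
    fluentbit.io/parser: embedded-json
spec:
  containers:
    - name: cromwell
      image: broadinstitute/cromwell  # hypothetical image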
The final parser, with its decoders, looks like this:
[PARSER]
    Name         embedded-json
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On
    # Command       | Decoder | Field | Optional Action
    # ==============|=========|=======|================
    Decode_Field_As   escaped   log     do_next
    Decode_Field_As   json      log
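For completeness, here's a minimal sketch of how this parser might be wired into the main fluent-bit configuration; the file name parsers.conf and the tag pattern kube.* are assumptions, so adjust them to your deployment:

[SERVICE]
    Parsers_File        parsers.conf   # file containing the [PARSER] block above

[FILTER]
    Name                kubernetes
    Match               kube.*
    # Let pods select a parser via the fluentbit.io/parser annotation
    K8S-Logging.Parser  On

The first decoder (escaped) unescapes the log string, and the second (json) parses the result into structured fields, which gives you exactly the nested-JSON shape the question asks for. The kubernetes filter also offers a Merge_Log On option that lifts JSON log content into the record, which can be a simpler alternative when you don't need a custom parser.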