Fluentd time field handling in json log records

12/6/2019

I have a standard EFK stack set up in my GKE cluster. I switched my test gunicorn service to log in JSON format so that I can search service logs by JSON field values in Kibana. I can see all logs properly output in the pod's console using this command:

kubectl logs pod/dev-django-rest-service-678964c7f5-kh4k4 -n django-rest-service

{"bla": "bla-value", "REFERRER": null, "HOST": "api.myhost.com", "req_id": "4e845383a6ddbc2781919420b1ecb4fe", "message": "Test message", "time": "2019-12-06T11:25:21.635692"}

But Kibana does not show these messages, and I do not see any parsing errors reported by the fluentd pods. All messages were showing up in Kibana before I switched the service message format to JSON. I also added a handler to my service that just outputs plain text (not JSON) to the pod's console, and it shows up correctly in Kibana.

SAY: Hello Word!
{"bla": "bla-value", "REFERRER": null, "HOST": "api.myhost.com", "req_id": "c7123f4419656c9effd40410aac43e51", "message": "Test message", "time": "2019-12-06T19:09:09.378297"}

SAY: Hello Word! is shown properly in Kibana, but the JSON message is not. Please help me find out why the JSON messages are ignored.

------edit------

I created this filter:

    <filter kubernetes.var.log.containers.dev-django-rest-service**.log>
      @type stdout
    </filter>

With it, I found my JSON log records in the fluentd pod's console, but with an epoch timestamp:

1970-01-01 00:33:39.993485000 +0000 kubernetes.var.log.containers.dev-django-rest-service-5cf648b874-qj264_django-rest-service_kc2-micros-36e63553f54ecd1ee4cbd6503e19c56a4ff62b98152b7f0e20983384f4b0cf52.log: {"stream":"stdout","docker":{"container_id":"36e63553f54ecd1ee4cbd6503e19c56a4ff62b98152b7f0e20983384f4b0cf52"},"kubernetes":{"container_name":"kc2-micros","namespace_name":"django-rest-service","pod_name":"dev-django-rest-service-5cf648b874-qj264","container_image":"registry.gitlab.com/xxxx/project-templates/django_rest_service/dev:56048ac6eb463ed132a103a052274fb050748f1b","container_image_id":"docker-pullable://registry.gitlab.com/xxxx/project-templates/django_rest_service/dev@sha256:61776222330dacb9ed6b6175e87b6e6626945aa1e7eec87344638be530a2230b","pod_id":"9bdf8bc6-1866-11ea-994f-42010a800002","host":"gke-kc2-small-default-pool-b37c22e6-x5hv","labels":{"app":"dev-django-rest-service","pod-template-hash":"5cf648b874","release":"dev-django-rest-service","tier":"web","track":"stable"},"master_url":"https://10.0.0.1:443/api","namespace_id":"6a6b1eaf-ea8b-11e9-9ec8-42010a80020e"},"bla":"bla-value","REFERRER":null,"HOST":"api.my-host.com","req_id":"bc642dd8c8ae0ed59fa02632dc9567dd","message":"Test message"}
-- Testobile Testossimo
fluentd
json
kibana
kubernetes
time

1 Answer

12/9/2019

Actually, when the JSON log record contains a time field in a format fluentd cannot parse, fluentd replaces the record's time with a timestamp near the start of the epoch (1970-01-01). So when you select the "Last 15 minutes" filter in Kibana, you will never see these records. I have not found any mention of this in the fluentd docs; I discovered it experimentally using the stdout filter. I use this fluentd Helm chart: https://github.com/kiwigrid/helm-charts/blob/master/charts/fluentd-elasticsearch/. I changed the time format here https://github.com/kiwigrid/helm-charts/blob/master/charts/fluentd-elasticsearch/templates/configmaps.yaml#L175 so that the section looks like this:

      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%N %Z
        </pattern>
        <pattern>
          format none
        </pattern>
      </parse>

And the corresponding time format in my gunicorn log records is:

record['time'] = timezone.now().strftime('%Y-%m-%dT%H:%M:%S.%f000 %Z')

After this change, the correct time from my gunicorn log records started to show up.
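
For anyone wondering about the trailing 000: Python's %f only gives 6-digit microseconds, so the literal 000 pads the fraction out to 9 digits to match the %N directive in the parser config above. A quick standalone check of what the pattern produces (using datetime.now(timezone.utc) in place of django.utils.timezone.now(), which gives the same result for an aware UTC datetime):

    from datetime import datetime, timezone

    now = datetime.now(timezone.utc)
    # %f -> 6-digit microseconds; the literal "000" pads the fraction to 9 digits,
    # and %Z renders as "UTC" because the datetime is timezone-aware.
    print(now.strftime('%Y-%m-%dT%H:%M:%S.%f000 %Z'))
    # e.g. 2019-12-09T19:09:09.378297000 UTC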

-- Testobile Testossimo
Source: StackOverflow