I have a Kubernetes pod with two containers: the main app, which writes logs to a file on a volume, and a Fluentd sidecar that tails the log file and writes to Elasticsearch.
Here is the Fluentd configuration:
<source>
  type tail
  format none
  path /test/log/system.log
  pos_file /test/log/system.log.pos
  tag anm
</source>
<match **>
  @id elasticsearch
  @type elasticsearch
  @log_level debug
  time_key @timestamp
  include_timestamp true
  include_tag_key true
  host elasticsearch-logging.kube-system.svc.cluster.local
  port 9200
  logstash_format true
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever
    retry_max_interval 30
    chunk_limit_size 2M
    queue_limit_length 8
    overflow_action block
  </buffer>
</match>
Everything appears to be working. The Elasticsearch host and port are correct, since the API responds correctly at that URL. In Kibana, however, I see only records, one every 5 seconds, about Fluentd creating a new chunk:
2018-12-03 12:15:50 +0000 [debug]: #0 [elasticsearch] Created new chunk chunk_id="57c1d1c105bcc60d2e2e671dfa5bef04" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=nil, tag="anm", variables=nil>
but none of the actual logs (the ones the app writes to the system.log file). Kibana is configured with the "logstash-*" index pattern, which matches the one and only existing index.
Version of Fluentd image: k8s.gcr.io/fluentd-elasticsearch:v2.0.4
Version of Elasticsearch: k8s.gcr.io/elasticsearch:v6.3.0
Where can I look to find out what's wrong? It looks like Fluentd never manages to put the logs into Elasticsearch, but what could be the reason?
The answer turned out to be embarrassingly simple; maybe it will help someone in the future.
I figured out that the problem was this line in the source config:
<source>
  ...
  format none
  ...
</source>
That meant that none of the usual tags (e.g. pod or container name) were added when the records were saved to Elasticsearch, so I had to search for these records in Kibana in a completely different way. For instance, I searched by my own tag and found them right away. I had originally added the custom tag just in case, but it turned out to be very useful:
<source>
  ...
  tag anm
  ...
</source>
So the takeaway is this: use "format none" with caution. If the source data really is unstructured, add your own tag, and consider enriching the records with additional fields (e.g. "hostname") using Fluentd's record_transformer filter, which I also ended up doing. The records are then much easier to locate in Kibana.
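As a rough sketch of the record_transformer enrichment mentioned above (not my exact config: the field names hostname and app_name and the value main-app are placeholders chosen for illustration), a filter along these lines could go between the <source> and <match> blocks:

<filter anm>
  @type record_transformer
  <record>
    # Evaluated once when Fluentd loads the config; in a sidecar this
    # is normally the pod name, since the pod's hostname defaults to it.
    hostname "#{Socket.gethostname}"
    # Hypothetical static field identifying the application.
    app_name main-app
  </record>
</filter>

With include_tag_key true in the match block, the Elasticsearch output plugin should store the tag under a "tag" field by default, so a Kibana query such as tag:anm, optionally narrowed by hostname, should bring up these records.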