I have a service, deployed into a kubernetes cluster, with fluentd set as a daemon set. And i need to diversify logs it receives so they end up in different s3 buckets. One bucket would be for all logs, generated by kubernetes and our debug/error handling code, and another bucket would be a subset of logs, generated by the service, parsed by structured logger and identified by a specific field in json. Think of it one bucket is for machine state and errors, another is for "user_id created resource image_id at ts" description of user actions
The service itself is ignorant of the fluentd, so i cannot manually set the tag for logs based on which s3 bucket i want them to end in. Now, the fluentd.conf i use sets s3 stuff like this:
<match **>
# docs: https://docs.fluentd.org/v0.12/articles/out_s3
# note: this configuration relies on the nodes have an IAM instance profile with access to your S3 bucket
type copy
<store>
type s3
log_level info
s3_bucket "#{ENV['S3_BUCKET_NAME']}"
s3_region "#{ENV['S3_BUCKET_REGION']}"
aws_key_id "#{ENV['AWS_ACCESS_KEY_ID']}"
aws_sec_key "#{ENV['AWS_SECRET_ACCESS_KEY']}"
s3_object_key_format %{path}%{time_slice}/cluster-log-%{index}.%{file_extension}
format json
time_slice_format %Y/%m/%d
time_slice_wait 1m
flush_interval 10m
utc
include_time_key true
include_tag_key true
buffer_chunk_limit 128m
buffer_path /var/log/fluentd-buffers/s3.buffer
</store>
<store>
...
</store>
</match>
So, what i would like to do is to have something like a grep plugin
<store>
type grep
<regexp>
key type
pattern client-action
</regexp>
</store>
Which would send logs into a separate s3 bucket to the one defined for all logs
I am assuming that user action logs are generated by your service and system logs include docker, kubernetes and systemd logs from the nodes. I found your example yaml file at the official fluent github repo. If you check out the folder in that link, you'll see two more files called kubernetes.conf
and systemd.conf
. These files have got source
sections where they tag
their data.
The match
section in fluent.conf is matching **
, i.e. all logs and sending them to s3. You want to split your log types here. Your container logs are being tagged kubernetes.*
in kubernetes.conf on this line.
so your above config turns into
<match kubernetes.* >
@type s3
# user log s3 bucket
...
and for system logs match
every other tag except kubernetes.*