Fluentd create a tag based on a field value

9/5/2017

I have a Kubernetes cluster in which I'm trying to aggregate container logs on the nodes and send them to MongoDB. However, I need to be able to send the log records to different MongoDB servers based on values in the log record itself.

I'm using the fluent-plugin-kubernetes_metadata_filter plugin to attach additional information from Kubernetes to the log record. One of those fields is kubernetes_namespace_name. Is it possible to use that field to create a tag that I can match against in the mongodb output plugin?

For example, below I'm using only one output, but the idea is to have several and let Fluentd send the logs to the right MongoDB database based on the value of the field kubernetes_namespace_name:

<source>
  @type tail
  @label @KUBERNETES
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S
  tag kubernetes.*
  format json
  keep_time_key true
  read_from_head true
</source>

<label @KUBERNETES>
  <filter kubernetes.**>
    @type kubernetes_metadata
    kubernetes_url "#{ENV['K8S_HOST_URL']}"
    bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
    ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    include_namespace_id true
  </filter>
  <filter kubernetes.**>
    @type flatten_hash
    separator _
  </filter>

  # < Tag 'kubernetes.namespace.default' is created here somehow >

  <match kubernetes.namespace.default>
    @type mongo
    host "#{ENV['MONGO_HOST']}"
    port "#{ENV['MONGO_PORT']}"
    database "#{ENV['MONGO_DATABASE']}"
    collection "#{ENV['MONGO_COLLECTION']}"
    capped
    capped_size 1024m
    user "#{ENV['MONGO_USER']}"
    password "#{ENV['MONGO_PASSWORD']}"
    time_key time
    flush_interval 10s
  </match>
</label>
-- amofasser
fluentd
kubernetes
mongodb

1 Answer

9/5/2017

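Instead of using the tag, you can filter on the record content with Fluentd's grep filter. Add the filter after the Kubernetes metadata filter. Note that the flattened key kubernetes_namespace_name only exists once the flatten_hash filter has run, so if you match on that key the grep filter belongs after the flattener rather than before it. You can then specify the key kubernetes_namespace_name and keep only the records whose value matches the pattern. Because grep drops non-matching records rather than rerouting them, and you may have additional MongoDB outputs, labels can help keep the pipeline for each destination separate.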

Documentation: https://docs.fluentd.org/v0.12/articles/filter_grep

Example:

<filter kubernetes.**>
  @type grep
  <regexp>
    key kubernetes_namespace_name
    pattern cool
  </regexp>
</filter>

<YOUR MONGO CONFIG HERE>
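
To make the label idea concrete, here is a rough sketch of one way to fan records out to several MongoDB outputs, assuming the copy and relabel output plugins that ship with Fluentd. The label names @NS_DEFAULT and @NS_COOL and the environment variable names are hypothetical, and the mongo blocks are deliberately incomplete. The <regexp> section follows the example above; older v0.12 releases may need the regexp1 key pattern form instead.

```
# Inside <label @KUBERNETES>, after the flatten_hash filter:
# send a copy of every record to one label per destination.
<match kubernetes.**>
  @type copy
  <store>
    @type relabel
    @label @NS_DEFAULT
  </store>
  <store>
    @type relabel
    @label @NS_COOL
  </store>
</match>

# At the top level of the config: each label keeps only its own
# namespace and writes to its own MongoDB server.
<label @NS_DEFAULT>
  <filter kubernetes.**>
    @type grep
    <regexp>
      key kubernetes_namespace_name
      pattern ^default$
    </regexp>
  </filter>
  <match kubernetes.**>
    @type mongo
    # MONGO_HOST_DEFAULT is a hypothetical variable name
    host "#{ENV['MONGO_HOST_DEFAULT']}"
    # add port, database, collection, user, etc. as in the question
  </match>
</label>

<label @NS_COOL>
  <filter kubernetes.**>
    @type grep
    <regexp>
      key kubernetes_namespace_name
      pattern cool
    </regexp>
  </filter>
  <match kubernetes.**>
    @type mongo
    # MONGO_HOST_COOL is a hypothetical variable name
    host "#{ENV['MONGO_HOST_COOL']}"
    # add port, database, collection, user, etc. as in the question
  </match>
</label>
```

The trade-off of this layout is that every record is copied once per destination and then discarded by the grep filters that do not match, which is simple to reason about but does more work as the number of namespaces grows.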
-- Anurag Gupta
Source: StackOverflow