Fluentd fails flushing to Elasticsearch (K8s, AWS)

3/15/2018

I've got a Kubernetes cluster (v1.9) running on AWS. There's an Elasticsearch instance up and running, working fine.

For some reason, the Fluentd logs never arrive at Elasticsearch.

The Fluentd DaemonSet:

{
  "kind": "DaemonSet",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "fluentd-es-v2.0.3",
    "namespace": "kube-system",
    "selfLink": "/apis/extensions/v1beta1/namespaces/kube-system/daemonsets/fluentd-es-v2.0.3",
    "uid": "f0a23779-fba7-11e7-a9be-12d5302c43be",
    "resourceVersion": "10549372",
    "generation": 1,
    "creationTimestamp": "2018-01-17T17:00:37Z",
    "labels": {
      "addonmanager.kubernetes.io/mode": "Reconcile",
      "k8s-app": "fluentd-es",
      "kubernetes.io/cluster-service": "true",
      "version": "v2.0.3"
    }
  },
  "spec": {
    "selector": {
      "matchLabels": {
        "k8s-app": "fluentd-es",
        "version": "v2.0.3"
      }
    },
    "template": {
      "metadata": {
        "creationTimestamp": null,
        "labels": {
          "k8s-app": "fluentd-es",
          "kubernetes.io/cluster-service": "true",
          "version": "v2.0.3"
        },
        "annotations": {
          "scheduler.alpha.kubernetes.io/critical-pod": ""
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "varlog",
            "hostPath": {
              "path": "/var/log",
              "type": ""
            }
          },
          {
            "name": "varlibdockercontainers",
            "hostPath": {
              "path": "/var/lib/docker/containers",
              "type": ""
            }
          },
          {
            "name": "libsystemddir",
            "hostPath": {
              "path": "/usr/lib64",
              "type": ""
            }
          },
          {
            "name": "config-volume",
            "configMap": {
              "name": "fluentd-es-config-v0.1.2",
              "defaultMode": 420
            }
          }
        ],
        "containers": [
          {
            "name": "fluentd-es",
            "image": "gcr.io/google-containers/fluentd-elasticsearch:v2.0.3",
            "env": [
              {
                "name": "FLUENTD_ARGS",
                "value": "--no-supervisor -q"
              }
            ],
            "resources": {
              "limits": {
                "memory": "500Mi"
              },
              "requests": {
                "cpu": "100m",
                "memory": "200Mi"
              }
            },
            "volumeMounts": [
              {
                "name": "varlog",
                "mountPath": "/var/log"
              },
              {
                "name": "varlibdockercontainers",
                "readOnly": true,
                "mountPath": "/var/lib/docker/containers"
              },
              {
                "name": "libsystemddir",
                "readOnly": true,
                "mountPath": "/host/lib"
              },
              {
                "name": "config-volume",
                "mountPath": "/etc/fluent/config.d"
              }
            ],
            "livenessProbe": {
              "exec": {
                "command": [
                  "/bin/sh",
                  "-c",
                  "LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300}; STUCK_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-900}; if [ ! -e /var/log/fluentd-buffers ]; then\n  exit 1;\nfi; LAST_MODIFIED_DATE=`stat /var/log/fluentd-buffers | grep Modify | sed -r \"s/Modify: (.*)/\\1/\"`; LAST_MODIFIED_TIMESTAMP=`date -d \"$LAST_MODIFIED_DATE\" +%s`; if [ `date +%s` -gt `expr $LAST_MODIFIED_TIMESTAMP + $STUCK_THRESHOLD_SECONDS` ]; then\n  rm -rf /var/log/fluentd-buffers;\n  exit 1;\nfi; if [ `date +%s` -gt `expr $LAST_MODIFIED_TIMESTAMP + $LIVENESS_THRESHOLD_SECONDS` ]; then\n  exit 1;\nfi;\n"
                ]
              },
              "initialDelaySeconds": 600,
              "timeoutSeconds": 1,
              "periodSeconds": 60,
              "successThreshold": 1,
              "failureThreshold": 3
            },
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 30,
        "dnsPolicy": "ClusterFirst",
        "nodeSelector": {
          "beta.kubernetes.io/arch": "amd64"
        },
        "securityContext": {},
        "schedulerName": "default-scheduler"
      }
    },
    "updateStrategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxUnavailable": 1
      }
    },
    "templateGeneration": 1,
    "revisionHistoryLimit": 10
  },
  "status": {
    "currentNumberScheduled": 2,
    "numberMisscheduled": 0,
    "desiredNumberScheduled": 2,
    "numberReady": 2,
    "observedGeneration": 1,
    "updatedNumberScheduled": 2,
    "numberAvailable": 2
  }
}

When looking at the pod's logs, there is an error:

  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-6.0.0/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-6.0.0/lib/elasticsearch/transport/client.rb:131:in `perform_request'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-api-6.0.0/lib/elasticsearch/api/actions/ping.rb:20:in `ping'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:163:in `client'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:364:in `rescue in send_bulk'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:359:in `send_bulk'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:346:in `write_objects'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/output.rb:490:in `write'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/buffer.rb:354:in `write_chunk'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/buffer.rb:333:in `pop'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/output.rb:342:in `try_flush'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/output.rb:149:in `run'
2018-03-15 11:20:09 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:20:10 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-6.0.0/lib/elasticsearch/transport/transport/base.rb:202:in `__raise_transport_error'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-6.0.0/lib/elasticsearch/transport/transport/base.rb:319:in `perform_request'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-6.0.0/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-6.0.0/lib/elasticsearch/transport/client.rb:131:in `perform_request'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-api-6.0.0/lib/elasticsearch/api/actions/ping.rb:20:in `ping'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:163:in `client'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:364:in `rescue in send_bulk'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:359:in `send_bulk'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.7/lib/fluent/plugin/out_elasticsearch.rb:346:in `write_objects'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/output.rb:490:in `write'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/buffer.rb:354:in `write_chunk'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/buffer.rb:333:in `pop'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/output.rb:342:in `try_flush'
  2018-03-15 11:20:09 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.42/lib/fluent/output.rb:149:in `run'
2018-03-15 11:20:10 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:20:12 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:20:10 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:20:12 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:20:16 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:20:12 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:20:16 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:20:25 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:20:16 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:20:25 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:20:39 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:20:25 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:20:39 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:21:08 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:20:39 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:21:08 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:21:38 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:21:08 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:21:38 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:22:08 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:21:38 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:22:08 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:22:38 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:22:08 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:22:38 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:23:08 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:22:38 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:23:08 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:23:38 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:23:08 +0000 [warn]: suppressed same stacktrace
2018-03-15 11:23:38 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-03-15 11:24:08 +0000 error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403] " plugin_id="object:3f7e3ab9f4b8"
  2018-03-15 11:23:38 +0000 [warn]: suppressed same stacktrace

I can't figure out what is causing the "temporarily failed to flush the buffer. error_class="Elasticsearch::Transport::Transport::Errors::Forbidden" error="[403]"" problem, and I haven't been able to fix it.

help?

-- Amit Liber
elasticsearch
fluentd
kubernetes

2 Answers

6/27/2018

I had a similar problem. I wrote up a detailed response to another StackOverflow question, so I won't duplicate it here.

I'm using a different Fluentd Docker image and DaemonSet configuration than you are. Here are the settings I used in the <match **> section of my Fluentd configuration file:

  • The host field should not include the https:// prefix of the URL.
  • Set the port to 443.
  • Set the scheme to https.
  • There's no username or password for the AWS ES service, so those fields should not be in the configuration. See this discussion.
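Put together, a minimal sketch of such a <match **> section could look like the following (the endpoint below is a placeholder; substitute your own AWS ES domain endpoint):

```
<match **>
  @type elasticsearch
  # Endpoint hostname only, without the https:// prefix (placeholder domain)
  host search-my-domain.us-east-1.es.amazonaws.com
  port 443
  scheme https
  # No user/password entries: the AWS ES service does not use basic auth
  logstash_format true
</match>
```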
-- RobotNerd
Source: StackOverflow

3/15/2018

Looks like you don't have permission to write to Elasticsearch from your Fluentd daemon. That can happen, for example, because of a wrong IAM role on the nodes where Fluentd is running.

So, you have several ways to fix it:

  1. You can attach the correct IAM role to the nodes where Kubernetes is running, following this AWS guide.
  2. You can also assign the right IAM role directly to your pod using Kube2iam.
  3. It's less secure, but you can allow access to ES from your Kubernetes nodes in the VPC by their IP addresses. Here is a guide about IP-based policies.
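As a rough illustration of option 3, an IP-based access policy attached to the ES domain looks something like this (the account ID, domain name, and CIDR range are placeholders):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/my-es-domain/*",
      "Condition": {
        "IpAddress": { "aws:SourceIp": ["203.0.113.0/24"] }
      }
    }
  ]
}
```

Until the policy allows the caller, every request (including Fluentd's initial ping) will come back as exactly the [403] you're seeing.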
-- Anton Kostenko
Source: StackOverflow