Enabling Stackdriver monitoring crashes the metadata-agent pod

1/30/2019

List of pods created when enabling monitoring:

➜ kubectl get pods --namespace=kube-system | grep metadata-agent
NAME                                                READY   STATUS    RESTARTS   AGE
metadata-agent-cluster-level-579ffb7c5f-vm8q8       1/1     Running   908        3d
metadata-agent-gdnb6                                1/1     Running   908        3d
metadata-agent-q7vct                                1/1     Running   885        3d
metadata-agent-rcfl8                                1/1     Running   907        3d
metadata-agent-vvtss                                1/1     Running   908        3d
metadata-agent-zvz6f                                1/1     Running   816        3d

Logs from metadata-agent:

➜ kubectl logs pods/metadata-agent-gdnb6  --namespace=kube-system
I0130 10:32:38 7eff97c7f740 updater.cc:40 Not starting DockerUpdater
I0130 10:32:38 7eff97c7f740 kubernetes.cc:1324 Watching for node-level metadata
I0130 10:32:38 7eff94e58700 kubernetes.cc:1163 Watch thread (pods) started for node gke-rain-rain-node-pool-16891a38-p99s
I0130 10:32:38 7eff8effd700 kubernetes.cc:1203 Watch thread (node) started for node gke-rain-rain-node-pool-16891a38-p99s
I0130 10:32:38 7eff7ffff700 reporter.cc:46 Metadata reporter started
I0130 10:32:41 7eff7ffff700 environment.cc:270 No credentials found at /etc/google/auth/application_default_credentials.json
I0130 10:32:41 7eff7ffff700 environment.cc:146 Got project id from metadata server: 11111111
I0130 10:32:41 7eff7ffff700 oauth2.cc:283 Getting auth token from metadata server
E0130 10:32:41 7eff7ffff700 reporter.cc:64 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
E0130 10:33:41 7eff7ffff700 reporter.cc:64 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
E0130 10:34:41 7eff7ffff700 reporter.cc:64 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
E0130 10:35:41 7eff7ffff700 reporter.cc:64 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
E0130 10:36:41 7eff7ffff700 reporter.cc:64 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
E0130 10:37:41 7eff7ffff700 reporter.cc:64 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected

Metadata:

  • GKE 1.11.6-gke.3
  • Enabled Stackdriver monitoring via the Cloud Console.

Note:

  • This happens only when Stackdriver monitoring is enabled after the cluster has been created (not as part of cluster creation).
-- Jaipradeesh
google-cloud-platform
google-kubernetes-engine
kubernetes

1 Answer

1/30/2019

Google Kubernetes Engine uses fluentd as its default logging agent. Based on my research, I think you did a manual installation, and the Stackdriver Kubernetes Monitoring documentation says the following about that:

Caution: Manual installation on GKE is not recommended. Manual installation was offered to avoid a temporary problem with installing the managed support for Stackdriver Kubernetes Monitoring. That problem has been eliminated. Please see Installing Stackdriver Kubernetes Monitoring to install or upgrade to the most recent version.

My recommendation is to use the default agent to avoid this kind of issue.
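
After switching agents, one rough way to check whether the metadata-agent pods are still crash-looping is to watch the RESTARTS column from the listing in the question. The sketch below is only an illustration: `high_restarts` is a hypothetical helper name, and it assumes the default `kubectl get pods` column layout shown above (RESTARTS as the fourth column).

```shell
# Hypothetical helper (not part of kubectl): reads `kubectl get pods` output
# on stdin and prints the name and restart count of every pod whose
# RESTARTS column (assumed to be column 4) is at least the given threshold.
high_restarts() {
  # The regex check skips the header row, where column 4 is the word RESTARTS.
  awk -v min="$1" '$4 ~ /^[0-9]+$/ && $4 + 0 >= min + 0 { print $1, $4 }'
}

# Usage (assumes kubectl is already configured for the cluster):
#   kubectl get pods --namespace=kube-system | grep metadata-agent | high_restarts 100
```

If the counts keep climbing between two runs a few minutes apart, the pods are still restarting.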

-- kornshell93
Source: StackOverflow