Last night my Kubernetes cluster on GKE was upgraded to 1.16.8-gke.9. Since then, the logs show "error: unable to find container named fluentd-gcp" every minute. Logging from my applications still works, but I'd like to know what causes this error and how to get rid of it.
Expanding the error yields slightly more details:
{
  "textPayload": "error: unable to find container named fluentd-gcp\n",
  "insertId": "v1b2u2ldrnswujhz2",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "project_id": "foo",
      "pod_name": "fluentd-gke-scaler-cd4d654d7-tgg27",
      "cluster_name": "foo-cluster",
      "container_name": "fluentd-gke-scaler",
      "namespace_name": "kube-system",
      "location": "us-east1-d"
    }
  },
  "timestamp": "2020-04-24T16:15:40.224944500Z",
  "severity": "ERROR",
  "labels": {
    "gke.googleapis.com/log_type": "system",
    "k8s-pod/k8s-app": "fluentd-gke-scaler",
    "k8s-pod/pod-template-hash": "cd4d654d7"
  },
  "logName": "projects/foo/logs/stderr",
  "receiveTimestamp": "2020-04-24T16:15:45.923960735Z"
}
kubectl get all --all-namespaces shows fluentd-gke pods with a fluentd-gke container, not fluentd-gcp.
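To compare the container names in the cluster with the name in the error, something like the following sketch may help. The function names are made up for illustration, and the Deployment name fluentd-gke-scaler is only inferred from the pod name in the log entry above, so treat it as an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of GKE or kubectl.

# Print "<pod name><TAB><container names>" for every kube-system pod,
# keeping only the lines that mention fluentd.
list_fluentd_containers() {
  kubectl get pods -n kube-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}' \
    | grep fluentd
}

# Dump the spec of the component that emits the error, to see which
# container name it is configured to look for.
# Assumption: the Deployment is called fluentd-gke-scaler (inferred from
# the pod name fluentd-gke-scaler-cd4d654d7-tgg27).
show_scaler_spec() {
  kubectl get deployment fluentd-gke-scaler -n kube-system -o yaml
}

# Usage:
#   list_fluentd_containers
#   show_scaler_spec
```

If the second command shows the scaler referring to fluentd-gcp while the pods only contain fluentd-gke, that would match the mismatch described above.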
Any advice would be appreciated, and I'm happy to post more details if you tell me where to look for them.
Edit: More details and related problems on the GKE issue tracker: https://issuetracker.google.com/issues/156965162
1.16.8-gke.9 is currently offered through the rapid channel. Keep in mind that this channel is offered on an early-access basis so that people can test new releases; as such, the version offered may be subject to unresolved issues with no known workaround. That said, a possible fix could be to drain the affected node and migrate your workloads to another node. If the issue persists, create an issue here.
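A minimal sketch of draining a node, assuming kubectl is already pointed at the cluster. The function name and the node name in the usage comment are placeholders. The --delete-local-data flag matches kubectl versions from around 1.16; newer kubectl releases name it --delete-emptydir-data. Either way it discards emptyDir data on the evicted pods, so check that this is acceptable first.

```shell
#!/usr/bin/env bash
# Sketch only: drain_node is a hypothetical helper, not a GKE command.

# Cordon a node and evict its pods so they get rescheduled elsewhere.
drain_node() {
  local node="$1"
  # Mark the node unschedulable so no new pods land on it.
  kubectl cordon "$node" || return 1
  # Evict running pods. DaemonSet-managed pods are skipped, and pods
  # using emptyDir volumes are evicted (their local data is lost).
  kubectl drain "$node" --ignore-daemonsets --delete-local-data
}

# Usage (placeholder node name; list real ones with `kubectl get nodes`):
#   drain_node my-node-name
```

Once the workloads are running on other nodes, kubectl uncordon makes the drained node schedulable again if you want to keep it.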