403 Forbidden error when metrics-server connects to nodes

4/16/2021

In my cluster, I have metrics-server v0.3.6 running. I recently installed v0.4.2, but the pod for the new version reports a "403 Forbidden" error when it tries to connect to the nodes:

I0416 18:02:22.965541       1 secure_serving.go:197] Serving securely on [::]:4443
I0416 18:02:22.965652       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0416 18:02:22.965662       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0416 18:02:22.965698       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I0416 18:02:22.965721       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0416 18:02:22.965814       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0416 18:02:22.965826       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0416 18:02:22.965841       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0416 18:02:22.965851       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0416 18:02:22.982587       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node $NODE_NAME1: unable to fetch metrics from node $NODE_NAME1: request failed - "403 Forbidden"., unable to fully scrape metrics from node $NODE_NAME2: unable to fetch metrics from node $NODE_NAME2: request failed - "403 Forbidden"., unable to fully scrape metrics from node $NODE_NAME3: unable to fetch metrics from node $NODE_NAME3: request failed - "403 Forbidden".]
I0416 18:02:23.065801       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I0416 18:02:23.065896       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0416 18:02:23.066144       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 

I am running Kubernetes version 1.17.17-gke.2800.

I already tried adding the --kubelet-insecure-tls flag, but it did not help.

Thank you for the help!

-- jackson alves
google-kubernetes-engine
kubernetes

1 Answer

4/19/2021

Based on the information you have provided, the problem appears to affect some of the newly started pods. You can try migrating your workloads to a new node pool (that is, cordon and drain the existing nodes) and check whether the issue persists.

Instructions on how to cordon and drain your existing node pool can be found here:
1. https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/
2. https://cloud.google.com/kubernetes-engine/docs/tutorials/migrating-node-pool#step_4_migrate_the_workloads
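
The cordon-and-drain step could be sketched as a small shell script along these lines. This is only a rough sketch under some assumptions, not the exact procedure from the guides above: the pool name default-pool, the DRY_RUN switch, and the run and migrate_nodes helpers are hypothetical names introduced for illustration. By default it only prints the kubectl commands, so you can review them first.

```shell
#!/bin/sh
# Sketch: cordon, then drain, the nodes of an old GKE node pool.
# POOL is a placeholder; "default-pool" is an assumed pool name.
POOL="${POOL:-default-pool}"
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: takes the node names of the old pool as arguments.
migrate_nodes() {
  # Cordon every node first so evicted pods cannot be rescheduled
  # onto another node of the old pool.
  for node in "$@"; do
    run kubectl cordon "$node"
  done
  # Then evict the pods node by node. On kubectl 1.20 and later,
  # --delete-local-data is deprecated in favor of --delete-emptydir-data.
  for node in "$@"; do
    run kubectl drain "$node" --ignore-daemonsets --delete-local-data
  done
}

# Usage (requires access to the cluster):
#   migrate_nodes $(kubectl get nodes \
#     -l cloud.google.com/gke-nodepool="$POOL" \
#     -o jsonpath='{.items[*].metadata.name}')
```

Once the printed commands look right, running the same call with DRY_RUN=0 would execute them. The new node pool should already exist at that point so the evicted pods have somewhere to be scheduled.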

-- Anthony Leo
Source: StackOverflow