Unable to fully collect metrics when installing metrics-server

11/20/2018

I have installed metrics-server on Kubernetes, but it's not working and the logs show:

unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:xxx: unable to fetch metrics from Kubelet ... (X.X): Get https:....: x509: cannot validate certificate for 1x.x.

x509: certificate signed by unknown authority

I was able to get metrics once I modified the deployment YAML and added:

command:
  - /metrics-server
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP

This now collects metrics, and kubectl top node returns results, but the logs still show:

E1120 11:58:45.624974       1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-6z6qz: no metrics known for pod
E1120 11:58:45.625289       1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-rzvfj: no metrics known for pod
E1120 12:00:06.462505       1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-1x.x.x.eu-west-1.compute.internal: unable to get CPU for container ...discarding data: missing cpu usage metric, unable to fully scrape metrics from source

So, my questions:

1) All of this works on minikube but not on my dev cluster. Why would that be?

2) In production I don't want to use --kubelet-insecure-tls, so can someone please explain why this issue arises, or point me to a resource?

-- user1555190
kubernetes

1 Answer

11/20/2018

Kubeadm generates the kubelet certificates (kubelet.crt and kubelet.key) in /var/lib/kubelet/pki, and they are signed by a different CA from the one used to generate all the other certificates in /etc/kubernetes/pki.

You need to regenerate the kubelet certificates so that they are signed by your root CA (/etc/kubernetes/pki/ca.crt).
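Before regenerating anything, it can be worth confirming that the kubelet certificate really is not signed by the cluster CA. A minimal sketch using openssl, assuming the kubeadm paths mentioned above; the function name signed_by_ca is only illustrative:

```shell
#!/usr/bin/env bash
# Sketch: check whether a certificate was issued by a given CA.
# signed_by_ca is an illustrative helper name, not part of any tool.

# Usage: signed_by_ca <ca-cert> <cert>
# Returns 0 if <cert> verifies against <ca-cert>, non-zero otherwise.
signed_by_ca() {
  local ca="$1" cert="$2"
  # Print the issuer so a mismatch is visible at a glance.
  openssl x509 -in "$cert" -noout -issuer
  # openssl verify prints "<file>: OK" when the chain validates.
  openssl verify -CAfile "$ca" "$cert" 2>/dev/null | grep -q ': OK$'
}

# On a kubeadm node (paths as in the answer), run as root:
#   signed_by_ca /etc/kubernetes/pki/ca.crt /var/lib/kubelet/pki/kubelet.crt \
#     && echo "signed by cluster CA" || echo "NOT signed by cluster CA"
```

If the second message is printed, the certificate does not chain to the cluster CA and regenerating it as described here is likely the right fix.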

You can use openssl or cfssl to generate the new certificates (I am using cfssl):

$ mkdir certs; cd certs
$ cp /etc/kubernetes/pki/ca.crt ca.pem
$ cp /etc/kubernetes/pki/ca.key ca-key.pem

Create a file kubelet-csr.json:

{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "<node_name>",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "US",
    "ST": "NY",
    "L": "City",
    "O": "Org",
    "OU": "Unit"
  }]
}

Create a ca-config.json file:

{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}

Now generate the new certificates using above files:

$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
--config=ca-config.json -profile=kubernetes \
kubelet-csr.json | cfssljson -bare kubelet
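If cfssl is not available, roughly the same result can be obtained with openssl alone. This is a sketch under assumptions, not a drop-in equivalent of the cfssl profile: the helper name gen_kubelet_cert, the kubelet-ext.cnf file name and the reduced SAN list (127.0.0.1 plus the node name) are my own choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch: sign a kubelet certificate with openssl instead of cfssl.
# gen_kubelet_cert and kubelet-ext.cnf are illustrative names.

# Usage: gen_kubelet_cert <ca-cert> <ca-key> <node-name> [out-dir]
gen_kubelet_cert() {
  local ca="$1" ca_key="$2" node="$3" out="${4:-.}"

  # 2048-bit RSA key, matching the "key" section of kubelet-csr.json.
  openssl genrsa -out "$out/kubelet-key.pem" 2048 2>/dev/null

  # CSR with the same CN as kubelet-csr.json.
  openssl req -new -key "$out/kubelet-key.pem" -subj "/CN=kubernetes" \
    -out "$out/kubelet.csr"

  # SANs and usages. <node-name> is written as a DNS name; add IP: or
  # further DNS: entries if your nodes are reached by other addresses.
  printf 'subjectAltName=IP:127.0.0.1,DNS:%s\nextendedKeyUsage=serverAuth,clientAuth\n' \
    "$node" > "$out/kubelet-ext.cnf"

  # Sign with the CA for 365 days (the 8760h expiry in ca-config.json).
  openssl x509 -req -in "$out/kubelet.csr" -CA "$ca" -CAkey "$ca_key" \
    -CAcreateserial -days 365 -extfile "$out/kubelet-ext.cnf" \
    -out "$out/kubelet.pem" 2>/dev/null
}

# Example, run from the certs directory created earlier:
#   gen_kubelet_cert ca.pem ca-key.pem <node_name>
```

This writes kubelet.pem and kubelet-key.pem, the same file names that cfssljson -bare kubelet produces, so the copy step is unchanged.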

Replace the old certificates with the newly generated ones:

$ scp kubelet.pem <nodeip>:/var/lib/kubelet/pki/kubelet.crt
$ scp kubelet-key.pem <nodeip>:/var/lib/kubelet/pki/kubelet.key

Now restart the kubelet on the node so that the new certificates take effect:

$ systemctl restart kubelet
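To check that the kubelet is now serving a certificate that chains to the cluster CA, something like the following may help. It assumes the kubelet listens on its default port 10250 and that openssl is installed where you run it; kubelet_cert_check is a hypothetical helper name. Note that it only checks the chain of trust, not whether the name or IP you connect to is listed in the certificate.

```shell
#!/usr/bin/env bash
# Sketch: fetch the kubelet's serving certificate and validate its chain.
# kubelet_cert_check is an illustrative name; 10250 is the kubelet's
# default HTTPS port and may differ on your cluster.

# Usage: kubelet_cert_check <node-ip-or-name> <ca-cert> [port]
kubelet_cert_check() {
  local host="$1" ca="$2" port="${3:-10250}"
  # s_client prints a "Verify return code" line describing the chain check;
  # stdin is closed so the command exits after the handshake.
  openssl s_client -connect "$host:$port" -CAfile "$ca" </dev/null 2>/dev/null \
    | grep 'Verify return code'
}

# Example:
#   kubelet_cert_check <nodeip> /etc/kubernetes/pki/ca.crt
```

A line containing "Verify return code: 0 (ok)" would suggest the new certificate is in place; any other code suggests the kubelet is still serving a certificate the CA does not recognize.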

See the following ticket for context on the issue:

https://github.com/kubernetes-incubator/metrics-server/issues/146

Hope this helps.

-- Prafull Ladha
Source: StackOverflow