I have installed metrics-server on Kubernetes, but it's not working and it logs:
unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:xxx: unable to fetch metrics from Kubelet ... (X.X): Get https:....: x509: cannot validate certificate for 1x.x.
x509: certificate signed by unknown authority
I was able to get metrics after I modified the deployment YAML and added:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
This now collects metrics, and kubectl top node returns results,
but the logs still show:
E1120 11:58:45.624974 1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-6z6qz: no metrics known for pod
E1120 11:58:45.625289 1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-rzvfj: no metrics known for pod
E1120 12:00:06.462505 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-1x.x.x.eu-west-1.compute.internal: unable to get CPU for container ...discarding data: missing cpu usage metric, unable to fully scrape metrics from source
So, my questions:
1) All of this works on minikube but not on my dev cluster. Why would that be?
2) In production I don't want to use --kubelet-insecure-tls, so can someone please explain why this issue arises, or point me to some resource?
kubeadm generates the kubelet certificate at /var/lib/kubelet/pki, and those files (kubelet.crt and kubelet.key) are signed by a different CA from the one used to generate all the other certificates at /etc/kubernetes/pki.
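One way to confirm this on a node is to check whether the kubelet certificate verifies against the cluster CA. This is a minimal sketch, assuming openssl is installed on the node; signed_by_ca and show_issuer are hypothetical helper names, not part of kubeadm or openssl.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are made up for illustration.

# Succeeds (exit 0) if $2 chains to the CA certificate in $1.
signed_by_ca() {
  local ca="$1" cert="$2"
  # "openssl verify" prints "<file>: OK" on success; match on that text
  # rather than relying on the exit status alone.
  openssl verify -CAfile "$ca" "$cert" 2>/dev/null | grep -q ': OK$'
}

# Prints the issuer and subject of a certificate so that the two CAs
# can be compared by eye.
show_issuer() {
  openssl x509 -in "$1" -noout -issuer -subject
}
```

On a node this could be called as signed_by_ca /etc/kubernetes/pki/ca.crt /var/lib/kubelet/pki/kubelet.crt. A non-zero exit status means the kubelet certificate was not issued by that CA, which is consistent with the x509 error in the metrics-server log.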
You need to regenerate the kubelet certificates so that they are signed by your root CA (/etc/kubernetes/pki/ca.crt).
You can use openssl or cfssl to generate the new certificates (I am using cfssl):
$ mkdir certs; cd certs
$ cp /etc/kubernetes/pki/ca.crt ca.pem
$ cp /etc/kubernetes/pki/ca.key ca-key.pem
Create a file kubelet-csr.json:
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "<node_name>",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "US",
    "ST": "NY",
    "L": "City",
    "O": "Org",
    "OU": "Unit"
  }]
}
Create a ca-config.json file:
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}
Now generate the new certificates using the above files:
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
--config=ca-config.json -profile=kubernetes \
kubelet-csr.json | cfssljson -bare kubelet
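If you prefer openssl over cfssl, roughly the same result can be sketched as below. This is an untested-on-a-cluster outline under stated assumptions: sign_kubelet_cert and its argument order are hypothetical, the node IP is added as a SAN on the assumption that metrics-server connects by InternalIP, and 365 days stands in for the 8760h expiry. The output file names match the cfssl ones (kubelet.pem and kubelet-key.pem).

```shell
#!/usr/bin/env bash
# Sketch only: sign a kubelet serving certificate with an existing CA.
# Usage: sign_kubelet_cert <ca_cert> <ca_key> <node_name> <node_ip> [out_dir]
sign_kubelet_cert() {
  local ca_cert="$1" ca_key="$2" node_name="$3" node_ip="$4" out_dir="${5:-.}"
  local ext="$out_dir/kubelet-ext.cnf"

  # Extensions for the signed certificate. The SAN list is an assumption;
  # it must cover whatever name or address clients use to reach the kubelet.
  cat > "$ext" <<EOF
subjectAltName = DNS:${node_name}, IP:${node_ip}, IP:127.0.0.1
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth, clientAuth
EOF

  # New 2048-bit RSA key and a CSR for it.
  openssl req -new -newkey rsa:2048 -nodes \
    -keyout "$out_dir/kubelet-key.pem" -out "$out_dir/kubelet.csr" \
    -subj "/CN=kubernetes" 2>/dev/null || return 1

  # Sign the CSR with the CA; keep the serial file in out_dir
  # instead of next to the CA certificate.
  openssl x509 -req -in "$out_dir/kubelet.csr" \
    -CA "$ca_cert" -CAkey "$ca_key" \
    -CAserial "$out_dir/ca.srl" -CAcreateserial \
    -days 365 -sha256 -extfile "$ext" \
    -out "$out_dir/kubelet.pem" 2>/dev/null
}
```

From the certs directory this could be run as sign_kubelet_cert ca.pem ca-key.pem <node_name> <nodeip>, once per node, since each node needs its own name and address in the certificate.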
Replace the old certificates with the newly generated ones:
$ scp kubelet.pem <nodeip>:/var/lib/kubelet/pki/kubelet.crt
$ scp kubelet-key.pem <nodeip>:/var/lib/kubelet/pki/kubelet.key
Now restart the kubelet on the node so that the new certificates take effect:
$ systemctl restart kubelet
See the following issue for more context:
https://github.com/kubernetes-incubator/metrics-server/issues/146
Hope this helps.