metrics-server:v0.4.2 cannot scrape metrics inside an AWS Kubernetes cluster (cannot validate certificate, doesn't contain any IP SANs)

4/20/2021

Situation: I used the kops tool to deploy a Kubernetes cluster into an AWS account. The metrics-server deployment image is k8s.gcr.io/metrics-server/metrics-server:v0.4.2.

The error, and the reason scraping fails, as reported by kubectl -n kube-system logs metrics-server-bcc948649-dsnd6:

 unable to fully scrape metrics: [unable to fully scrape metrics from node ip-10-33-47-106.eu-central-1.compute.internal: unable to fetch metrics from node ip-10-33-47-106.eu-central-1.compute.internal: Get "https://10.33.47.106:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 10.33.47.106 because it doesn't contain any IP SANs, unable to fully scrape metrics from node ip-10-33-50-109.eu-central-1.compute.internal: unable to fetch metrics from node ip-10-33-50-109.eu-central-1.compute.internal: Get "https://10.33.50.109:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 10.33.50.109 because it doesn't contain any IP SANs]

I can work around this easily by modifying the metrics-server deployment template and adding the argument - --kubelet-insecure-tls to the container args, but that disables verification of the kubelet certificates and does not seem like a production-ready solution.

What I want to ask and learn here is: how can I resolve this properly, without sacrificing security?

-- Kristi Jorgji
amazon-web-services
kubernetes
metrics-server

1 Answer

8/17/2021

Kubelet certificates created by kOps contain only the node hostname among their SANs, while metrics-server deployed with the default manifest tries to scrape nodes by their private IPs, so certificate validation fails. Changing the kubelet-preferred-address-types argument so that metrics-server connects by hostname resolves this issue:

- --kubelet-preferred-address-types=Hostname
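
For context, here is a rough sketch of where that argument sits in the metrics-server Deployment in the kube-system namespace. The container name and the other args shown are assumptions based on the upstream default manifest for this version and may differ in your cluster; only the image and the last argument come from this thread.

```yaml
# Fragment of the metrics-server Deployment (namespace: kube-system).
# Everything except the image and the final arg is assumed from the
# upstream default manifest; keep whatever your own manifest already has.
spec:
  template:
    spec:
      containers:
        - name: metrics-server
          image: k8s.gcr.io/metrics-server/metrics-server:v0.4.2
          args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-use-node-status-port
            # Connect to kubelets by hostname, which the kOps-issued
            # kubelet certificates list in their SANs, instead of by IP.
            - --kubelet-preferred-address-types=Hostname
```

With this in place, --kubelet-insecure-tls should not be needed, provided the node hostnames (for example ip-10-33-47-106.eu-central-1.compute.internal) resolve from inside the metrics-server pod.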
-- Stanislav Vitkovskiy
Source: StackOverflow