Duplicate metrics with multiple instances of kube-state-metrics

1/14/2020

Problem:

Duplicate data when querying Prometheus for metrics from kube-state-metrics.

Sample query and result with 3 instances of kube-state-metrics running:

Query:

kube_pod_container_resource_requests_cpu_cores{namespace="ns-dummy"}

Metrics:

kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.142:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.142:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.17:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.17:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.37.171:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.37.171:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1

Observation:

Every metric comes up N times when N kube-state-metrics pods are running. With a single pod running, we get the correct data.
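
The only label that differs between the duplicate series is instance (the kube-state-metrics pod that served the scrape), so as a query-side workaround the duplicates can be collapsed by aggregating that label away, for example:

max without (instance) (kube_pod_container_resource_requests_cpu_cores{namespace="ns-dummy"})

This hides the duplication in queries but does not remove the redundant scraping, which is why we are considering the options below.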

Possible solutions:

  1. Scale down to a single instance of kube-state-metrics. (Reduced availability is a concern.)
  2. Enable sharding. (Solves the duplication problem, but each shard is still a single pod, so availability is still reduced.)

According to the docs, for horizontal scaling we have to pass sharding arguments to the pods.

Shards are zero-indexed, so we have to pass each pod its shard index and the total number of shards.

We are using the Helm chart, and kube-state-metrics is deployed as a Deployment.
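
A rough sketch of what those sharding arguments could look like on the container spec (the image tag and the fixed shard index are only illustrative; with a plain Deployment every replica would receive the same values, which is what question 1 below is about):

containers:
  - name: kube-state-metrics
    image: quay.io/coreos/kube-state-metrics:v1.9.0   # example tag
    args:
      - --shard=0          # zero-indexed shard handled by this pod
      - --total-shards=3   # total number of kube-state-metrics pods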

Questions:

  1. How can we pass different arguments to different pods in this scenario, if it is possible at all?
  2. Should we be worried about the availability of kube-state-metrics, given the self-healing nature of Kubernetes workloads?
  3. When should we really scale it out to multiple instances, and how?
-- shintocv
devops
kube-state-metrics
kubernetes
monitoring
prometheus

1 Answer

1/21/2020

You could use a 'self-healing' Deployment with only a single replica of kube-state-metrics: if the container goes down, the Deployment will start a new one. The brief downtime will only affect you if your cluster is very large and produces many object changes per second.

It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods.
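
A minimal sketch of that single-replica setup (the probe path and ports follow the upstream kube-state-metrics manifests; your Helm chart values may differ):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
spec:
  replicas: 1                                 # one instance, so no duplicate series
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics  # needs RBAC to list/watch cluster objects
      containers:
        - name: kube-state-metrics
          image: quay.io/coreos/kube-state-metrics:v1.9.0   # example tag
          ports:
            - name: http
              containerPort: 8080
          livenessProbe:                      # the kubelet restarts the container if this fails
            httpGet:
              path: /healthz
              port: 8080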

For a small cluster there is no problem running it this way. If you really need a highly available monitoring platform, I recommend you take a look at these two articles: creating a well designed and highly available monitoring stack for Kubernetes, and Kubernetes monitoring.

-- KoopaKiller
Source: StackOverflow