Grafana for Kubernetes shows CPU usage higher than 100%

4/22/2020

I have 10 Kubernetes nodes (think of them as VMs), each with between 7 and 14 allocatable CPU cores that Kubernetes pods can request. I'd like to show CPU usage for the whole cluster.

This is my current query:

sum(kube_pod_container_resource_requests_cpu_cores{node=~"$node"}) / sum(kube_node_status_allocatable_cpu_cores{node=~"$node"})

This query shows strange results, for example over 400%.

I would like to add a filter so this is only calculated for nodes that have Running pods, since there might be some old node definitions that are no longer used. I inherited this setup, so it is not easy for me to wrap my head around it.

Any suggestions for a query that I can try?

-- Boban
grafana
kubernetes
prometheus
promql

1 Answer

4/22/2020

Your current query sums the CPU requests of the pods on each node rather than measuring actual CPU utilization, so it can show misleading values.

You can check CPU utilization of all pods in the cluster by running:

sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name!=""}[5m]))
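
To turn that into a percentage of cluster capacity, one option is to divide by the allocatable cores from the question's query. This is only a sketch: it assumes kube-state-metrics and cAdvisor are scraped by the same Prometheus, and it drops the $node filter because the node label on cAdvisor series depends on your relabeling config.

```
# Assumption: both metrics live in the same Prometheus; no per-node filter is applied.
sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name!=""}[5m]))
  / sum(kube_node_status_allocatable_cpu_cores) * 100
```

On Kubernetes 1.16 and later the cAdvisor labels are container and pod rather than container_name and pod_name, so adjust the selectors if the query returns nothing.
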

If you want to check the CPU usage of each running pod, you can use:

sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name!=""}[5m])) by (pod_name)
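
For the filter the question asks about (only counting pods that are Running), a common pattern is to join the requests metric against kube_pod_status_phase from kube-state-metrics. A sketch, assuming that metric is exposed with namespace and pod labels and that each pod is reported by a single kube-state-metrics instance:

```
# Assumption: kube_pod_status_phase is scraped and has namespace/pod labels.
sum(
  kube_pod_container_resource_requests_cpu_cores{node=~"$node"}
  * on(namespace, pod) group_left()
  (kube_pod_status_phase{phase="Running"} == 1)
)
/ sum(kube_node_status_allocatable_cpu_cores{node=~"$node"})
```

The == 1 comparison keeps only pods currently in the Running phase, so requests from completed or failed pods should no longer inflate the ratio.
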
-- KFC_
Source: StackOverflow