Detect with Prometheus whether I am getting too close to Kubernetes container CPU limits

4/19/2019

I would like to use Prometheus (Grafana/alerting) to detect whether my containers' actual CPU usage is above or below their CPU requests, and whether it is getting too close to their CPU limits.

For memory consumption, I managed to do it with these queries:

sum by(container_name, pod_name)(container_memory_usage_bytes{namespace=~"myNamespace",pod_name=~"myPodName",container_name=~"myContainerName"})
kube_pod_container_resource_requests_memory_bytes{namespace=~"myNamespace",pod=~"myPodName",container=~"myContainerName"}
kube_pod_container_resource_limits_memory_bytes{namespace=~"myNamespace",pod=~"myPodName", container=~"myContainerName"}

I would like to achieve the same for CPU, for instance by using container_cpu_usage_seconds_total, but I have not managed to link it to kube_pod_container_resource_requests_cpu_cores, and I am not sure these two metrics are comparable.

Any suggestions for this?

-- scoulomb
grafana
kubernetes
openshift
prometheus

1 Answer

4/19/2019

I use this query to get the percentage of its CPU limits that a pod is using:

sum(label_replace(rate(container_cpu_usage_seconds_total{container_name =~ ".+"}[1m]), "pod", "$1", "pod_name", "(.*)")) by (pod, namespace) /
sum(kube_pod_container_resource_limits_cpu_cores{}) by (pod, namespace) * 100
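The two metrics are comparable because container_cpu_usage_seconds_total is a counter of CPU-seconds, so rate() over it gives CPU-seconds per second, i.e. cores in use, which is the same unit as the *_cpu_cores metrics. The question also asked about requests; a minimal sketch of that comparison, assuming the same metric and label names as in this thread, only swaps the denominator of the limits query above:

```promql
# CPU usage as a percentage of CPU requests, per pod.
# Same shape as the limits query above; only the denominator changes.
# Assumes the label names used in this thread (pod_name/container_name on
# the cAdvisor metric, pod/container on the kube-state-metrics metric).
sum(label_replace(rate(container_cpu_usage_seconds_total{container_name=~".+"}[1m]), "pod", "$1", "pod_name", "(.*)")) by (pod, namespace)
/
sum(kube_pod_container_resource_requests_cpu_cores) by (pod, namespace)
* 100
```

A value above 100 means the pod is using more CPU than it requested. Newer Kubernetes and kube-state-metrics releases renamed some of these labels and metrics, so the names may need adjusting on a current cluster.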

And for RAM:

sum(container_memory_working_set_bytes) by (container_name, namespace) / 
sum(label_join(kube_pod_container_resource_limits_memory_bytes, "container_name", "", "container")) by (container_name, namespace) * 100
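For alerting, one possible approach is to add a comparison to a percentage query so that only series above a threshold are returned. The sketch below wraps the RAM query above; the value 90 is a hypothetical threshold chosen for illustration, not a recommendation:

```promql
# Return only containers using more than 90% of their memory limit.
# The 90 threshold is an illustrative assumption; tune it to your workload.
(
  sum(container_memory_working_set_bytes) by (container_name, namespace)
  /
  sum(label_join(kube_pod_container_resource_limits_memory_bytes, "container_name", "", "container")) by (container_name, namespace)
  * 100
) > 90
```

The same `> threshold` filter can be appended to the CPU query in the same way.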
-- Vasily Angapov
Source: StackOverflow