I would like to detect with Prometheus (Grafana/alerting) whether my containers' actual CPU usage is above or below their CPU requests, and whether it is getting too close to their CPU limits.
For memory consumption I managed to do it with these queries:
sum by(container_name, pod_name)(container_memory_usage_bytes{namespace=~"myNamespace",pod_name=~"myPodName",container_name=~"myContainerName"})
kube_pod_container_resource_requests_memory_bytes{namespace=~"myNamespace",pod=~"myPodName",container=~"myContainerName"}
kube_pod_container_resource_limits_memory_bytes{namespace=~"myNamespace",pod=~"myPodName", container=~"myContainerName"}
I would like to achieve the same for CPU, for instance by using container_cpu_usage_seconds_total,
but I cannot manage to link it to kube_pod_container_resource_requests_cpu_cores,
and I am not sure these two metrics are comparable.
Any suggestions for this?
I use this query to get the percentage of its CPU limit that each pod is using:
sum(label_replace(rate(container_cpu_usage_seconds_total{container_name =~ ".+"}[1m]), "pod", "$1", "pod_name", "(.*)")) by (pod, namespace) /
sum(kube_pod_container_resource_limits_cpu_cores{}) by (pod, namespace) * 100
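The two CPU metrics are comparable because container_cpu_usage_seconds_total is a counter of CPU-seconds consumed, so its per-second rate is a number of cores, the same unit as kube_pod_container_resource_limits_cpu_cores. Here is a minimal Python sketch of that arithmetic. The helper name and the sample values are made up for illustration, and it only mimics the idea behind rate(), not Prometheus's exact extrapolation logic:

```python
def cpu_percent_of_limit(first_sample, last_sample, limit_cores):
    """Return CPU usage as a percentage of the CPU limit.

    Each sample is a (unix_timestamp_seconds, cpu_seconds_counter_value) pair,
    i.e. two readings of a counter like container_cpu_usage_seconds_total.
    """
    (t0, v0), (t1, v1) = first_sample, last_sample
    # CPU-seconds consumed per wall-clock second = average cores in use.
    cores_used = (v1 - v0) / (t1 - t0)
    # Same unit as the limit (cores), so the ratio is meaningful.
    return cores_used / limit_cores * 100


# Illustrative values: 15 CPU-seconds consumed over 60 s, limit of 0.5 core.
pct = cpu_percent_of_limit((0, 1200.0), (60, 1215.0), 0.5)
print(pct)  # 50.0
```

Passing the request value instead of the limit gives the percentage of the request; in the query above that would presumably mean putting kube_pod_container_resource_requests_cpu_cores in the denominator.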
And for RAM:
sum(container_memory_working_set_bytes) by (container_name, namespace) /
sum(label_join(kube_pod_container_resource_limits_memory_bytes, "container_name", "", "container")) by (container_name, namespace) * 100