Difference between VM CPU usage and GKE container CPU usage

12/23/2019

I have a cluster of 2 nodes; each node is a GCE VM with 2 CPUs.

Here is the chart for the VM CPU usage metric: [chart: VM CPU]

Here is the chart for CPU usage from the GKE containers: [chart: GKE CPU]

So why is there so much difference between the 2 metrics? Also, why can the total CPU usage of GKE be higher than 4 seconds (I have 4 cores in total)? [image: Cluster nodes]

PS1: I found that there is a "bug", or at least something not quite right, with the chart in Stackdriver Monitoring. When I change the chart range to 1w, I get something like this: [chart: 1w chart]. If I use 1d, it looks like this: [chart: 1d chart].

So now I only have one question left: why is the total CPU usage from the GKE containers higher than the number of cores?

-- OnlineBiz Software
google-kubernetes-engine

2 Answers

1/7/2020

I found that the GKE container CPU usage was not quite correct: we should filter out the containers named podsgke...... and the containers without a name; then the chart seems to match the VM CPU usage. I guess these are not part of the workload.
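As a rough illustration of that filtering, here is a minimal Python sketch. It assumes the per-container CPU samples have already been exported as (container name, cores used) pairs; the sample values, the function names, and the "infra-" prefix are all hypothetical placeholders, not the actual names from my cluster.

```python
from typing import Iterable, Tuple


def is_workload_container(name: str, excluded_prefixes: Tuple[str, ...]) -> bool:
    """Keep only named containers whose name does not start with an excluded prefix."""
    return bool(name) and not name.startswith(excluded_prefixes)


def total_workload_cpu(
    samples: Iterable[Tuple[str, float]],
    excluded_prefixes: Tuple[str, ...],
) -> float:
    """Sum CPU usage (in cores) over workload containers only."""
    return sum(
        cores for name, cores in samples
        if is_workload_container(name, excluded_prefixes)
    )


# Hypothetical samples: (container name, cores used at one point in time).
samples = [
    ("web", 0.5),
    ("", 0.25),             # container without a name: filtered out
    ("infra-agent", 0.25),  # matches the hypothetical excluded prefix
    ("worker", 0.5),
]

unfiltered_total = sum(cores for _, cores in samples)
workload_total = total_workload_cpu(samples, ("infra-",))
print(unfiltered_total, workload_total)
```

With these made-up numbers the unfiltered sum is larger than the workload-only sum, which is the same effect I saw in the chart.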

-- OnlineBiz Software
Source: StackOverflow

12/23/2019

GCE measures the overall CPU usage of a VM, which includes every process being run (containers, daemons, OS overhead, etc.), whereas the GKE container metric only looks at specific container metrics. The container is a single process.

Also, the metric value you are looking at is not utilization; according to the Stackdriver metrics reference page, utilization is measured as a percentage, not in seconds. The graph you are looking at shows seconds on the right-hand side, but the important value is the one on the left of the graph, which should be a percentage.

Utilization is the percentage of CPU used vs. the total CPU available. At the GCE level, this means the CPU used by all processes of the OS vs. the total CPU allotted (2 CPUs). For the container, this is the CPU used by the container process vs. the CPU allocated by k8s. The sum of the containers will not result in the same value as that of the VM, and it is possible for the container CPU utilization to go over 100%.
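To make the arithmetic concrete, here is a minimal Python sketch of how a utilization figure can be derived from CPU-seconds. The function name and all numbers are hypothetical, and it assumes "CPU allocated by k8s" means a fractional core allocation such as 0.25 cores; it is not the exact formula Stackdriver uses.

```python
def cpu_utilization(
    usage_seconds_delta: float,
    interval_seconds: float,
    allocated_cores: float,
) -> float:
    """Return CPU utilization as a fraction (1.0 == 100%) of the allocated cores.

    usage_seconds_delta: CPU-seconds consumed during the interval.
    interval_seconds:    wall-clock length of the interval.
    allocated_cores:     cores available to the VM or allocated to the container.
    """
    if interval_seconds <= 0 or allocated_cores <= 0:
        raise ValueError("interval_seconds and allocated_cores must be positive")
    cores_used = usage_seconds_delta / interval_seconds
    return cores_used / allocated_cores


# Hypothetical VM: 2 cores, 60 CPU-seconds consumed in a 60 s window.
vm_utilization = cpu_utilization(60.0, 60.0, 2.0)

# Hypothetical container: 0.25 cores allocated, 30 CPU-seconds in the same window.
container_utilization = cpu_utilization(30.0, 60.0, 0.25)

print(f"VM: {vm_utilization:.0%}, container: {container_utilization:.0%}")
```

In this made-up case the container is using only half a core, yet its utilization is 200% because it is measured against its own small allocation, while the VM as a whole sits at 50%.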

-- Patrick W
Source: StackOverflow