GCP Kubernetes cluster monitoring limit graph

9/13/2019

[screenshot: Stackdriver node monitoring - CPU and memory graphs]

This is my Kubernetes cluster node monitoring. The cluster is running on GKE and uses Stackdriver monitoring and logging.

The node size is 4 vCPU and 15 GB memory. In the CPU graph, why does the limit line spike above the CPU capacity? My node has only 4 vCPU, yet the limit spikes above it.

There is no cluster autoscaler, node autoscaler, or vertical pod autoscaler running.

The same question applies to memory:

[screenshot: Stackdriver memory graph - capacity vs. allocatable]

The machine size is 15 GB, but capacity shows 15.77 GB and allocatable shows 13 GB, meaning about 2 GB is reserved for the Kubernetes system.

For better monitoring, I have also installed the default Kubernetes dashboard:

[screenshot: Kubernetes dashboard - memory usage]

It shows usage of around 10.2 GB, so I should still have 2-3 GB of RAM left. Since allocatable is 13 GB, the system has taken about 2 GB. Am I right?

I have also installed Grafana:

[screenshot: Grafana dashboard - free memory]

It shows only about 450 MB of free RAM (I imported this dashboard).

But if it's using around 10 GB of RAM, then out of 13 GB I should have 2-3 GB remaining.

Update:

kubectl describe node <node>


  Resource           Requests         Limits
  --------           --------         ------
  cpu                3073m (78%)      5990m (152%)
  memory             7414160Ki (58%)  12386704Ki (97%)

If you look at the first Stackdriver graph, as usage increases the RAM limit climbs to 15 GB, but the allocatable (usable) memory is only 13 GB. How?
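For anyone checking the same thing, a rough way to compare capacity vs. allocatable straight from the API (the node name is a placeholder):

  kubectl get node <node> -o jsonpath='{.status.capacity.memory}{"\n"}{.status.allocatable.memory}{"\n"}'
  # prints capacity, then allocatable, in Ki; the gap (~2-3 GB here) is what the system reserves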

-- Harsh Manvar
google-cloud-platform
kubernetes
stackdriver

2 Answers

9/13/2019

Generally, I think the machines have the ability to go briefly over the specified CPU; this is called bursting.
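If you want to see where a limit line above 4 cores could come from, one rough way (just a sketch) is to list the CPU limits each pod declares; Kubernetes allows the sum of limits to exceed node capacity, which matches the 5990m (152%) in your describe output:

  kubectl get pods --all-namespaces \
    -o custom-columns='NAMESPACE:.metadata.namespace,POD:.metadata.name,CPU_LIMIT:.spec.containers[*].resources.limits.cpu'
  # the sum of these limits can legitimately be more than the node's 4 cores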

It may well be that the GKE dashboard, the Kubernetes Dashboard, and Grafana use

  • different sources for these metrics
  • different units

Example: your Google summary shows 15.77 GB. This is not wrong. The machine is probably specified as 15 GB, but internally Google does not calculate in GB, MB, or B; it calculates in kibibytes. When you run kubectl describe nodes <node>, you get the actual value in kibibytes.

e.g. for me it was 15399364 Ki, which equals 15.768948736 GB.
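A quick sanity check of that conversion (assuming bc is available; Ki is a 1024-byte unit, while the console shows decimal GB):

  echo "15399364 * 1024" | bc                   # 15768948736 bytes
  echo "scale=9; 15399364 * 1024 / 10^9" | bc   # 15.768948736 GB (decimal)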

One last thing: in general, the Google Cloud Console is not very accurate in displaying such information. I would always advise you to get the metrics via the command line.
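For example, something like this (assuming the metrics pipeline is running, as it is by default on GKE):

  kubectl top nodes                                                # live CPU/memory usage per node
  kubectl describe node <node> | grep -A 6 "Allocated resources"   # requests/limits vs. allocatable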

-- rootdoot
Source: StackOverflow

9/13/2019

In your case you have two questions, one related to CPU usage and the other to memory usage:

You provided limited information, and CPU and memory usage depend on several aspects, such as the pods running, the number of nodes, etc.

You mention that you aren't using an autoscaler for the nodes.

On this page for Stackdriver Monitoring you can see the containers section; the CPU graph uses “container/cpu/usage_time”, which is described as “Cumulative CPU usage on all cores in seconds. This number divided by the elapsed time represents usage as a number of cores, regardless of any core limit that might be set. Sampled every 60 seconds”.

On the same page, regarding memory, you can read that the graph uses “container/memory/bytes_used”, described as “Memory usage in bytes, broken down by type: evictable and non-evictable. Sampled every 60 seconds. memory_type: Either evictable or non-evictable. Evictable memory is memory that can be easily reclaimed by the kernel, while non-evictable memory cannot.” In this case it is using non-evictable.
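If you want to pull the raw series yourself, here is a rough sketch against the Monitoring API v3 (PROJECT_ID and the timestamps are placeholders, and the exact metric prefix depends on whether the legacy or the newer Stackdriver Kubernetes Engine monitoring is enabled on the cluster):

  curl -G "https://monitoring.googleapis.com/v3/projects/PROJECT_ID/timeSeries" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    --data-urlencode 'filter=metric.type="container.googleapis.com/container/cpu/usage_time"' \
    --data-urlencode 'interval.startTime=2019-09-13T00:00:00Z' \
    --data-urlencode 'interval.endTime=2019-09-13T01:00:00Z'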

Regarding your question about how much memory is allocatable, it depends on the node size you choose for the cluster; part of it is reserved for the system.

For example, I created a cluster with 1 vCPU and 4 GB of memory, and the allocatable memory is 2.77 GB.
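If it helps, the numbers in the question roughly line up with the memory reservations GKE documents (25% of the first 4 GiB, 20% of the next 4 GiB, 10% of the next 8 GiB, plus a ~100 MiB eviction threshold). A back-of-the-envelope check with bc:

  # node capacity from the question: 15399364 Ki ~ 14.69 GiB
  echo "scale=3; 14.69 - (0.25*4 + 0.20*4 + 0.10*(14.69-8) + 0.1)" | bc
  # ~ 12.1 GiB allocatable, i.e. roughly the 13 GB (decimal) shown in the dashboards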

-- Alfredo F.
Source: StackOverflow