running v1.10 and i notice that kube-controller-manager
s memory usage spikes and the OOMs all the time. it wouldn't be so bad if the system didn't fall to a crawl before this happens tho.
i tried modifying /etc/kubernetes/manifests/kube-controller-manager.yaml
to have a resource.limits.memory=1Gi
but the kube-controller-manager pod never seems to want to come back up.
any other options?
First of all, you missed information about the amount of memory you use per node.
Second, what do you mean by "system didn't fall to a crawl" - do you mean nodes are swapping?
All Kubernetes masters and nodes are expected to have swap disabled - it's recommended by the Kubernetes community, as mentioned in the Kubernetes documentation.
Support for swap is non-trivial and degrades performance.
Turn off swap on every node by:
sudo swapoff -a
Finally,
resource.limits.memory=1Gi
is default value per pod. These limits are hard limits. Pod reaching this level of allocated memory can cause OOM, even if you have gigabytes of unallocated memory.
There is a bug in kube-controller-manager, and it's fixed in https://github.com/kubernetes/kubernetes/pull/65339