Why do the Kubernetes scheduler and controller-manager sometimes stop in a three-node etcd master cluster?

6/13/2017

I have built a Kubernetes (k8s) master cluster with three nodes, but there are two problems:

  1. The etcd log on every node reports two warnings: (1) "apply entries took too long [11.167451ms for 1 entries]" and (2) "failed to send out heartbeat on time". From searching Google, I gather the cause is probably a slow disk, but I can't resolve it.

  2. The API server, kube-scheduler, and controller-manager, which depend on etcd, sometimes fail to start or stop shortly after starting (the log probably reports an etcd server timeout).

Can you help me?

-- nolan4954
docker
etcd
kubernetes

1 Answer

6/14/2017

Several Kubernetes services, such as kube-apiserver and kube-controller-manager, depend on etcd (kube-apiserver talks to it directly, and the other components go through kube-apiserver). Slowness or failure in etcd can cause these services to slow down or even crash.

I'd recommend figuring out the reason for the 'etcd' slowness and fixing that first. Try using the 'etcdctl' tool to store and retrieve individual key-value pairs from etcd and see whether those operations are slow [1].
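
As a rough way to put numbers on that, here is a minimal Python sketch that times repeated 'etcdctl' writes and reads. It assumes 'etcdctl' is on the PATH, that the cluster speaks the v3 API (hence ETCDCTL_API=3), and that a member listens on http://127.0.0.1:2379; the key name "latency-probe" and the helper names are made up for illustration. Each timing includes the cost of starting the etcdctl process, so treat the results as a relative signal rather than exact etcd latency.

```python
import os
import subprocess
import time


def time_etcdctl(args, endpoints="http://127.0.0.1:2379", run=subprocess.run):
    """Run one etcdctl command and return its wall-clock duration in ms.

    `endpoints` is an assumed client URL; `run` can be swapped out for testing.
    """
    env = dict(os.environ, ETCDCTL_API="3")  # assumption: v3 API is in use
    cmd = ["etcdctl", "--endpoints", endpoints] + list(args)
    start = time.perf_counter()
    run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) * 1000.0


def probe(rounds=20, **kwargs):
    """Alternate put/get on a throwaway key and summarize the timings."""
    put_ms, get_ms = [], []
    for i in range(rounds):
        put_ms.append(time_etcdctl(["put", "latency-probe", str(i)], **kwargs))
        get_ms.append(time_etcdctl(["get", "latency-probe"], **kwargs))
    return {
        "put_avg_ms": sum(put_ms) / len(put_ms),
        "put_max_ms": max(put_ms),
        "get_avg_ms": sum(get_ms) / len(get_ms),
        "get_max_ms": max(get_ms),
    }
```

Calling probe() on each of the three nodes and comparing the results should show whether one member is consistently slower than the others. If puts are much slower than gets, that would be consistent with slow disk writes, since a put has to be persisted before it is acknowledged.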

Also, if 'etcd' is slowing down because of insufficient memory, try lowering the 'snapshot-count' parameter so that etcd takes snapshots more often and keeps fewer log entries in memory [2].
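
To tell a memory problem apart from the slow disk suspected in the question, it may help to measure how long small synchronous writes take on the volume that holds the etcd data. The following is only a sketch in Python: the function names are made up, it uses os.fsync for portability (etcd itself relies on fdatasync), and the write size and count are arbitrary choices.

```python
import math
import os
import tempfile
import time


def fsync_latencies_ms(directory, writes=200, size=2048):
    """Append small records to a temp file in `directory`, fsync after each,
    and return the per-write fsync latencies in milliseconds, sorted."""
    payload = b"\0" * size
    latencies = []
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(writes):
            f.write(payload)
            f.flush()  # push Python's buffer to the OS before syncing
            start = time.perf_counter()
            os.fsync(f.fileno())  # force the data to stable storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    return sorted(latencies)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]
```

For example, percentile(fsync_latencies_ms("/var/lib/etcd"), 99) gives the 99th percentile on the data volume, assuming /var/lib/etcd is where your etcd keeps its data. The etcd documentation suggests that the 99th percentile of WAL fsync duration should stay below roughly 10ms, so values well above that would point at the disk rather than memory.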

[1] https://coreos.com/etcd/docs/latest/getting-started-with-etcd.html

[2] https://coreos.com/etcd/docs/latest/tuning.html#snapshot-tuning

-- Shri Javadekar
Source: StackOverflow