etcd DB cluster on kubernetes misbehaving

7/21/2020

In my project we have etcd DB deployed on Kubernetes (this etcd is for application use, separate from the Kubernetes etcd) on on-prem. So I deployed it using the bitnami helm chart as a statefulset. Initially, at the time of deployment, the number of replicas was 1 as we wanted a single instance of etcd DB earlier.

The real problem started when we scaled it up to 3. I updated configuration to scale it up by updating the ETCD_INITIAL_CLUSTER with two new members DNS name:

etcd-0=http://etcd-0.etcd-headless.wallet.svc.cluster.local:2380,etcd-1=http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380,etcd-2=http://etcd-2.etcd-headless.wallet.svc.cluster.local:2380

Now when I go inside any of etcd pod and run etcdctl member list I only get a list of member and none of them is selected as leader, which is wrong. One among three should be the leader.

Also after running for some time these pods start giving heartbeat exceeds error and server overload error:

W |  etcdserver: failed to send out heartbeat on time (exceeded the 950ms timeout for 593.648512ms, to a9b7b8c4e027337a
W | etcdserver: server is likely overloaded
W | wal: sync duration of 2.575790761s, expected less than 1s

I changed the heartbeat default value accordingly, the number of errors decreased but still, I get a few heartbeat exceed errors along with others.

Not sure what is the problem here, is it the i/o that's causing the problem? If yes I am not sure how to be sure.

Will really appreciate any help on this.

-- Mr Kashyap
database
etcd
kubernetes
kubernetes-helm
kubernetes-pod

2 Answers

7/30/2020

Okay, I had two problems:

  • "failed to send out heartbeat" Warning messages.

  • "No leader election".

Next day i found out the reason of second problem, actually i had startup parameter set in the pod definition. ETCDCTL_API: 3

so when i run "etcdctl member list" with APIv3 it doesn't mention which member is selected as reader.

$ ETCDCTL_API=3 etcdctl member list
    
    3d0bc1a46f81ecd9, started, etcd-2, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2379, false
    b6a5d762d566708b, started, etcd-1, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2379, false


$ ETCDCTL_API=2 etcdctl member list
    
    3d0bc1a46f81ecd9, started, etcd-2, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2379, false
    b6a5d762d566708b, started, etcd-1, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2379, true

So when i use APIv2 i can see which node is elected as leader and there were no problem with leader election. Still working on heartbeat warning but i guess i need to tune the config in order to avoied that.

NB: I have 3 nodes, stopped one for testing.

-- Mr Kashyap
Source: StackOverflow

7/21/2020

I don't think 🤔 the heartbeats are the main problem, it also seems 👀 the logs that you are seeing are Warning logs. So it's possible that some heartbeats are missed here and there but your nodes are node(s) are not crashing or mirroring.

It's likely that you changed the replica numbers and your new replicas are not joining the cluster. So, I would recommend following this guide for you to add the new members to the cluster. Basically with etcdctl something like this:

etcdctl member add node2 --peer-urls=http://node1:2380
etcdctl member add node3 --peer-urls=http://node1:2380,http://node2:2380

Note that you will have to run these commands in a pod that has access to all your etcd nodes in your cluster.

You could also consider managing your etcd cluster with the etcd operator 🔧 which should be able to take care of the scaling and removal/addition of nodes.

✌️

-- Rico
Source: StackOverflow