Kubernetes etcd not coming up

8/14/2018

I have a cluster set up with 1 master and 2 nodes. My goal is to add another master (let's call it master2) for HA.

I am following the guide at https://kubernetes.io/docs/setup/independent/high-availability/ and ran this command on master2:

  kubectl exec -n kube-system etcd-${CP0_HOSTNAME} -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://${CP0_IP}:2379 member add ${CP1_HOSTNAME} https://${CP1_IP}:2380

Output on master2:

Added member named kashif-test-master-2043226 with ID b5885b66a1abce99 to cluster

Master 1 now hangs:

kubectl get pods -n kube-system
No resources found.
Error from server: grpc: the client connection is closing
(The cluster is broken because etcd can't come up, and therefore neither can the API server.)

The etcd logs on master 1 show that master2 was successfully added to the etcd cluster. However, etcd on master 1 then hangs: it keeps trying to connect to etcd on master2, and leader election keeps failing. The only errors I see are:

2018-08-15 00:17:33.140527 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:17:33.150406 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
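
For context, this stall is consistent with how Raft majorities work: once the `member add` is committed, the cluster has two registered members, so electing a leader needs two votes, and master 1 can only supply its own while nothing is listening on port 2380 on master2. Below is a minimal Python sketch of that arithmetic. The function names are illustrative and are not part of etcd.

```python
def quorum(members: int) -> int:
    """Votes needed to elect a leader in a Raft cluster of this size."""
    return members // 2 + 1


def can_elect_leader(members: int, reachable: int) -> bool:
    """True if the reachable members (including self) form a majority."""
    return reachable >= quorum(members)


# (registered members, members that are actually up and reachable)
for members, reachable in [(1, 1), (2, 1), (2, 2), (3, 2)]:
    print(
        f"members={members} reachable={reachable} "
        f"quorum={quorum(members)} leader_possible={can_elect_leader(members, reachable)}"
    )
```

The `(2, 1)` case is the situation in the logs: two registered members, one of them refusing connections, so no election can succeed until etcd on the second member starts or the member is removed.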

Has anyone experienced this, and how can I fix it?

Complete logs for the Docker container running etcd --advertise

2018-08-15 00:16:22.664296 I | etcdmain: etcd Version: 3.2.18
2018-08-15 00:16:22.664384 I | etcdmain: Git SHA: eddf599c6
2018-08-15 00:16:22.664387 I | etcdmain: Go Version: go1.8.7
2018-08-15 00:16:22.664398 I | etcdmain: Go OS/Arch: linux/amd64
2018-08-15 00:16:22.664401 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2018-08-15 00:16:22.664445 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2018-08-15 00:16:22.664464 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true
2018-08-15 00:16:22.665053 I | embed: listening for peers on https://10.148.217.160:2380
2018-08-15 00:16:22.665104 I | embed: listening for client requests on 10.148.217.160:2379
2018-08-15 00:16:22.665123 I | embed: listening for client requests on 127.0.0.1:2379
2018-08-15 00:16:22.667636 I | etcdserver: recovered store from snapshot at index 20002
2018-08-15 00:16:22.812523 I | mvcc: restore compact to 18857
2018-08-15 00:16:22.822923 I | etcdserver: name = kashif-test-master-3-2056620
2018-08-15 00:16:22.822943 I | etcdserver: data dir = /var/lib/etcd
2018-08-15 00:16:22.822949 I | etcdserver: member dir = /var/lib/etcd/member
2018-08-15 00:16:22.822952 I | etcdserver: heartbeat = 100ms
2018-08-15 00:16:22.822956 I | etcdserver: election = 1000ms
2018-08-15 00:16:22.822959 I | etcdserver: snapshot count = 10000
2018-08-15 00:16:22.822970 I | etcdserver: advertise client URLs = https://10.148.217.160:2379
2018-08-15 00:16:22.917535 I | etcdserver: restarting member e24703670a9a48b7 in cluster f71ca33e79557019 at commit index 22743
2018-08-15 00:16:22.917715 I | raft: e24703670a9a48b7 became follower at term 250
2018-08-15 00:16:22.917729 I | raft: newRaft e24703670a9a48b7 [peers: [e24703670a9a48b7], term: 250, commit: 22743, applied: 20002, lastindex: 22746, lastterm: 2]
2018-08-15 00:16:22.917820 I | etcdserver/api: enabled capabilities for version 3.2
2018-08-15 00:16:22.917831 I | etcdserver/membership: added member e24703670a9a48b7 [https://10.148.217.160:2380] to cluster f71ca33e79557019 from store
2018-08-15 00:16:22.917836 I | etcdserver/membership: set the cluster version to 3.2 from store
2018-08-15 00:16:23.063893 I | mvcc: restore compact to 18857
2018-08-15 00:16:23.072671 W | auth: simple token is not cryptographically signed
2018-08-15 00:16:23.137524 I | etcdserver: starting server... [version: 3.2.18, cluster version: 3.2]
2018-08-15 00:16:23.137694 I | etcdserver: e24703670a9a48b7 as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-08-15 00:16:23.138541 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true
2018-08-15 00:16:23.144881 I | etcdserver/membership: added member b5885b66a1abce99 [https://10.148.179.221:2380] to cluster f71ca33e79557019
2018-08-15 00:16:23.144904 I | rafthttp: starting peer b5885b66a1abce99...
2018-08-15 00:16:23.144931 I | rafthttp: started HTTP pipelining with peer b5885b66a1abce99
2018-08-15 00:16:23.146017 I | rafthttp: started streaming with peer b5885b66a1abce99 (writer)
2018-08-15 00:16:23.146791 I | rafthttp: started streaming with peer b5885b66a1abce99 (writer)
2018-08-15 00:16:23.147130 I | rafthttp: started peer b5885b66a1abce99
2018-08-15 00:16:23.147148 I | rafthttp: added peer b5885b66a1abce99
2018-08-15 00:16:23.147165 I | rafthttp: started streaming with peer b5885b66a1abce99 (stream MsgApp v2 reader)
2018-08-15 00:16:23.147183 I | rafthttp: started streaming with peer b5885b66a1abce99 (stream Message reader)
2018-08-15 00:16:23.818163 I | raft: e24703670a9a48b7 is starting a new election at term 250
2018-08-15 00:16:23.818218 I | raft: e24703670a9a48b7 became candidate at term 251
2018-08-15 00:16:23.818232 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 251
2018-08-15 00:16:23.818240 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 251
2018-08-15 00:16:25.218218 I | raft: e24703670a9a48b7 is starting a new election at term 251
2018-08-15 00:16:25.218253 I | raft: e24703670a9a48b7 became candidate at term 252
2018-08-15 00:16:25.218264 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 252
2018-08-15 00:16:25.218274 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 252
2018-08-15 00:16:26.318208 I | raft: e24703670a9a48b7 is starting a new election at term 252
2018-08-15 00:16:26.318259 I | raft: e24703670a9a48b7 became candidate at term 253
2018-08-15 00:16:26.318269 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 253
2018-08-15 00:16:26.318279 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 253
2018-08-15 00:16:27.718234 I | raft: e24703670a9a48b7 is starting a new election at term 253
2018-08-15 00:16:27.718285 I | raft: e24703670a9a48b7 became candidate at term 254
2018-08-15 00:16:27.718318 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 254
2018-08-15 00:16:27.718326 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 254
2018-08-15 00:16:28.147299 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:28.918081 I | raft: e24703670a9a48b7 is starting a new election at term 254
2018-08-15 00:16:28.918117 I | raft: e24703670a9a48b7 became candidate at term 255
2018-08-15 00:16:28.918129 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 255
2018-08-15 00:16:28.918140 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 255
2018-08-15 00:16:29.918192 I | raft: e24703670a9a48b7 is starting a new election at term 255
2018-08-15 00:16:29.918227 I | raft: e24703670a9a48b7 became candidate at term 256
2018-08-15 00:16:29.918237 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 256
2018-08-15 00:16:29.918253 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 256
2018-08-15 00:16:30.137940 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:16:31.618238 I | raft: e24703670a9a48b7 is starting a new election at term 256
2018-08-15 00:16:31.618268 I | raft: e24703670a9a48b7 became candidate at term 257
2018-08-15 00:16:31.618277 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 257
2018-08-15 00:16:31.618294 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 257
2018-08-15 00:16:32.618266 I | raft: e24703670a9a48b7 is starting a new election at term 257
2018-08-15 00:16:32.618339 I | raft: e24703670a9a48b7 became candidate at term 258
2018-08-15 00:16:32.618365 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 258
2018-08-15 00:16:32.618374 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 258
2018-08-15 00:16:33.147791 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:34.018129 I | raft: e24703670a9a48b7 is starting a new election at term 258
2018-08-15 00:16:34.018165 I | raft: e24703670a9a48b7 became candidate at term 259
2018-08-15 00:16:34.018174 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 259
2018-08-15 00:16:34.018183 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 259
2018-08-15 00:16:35.018201 I | raft: e24703670a9a48b7 is starting a new election at term 259
2018-08-15 00:16:35.018251 I | raft: e24703670a9a48b7 became candidate at term 260
2018-08-15 00:16:35.018267 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 260
2018-08-15 00:16:35.018280 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 260
2018-08-15 00:16:36.518353 I | raft: e24703670a9a48b7 is starting a new election at term 260
2018-08-15 00:16:36.518413 I | raft: e24703670a9a48b7 became candidate at term 261
2018-08-15 00:16:36.518433 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 261
2018-08-15 00:16:36.518450 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 261
2018-08-15 00:16:37.138147 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:16:37.718086 I | raft: e24703670a9a48b7 is starting a new election at term 261
2018-08-15 00:16:37.718135 I | raft: e24703670a9a48b7 became candidate at term 262
2018-08-15 00:16:37.718163 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 262
2018-08-15 00:16:37.718171 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 262
2018-08-15 00:16:38.148059 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:39.118282 I | raft: e24703670a9a48b7 is starting a new election at term 262
2018-08-15 00:16:39.118352 I | raft: e24703670a9a48b7 became candidate at term 263
2018-08-15 00:16:39.118363 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 263
2018-08-15 00:16:39.118372 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 263
2018-08-15 00:16:41.018051 I | raft: e24703670a9a48b7 is starting a new election at term 263
2018-08-15 00:16:41.018110 I | raft: e24703670a9a48b7 became candidate at term 264
2018-08-15 00:16:41.018137 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 264
2018-08-15 00:16:41.018170 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 264
2018-08-15 00:16:42.518067 I | raft: e24703670a9a48b7 is starting a new election at term 264
2018-08-15 00:16:42.518096 I | raft: e24703670a9a48b7 became candidate at term 265
2018-08-15 00:16:42.518117 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 265
2018-08-15 00:16:42.518125 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 265
2018-08-15 00:16:43.148209 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:43.518078 I | raft: e24703670a9a48b7 is starting a new election at term 265
2018-08-15 00:16:43.518103 I | raft: e24703670a9a48b7 became candidate at term 266
2018-08-15 00:16:43.518123 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 266
2018-08-15 00:16:43.518131 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 266
2018-08-15 00:16:44.138472 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:16:45.418193 I | raft: e24703670a9a48b7 is starting a new election at term 266
2018-08-15 00:16:45.418255 I | raft: e24703670a9a48b7 became candidate at term 267
2018-08-15 00:16:45.418270 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 267
2018-08-15 00:16:45.418278 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 267
2018-08-15 00:16:46.718129 I | raft: e24703670a9a48b7 is starting a new election at term 267
2018-08-15 00:16:46.718267 I | raft: e24703670a9a48b7 became candidate at term 268
2018-08-15 00:16:46.718289 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 268
2018-08-15 00:16:46.718309 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 268
2018-08-15 00:16:47.918096 I | raft: e24703670a9a48b7 is starting a new election at term 268
2018-08-15 00:16:47.918145 I | raft: e24703670a9a48b7 became candidate at term 269
2018-08-15 00:16:47.918167 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 269
2018-08-15 00:16:47.918185 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 269
2018-08-15 00:16:48.148337 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:49.118189 I | raft: e24703670a9a48b7 is starting a new election at term 269
2018-08-15 00:16:49.118224 I | raft: e24703670a9a48b7 became candidate at term 270
2018-08-15 00:16:49.118236 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 270
2018-08-15 00:16:49.118244 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 270
2018-08-15 00:16:50.318136 I | raft: e24703670a9a48b7 is starting a new election at term 270
2018-08-15 00:16:50.318183 I | raft: e24703670a9a48b7 became candidate at term 271
2018-08-15 00:16:50.318193 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 271
2018-08-15 00:16:50.318201 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 271
2018-08-15 00:16:51.138655 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:16:52.018257 I | raft: e24703670a9a48b7 is starting a new election at term 271
2018-08-15 00:16:52.018331 I | raft: e24703670a9a48b7 became candidate at term 272
2018-08-15 00:16:52.018352 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 272
2018-08-15 00:16:52.018370 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 272
2018-08-15 00:16:53.148561 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:53.418204 I | raft: e24703670a9a48b7 is starting a new election at term 272
2018-08-15 00:16:53.418266 I | raft: e24703670a9a48b7 became candidate at term 273
2018-08-15 00:16:53.418288 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 273
2018-08-15 00:16:53.418306 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 273
2018-08-15 00:16:55.318191 I | raft: e24703670a9a48b7 is starting a new election at term 273
2018-08-15 00:16:55.318231 I | raft: e24703670a9a48b7 became candidate at term 274
2018-08-15 00:16:55.318241 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 274
2018-08-15 00:16:55.318250 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 274
2018-08-15 00:16:57.218158 I | raft: e24703670a9a48b7 is starting a new election at term 274
2018-08-15 00:16:57.218186 I | raft: e24703670a9a48b7 became candidate at term 275
2018-08-15 00:16:57.218196 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 275
2018-08-15 00:16:57.218203 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 275
2018-08-15 00:16:58.138912 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:16:58.148709 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:16:58.718319 I | raft: e24703670a9a48b7 is starting a new election at term 275
2018-08-15 00:16:58.718395 I | raft: e24703670a9a48b7 became candidate at term 276
2018-08-15 00:16:58.718423 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 276
2018-08-15 00:16:58.718448 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 276
2018-08-15 00:16:59.718241 I | raft: e24703670a9a48b7 is starting a new election at term 276
2018-08-15 00:16:59.718298 I | raft: e24703670a9a48b7 became candidate at term 277
2018-08-15 00:16:59.718319 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 277
2018-08-15 00:16:59.718337 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 277
2018-08-15 00:17:00.918226 I | raft: e24703670a9a48b7 is starting a new election at term 277
2018-08-15 00:17:00.918294 I | raft: e24703670a9a48b7 became candidate at term 278
2018-08-15 00:17:00.918313 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 278
2018-08-15 00:17:00.918331 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 278
2018-08-15 00:17:02.018067 I | raft: e24703670a9a48b7 is starting a new election at term 278
2018-08-15 00:17:02.018098 I | raft: e24703670a9a48b7 became candidate at term 279
2018-08-15 00:17:02.018107 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 279
2018-08-15 00:17:02.018115 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 279
2018-08-15 00:17:03.148918 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:17:03.418161 I | raft: e24703670a9a48b7 is starting a new election at term 279
2018-08-15 00:17:03.418194 I | raft: e24703670a9a48b7 became candidate at term 280
2018-08-15 00:17:03.418205 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 280
2018-08-15 00:17:03.418213 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 280
2018-08-15 00:17:05.018117 I | raft: e24703670a9a48b7 is starting a new election at term 280
2018-08-15 00:17:05.018173 I | raft: e24703670a9a48b7 became candidate at term 281
2018-08-15 00:17:05.018190 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 281
2018-08-15 00:17:05.018200 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 281
2018-08-15 00:17:05.139346 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:17:06.018223 I | raft: e24703670a9a48b7 is starting a new election at term 281
2018-08-15 00:17:06.018278 I | raft: e24703670a9a48b7 became candidate at term 282
2018-08-15 00:17:06.018297 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 282
2018-08-15 00:17:06.018315 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 282
2018-08-15 00:17:07.018834 I | raft: e24703670a9a48b7 is starting a new election at term 282
2018-08-15 00:17:07.018880 I | raft: e24703670a9a48b7 became candidate at term 283
2018-08-15 00:17:07.018891 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 283
2018-08-15 00:17:07.018902 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 283
2018-08-15 00:17:08.149114 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:17:08.818304 I | raft: e24703670a9a48b7 is starting a new election at term 283
2018-08-15 00:17:08.818340 I | raft: e24703670a9a48b7 became candidate at term 284
2018-08-15 00:17:08.818354 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 284
2018-08-15 00:17:08.818363 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 284
2018-08-15 00:17:10.118270 I | raft: e24703670a9a48b7 is starting a new election at term 284
2018-08-15 00:17:10.118304 I | raft: e24703670a9a48b7 became candidate at term 285
2018-08-15 00:17:10.118317 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 285
2018-08-15 00:17:10.118326 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 285
2018-08-15 00:17:11.818242 I | raft: e24703670a9a48b7 is starting a new election at term 285
2018-08-15 00:17:11.818297 I | raft: e24703670a9a48b7 became candidate at term 286
2018-08-15 00:17:11.818320 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 286
2018-08-15 00:17:11.818338 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 286
2018-08-15 00:17:12.139556 E | etcdserver: publish error: etcdserver: request timed out
2018-08-15 00:17:12.818177 I | raft: e24703670a9a48b7 is starting a new election at term 286
2018-08-15 00:17:12.818213 I | raft: e24703670a9a48b7 became candidate at term 287
2018-08-15 00:17:12.818224 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 287
2018-08-15 00:17:12.818232 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 287
2018-08-15 00:17:13.149332 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:17:14.618222 I | raft: e24703670a9a48b7 is starting a new election at term 287
2018-08-15 00:17:14.618274 I | raft: e24703670a9a48b7 became candidate at term 288
2018-08-15 00:17:14.618293 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 288
2018-08-15 00:17:14.618311 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 288
2018-08-15 00:17:16.018158 I | raft: e24703670a9a48b7 is starting a new election at term 288
2018-08-15 00:17:16.018189 I | raft: e24703670a9a48b7 became candidate at term 289
2018-08-15 00:17:16.018202 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 289
2018-08-15 00:17:16.018210 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 289
2018-08-15 00:17:17.218373 I | raft: e24703670a9a48b7 is starting a new election at term 289
2018-08-15 00:17:17.218437 I | raft: e24703670a9a48b7 became candidate at term 290
2018-08-15 00:17:17.218457 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 290
2018-08-15 00:17:17.218475 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 290
2018-08-15 00:17:18.149545 W | rafthttp: health check for peer b5885b66a1abce99 could not connect: dial tcp 10.148.179.221:2380: getsockopt: connection refused
2018-08-15 00:17:19.018156 I | raft: e24703670a9a48b7 is starting a new election at term 290
2018-08-15 00:17:19.018184 I | raft: e24703670a9a48b7 became candidate at term 291
2018-08-15 00:17:19.018194 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 291
2018-08-15 00:17:19.018207 I | raft: e24703670a9a48b7 [logterm: 2, index: 22746] sent MsgVote request to b5885b66a1abce99 at term 291
2018-08-15 00:17:19.139879 E | etcdserver: publish error: etcdserver: request timed out
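
The log above repeats the same election pattern many times, so a short script can make it easier to read. This is a sketch that assumes the etcd 3.2 log line format shown above. The function name and the returned dictionary keys are made up for illustration.

```python
import re

# Patterns taken from the raft log lines shown above.
CANDIDATE = re.compile(r"raft: (\w+) became candidate at term (\d+)")
VOTE_RESP = re.compile(r"raft: (\w+) received MsgVoteResp from (\w+) at term (\d+)")


def summarize_elections(log_text: str) -> dict:
    """Count election attempts and collect the IDs that ever answered a vote."""
    terms = []
    voters = set()
    for line in log_text.splitlines():
        match = CANDIDATE.search(line)
        if match:
            terms.append(int(match.group(2)))
            continue
        match = VOTE_RESP.search(line)
        if match:
            voters.add(match.group(2))  # ID of the member that responded
    return {
        "elections": len(terms),
        "first_term": min(terms) if terms else None,
        "last_term": max(terms) if terms else None,
        "voters": sorted(voters),
    }


# Four lines copied from the log above; pass the full log text in practice.
sample = """\
2018-08-15 00:16:23.818218 I | raft: e24703670a9a48b7 became candidate at term 251
2018-08-15 00:16:23.818232 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 251
2018-08-15 00:16:25.218253 I | raft: e24703670a9a48b7 became candidate at term 252
2018-08-15 00:16:25.218264 I | raft: e24703670a9a48b7 received MsgVoteResp from e24703670a9a48b7 at term 252
"""
print(summarize_elections(sample))
```

In the log above, the only ID that ever appears after "received MsgVoteResp from" is e24703670a9a48b7, which is master 1 itself. The new member b5885b66a1abce99 never answers a vote request.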

Also, before running the exec command above on master2, I noticed that the etcd pod on master 1 never reaches the Running state:

kubectl get pods -n kube-system
NAME                                          READY     STATUS    RESTARTS   AGE
coredns-78fcdf6894-47jmm                      1/1       Running   0          50m
coredns-78fcdf6894-pzl2p                      1/1       Running   0          50m
etcd-kashif-test-master-3-2056620             0/1       Pending   0          1s
kube-proxy-7htzb                              1/1       Running   0          50m
kube-scheduler-kashif-test-master-3-2056620   0/1       Pending   0          1s
weave-net-6ngsx                               2/2       Running   0          47m

Note that the etcd pod is in the Pending state and never runs. So the problem either exists before I run the exec command or starts after it.

-- kk1957
etcd
kubernetes
master

1 Answer

8/15/2018

In my case the config was incorrect: both masters were being assigned the same name under the nodeRegistration section of the config. This can fail in mysterious ways. I was also running on Docker 17.03, which is the maximum officially supported version, but I got it working with 17.12. However, I do not think the Docker version is why it started working. It was the config.
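
A quick pre-flight check can catch this kind of duplicate before joining the second master. The sketch below assumes the kubeadm config files have already been parsed into dictionaries. The node names `master-a` and `master-b` and the function name are hypothetical. Only the `nodeRegistration` / `name` path comes from the config described above.

```python
from collections import Counter


def duplicate_node_names(configs: list) -> list:
    """Return node names that appear in more than one config."""
    names = [cfg.get("nodeRegistration", {}).get("name") for cfg in configs]
    counts = Counter(name for name in names if name)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Hypothetical, already-parsed configs: one per master.
master1 = {"nodeRegistration": {"name": "master-a"}}
master2_bad = {"nodeRegistration": {"name": "master-a"}}  # same name: the mistake
master2_ok = {"nodeRegistration": {"name": "master-b"}}

print(duplicate_node_names([master1, master2_bad]))
print(duplicate_node_names([master1, master2_ok]))
```

A non-empty result means two masters would register under the same name, which is the misconfiguration described here.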

-- kk1957
Source: StackOverflow