StatefulSet breaking Kafka on worker reboot (unordered start)

8/7/2020

In a worker node reboot scenario (Kubernetes 1.14.3), does the start order of StatefulSet pods matter? I have a Confluent Kafka (5.5.1) cluster where broker 1 starts well before broker 0 and a little ahead of broker 2, and as a result I see repeated crashes on broker 0. Is there some mechanism here that breaks things? Startup is supposed to be ordinal and deletion reversed, but what happens when that order is broken? The container start times after the reboot are below, followed by a trimmed sketch of the spec:

  Started:      Sun, 02 Aug 2020 00:52:54 +0100 kafka-0
  Started:      Sun, 02 Aug 2020 00:50:25 +0100 kafka-1
  Started:      Sun, 02 Aug 2020 00:50:26 +0100 kafka-2
  Started:      Sun, 02 Aug 2020 00:28:53 +0100 zk-0
  Started:      Sun, 02 Aug 2020 00:50:29 +0100 zk-1
  Started:      Sun, 02 Aug 2020 00:50:19 +0100 zk-2
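
For reference, I have not set podManagementPolicy on either StatefulSet, so both should be on the default OrderedReady policy, which is what gives the ordinal start / reverse delete behaviour I described. A minimal sketch of what I believe the relevant part of the spec looks like (names and image are placeholders, not my exact manifest):

  # Trimmed, illustrative StatefulSet spec -- not the full manifest.
  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: kafka
  spec:
    serviceName: kafka
    replicas: 3
    # OrderedReady is the default: the controller creates pods one at a
    # time in ordinal order (kafka-0, kafka-1, kafka-2) and deletes them
    # in reverse. The alternative, Parallel, launches them all at once.
    podManagementPolicy: OrderedReady
    selector:
      matchLabels:
        app: kafka
    template:
      metadata:
        labels:
          app: kafka
      spec:
        containers:
          - name: kafka
            image: confluentinc/cp-kafka:5.5.1

My (possibly wrong) understanding is that OrderedReady is only enforced by the StatefulSet controller during initial deployment and scaling, so containers restarted by the kubelet after a node reboot may come back in any order -- is that what is happening here?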
-- anVzdGFub3RoZXJodW1hbg
apache-kafka
confluent-platform
kubernetes
kubernetes-statefulset

0 Answers