Recovering from Kubernetes node failure running Cassandra

2/8/2018

I'm looking for a good way to replace a dead Kubernetes worker node that was running a Cassandra pod.

Scenario:

  • Cassandra cluster built from 3 pods
  • Failure occurs on one of the Kubernetes worker nodes
  • A replacement worker node joins the Kubernetes cluster
  • A new pod from the StatefulSet is scheduled on the new node
  • As the pod's IP address has changed, the new pod is visible as a new Cassandra node (4 nodes in the cluster in total) and is unable to bootstrap until the dead one is removed (see the sketch after this list).
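
Roughly, the manual recovery at this point boils down to removing the dead node by hand; a minimal sketch, assuming the StatefulSet is named cassandra and cassandra-1 is a surviving pod (the names are illustrative):

    # Find the host ID of the dead node (it shows up as DN in the status output).
    kubectl exec cassandra-1 -- nodetool status

    # Remove the dead node by its host ID so the replacement pod can bootstrap.
    kubectl exec cassandra-1 -- nodetool removenode <host-id-of-dead-node>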

It's very difficult to follow the official node-replacement procedure, as Cassandra is running as a StatefulSet.

One completely hacky workaround I've found is to use a ConfigMap to supply JAVA_OPTS. As changing a ConfigMap doesn't recreate pods (yet), you can manipulate the running pods in a way that lets you follow the procedure.
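
For context, the shape of that hack might look something like the sketch below; the ConfigMap name (cassandra-env), the JAVA_OPTS key, and the assumption that the container entrypoint picks JAVA_OPTS up from that ConfigMap are all illustrative, not tied to any particular chart or manifest:

    # Hypothetical sketch: supply Cassandra's replace_address flag through the
    # ConfigMap the StatefulSet is assumed to read JAVA_OPTS from (cassandra-env).
    kubectl create configmap cassandra-env \
      --from-literal=JAVA_OPTS="-Dcassandra.replace_address_first_boot=<dead-node-ip>" \
      --dry-run -o yaml | kubectl apply -f -

    # Delete the stuck replacement pod so it restarts with the new JAVA_OPTS and
    # bootstraps in place of the dead node; revert the ConfigMap once it has joined.
    kubectl delete pod cassandra-2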

However, as I mentioned, that's super hacky. I'm wondering if anyone running Cassandra on top of Kubernetes has a better idea of how to deal with such a failure?

-- Kamil Szczygieł
cassandra
kubernetes

2 Answers

2/9/2018

Jetstack Navigator supports this, but it's currently in alpha:

https://github.com/jetstack/navigator

-- jaxxstorm
Source: StackOverflow

2/9/2018

"unable to bootstrap until the dead one is removed." Why is that? I use a StatefulSet and I'm able to kill a pod and have a new one join in.

-- VinceMD
Source: StackOverflow