Every article I've seen about running a Cassandra cluster on Kubernetes either omits the podManagementPolicy field or sets it to OrderedReady, which is effectively the same thing because OrderedReady is the default value.
I was wondering whether it is possible to use podManagementPolicy: Parallel to speed up synchronization when multiple nodes of the Cassandra cluster restart.
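For reference, this is roughly where the field in question sits in a StatefulSet manifest. It is only a minimal sketch: the names, image tag and replica count are placeholders I made up, and the volume claim templates are left out.

```yaml
# Sketch only: name, labels, image tag and replica count are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra          # assumes a headless Service with this name exists
  replicas: 3
  podManagementPolicy: Parallel   # OrderedReady is used when the field is omitted
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:4.1    # placeholder image tag
```

As far as I know, podManagementPolicy cannot be edited on an existing StatefulSet, so switching policies usually means deleting and recreating the object.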
Using podManagementPolicy: Parallel is still a safe operation. Because Cassandra forms a ring and does not have a master/slave architecture, starting the pods in parallel is valid. However, until the seed pods have started, the remaining replicas will crash-loop. It might take a few restarts, but they should come up once the seed pods are up and healthy.
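To make the seed dependency concrete, here is a sketch of how seeds are commonly wired up in the pod template. It assumes an image that reads the CASSANDRA_SEEDS environment variable (the official cassandra image does, to my knowledge), a headless Service named cassandra in the default namespace, and that pod 0 acts as the seed. None of those details come from the setup described above.

```yaml
# Fragment of the pod template's container list; all names are hypothetical.
containers:
  - name: cassandra
    image: cassandra:4.1
    env:
      # Stable DNS name of pod 0 behind the headless Service:
      # <pod>.<service>.<namespace>.svc.cluster.local
      - name: CASSANDRA_SEEDS
        value: "cassandra-0.cassandra.default.svc.cluster.local"
```

With Parallel, the non-seed pods may start before cassandra-0 is reachable, which would explain the restarts mentioned above.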
As far as I know, this is a bad idea. I tried it and the last node went into CrashLoopBackOff. The reason seems to be that a joining node crashes if it sees another node trying to join at the same time.
podManagementPolicy: OrderedReady should be the way to go.
Yes, that works fine. We use podManagementPolicy: Parallel in all of our StatefulSets, including the Cassandra cluster. This really helped us in the full-cluster restart scenario, where all the pods come up at the same time and sync.
Use case of podManagementPolicy: Parallel in our cluster:
We have a 3-node bare-metal K8s cluster with a 3-node Cassandra cluster on top of it, using each node's local-storage for the PVs. With local-storage, a PV is bound to a node. So with podManagementPolicy: OrderedReady the problem is this: if we bring down two nodes of the cluster, say the ones hosting cds-pod-1 and cds-pod-2, both pods go into the Unknown state. If we then bring up only the node where cds-pod-2 resides, that pod is not started, because cds-pod-1 must be in the Running state before cds-pod-2 can be brought to Running. Hence we had to switch to podManagementPolicy: Parallel, after which the pods can be brought up in any order without depending on each other.
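For anyone unfamiliar with why a local PV pins a pod to one machine, a local PersistentVolume looks roughly like the sketch below. The PV name, capacity, path and hostname are made-up placeholders, not values from our cluster.

```yaml
# Sketch of a node-bound local PersistentVolume; all concrete values are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cds-local-pv-1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/cassandra     # directory or disk on the node itself
  nodeAffinity:                    # this is what ties the volume to one node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1
```

Because the claim bound to this volume can only be satisfied on node-1, the pod using it cannot be rescheduled elsewhere while that node is down, which is why the ordering constraint of OrderedReady hurts in this setup.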