Is a StatefulSet overkill for a single replica?

12/22/2019

I need to run my application with "at most once" semantics. It is absolutely crucial that only one instance of my app is running at any given time, or none at all.

At first I was using the resource type "Deployment" with a single replica, but then I realized that during a network partition we might inadvertently be running more than one instance.

I stumbled upon "StatefulSets" while searching for at-most-once semantics in Kubernetes. On reading further, the examples dealt with cases where the containers needed a persistent volume, and typically these containers were running with more than one replica. My application is not even using any volumes.

I also read about tolerations to kill the pod if the node is unreachable. Given that tolerations can handle the pod-unreachable case, is a StatefulSet overkill for my use case?
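
For reference, the toleration I mean is the one for the node.kubernetes.io/unreachable taint with a tolerationSeconds bound, roughly like this when expressed with client-go types (the 30-second value is just an example I picked):

```go
package main

import corev1 "k8s.io/api/core/v1"

// unreachableToleration lets the pod tolerate the "unreachable" taint for only
// 30 seconds before eviction, instead of the default 300 seconds that
// Kubernetes adds automatically.
func unreachableToleration() corev1.Toleration {
	seconds := int64(30)
	return corev1.Toleration{
		Key:               "node.kubernetes.io/unreachable",
		Operator:          corev1.TolerationOpExists,
		Effect:            corev1.TaintEffectNoExecute,
		TolerationSeconds: &seconds,
	}
}
```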

I am justifying the use of a StatefulSet because, in the window between the node becoming unreachable (and the toleration seconds being reached) and the kubelet realizing it is cut off from the network and killing its processes, Kubernetes can spin up another instance. And I believe a StatefulSet prevents this corner case too.

Am I right? Is there any other approach to achieve this?

-- Vinodhini Chockalingam
kubernetes
kubernetes-statefulset

3 Answers

12/23/2019

To quote a Kubernetes doc:

...StatefulSets maintain a sticky, stable identity for their Pods...Guaranteeing an identity for each Pod helps avoid split-brain side effects in the case when a node becomes unreachable (network partition).

As described in the same doc, when a node becomes unreachable, StatefulSet Pods on that node are marked as "Unknown" and aren't rescheduled unless forcefully deleted. Something to consider for proper recovery, if going this route.
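
Recovery in that state means force deleting the Pod so the StatefulSet controller can recreate it on another node. A rough sketch with client-go (the namespace and pod name are placeholders; kubectl delete pod <pod> --grace-period=0 --force is the CLI equivalent):

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Grace period 0 removes the Pod object immediately, without waiting
	// for the (unreachable) kubelet to confirm the containers have stopped.
	zero := int64(0)
	err = client.CoreV1().Pods("default").Delete(context.TODO(),
		"web-0", metav1.DeleteOptions{GracePeriodSeconds: &zero})
	if err != nil {
		panic(err)
	}
}
```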

So, yes - StatefulSet may be more suitable for the given use case than Deployment.

In my opinion, it won't be overkill to use a StatefulSet - choose the Kubernetes object that works best for your use case.

-- apisim
Source: StackOverflow

12/22/2019

It is absolutely crucial that only one instance of my app is running at any given time.

Use a leader election pattern to guarantee at most one active replica. If you use more than one replica together with leader election, the other replicas are standbys that can take over, e.g. in network partition situations. This is how the components in the Kubernetes control plane solve this problem when only one active instance is needed.

Leader election algorithms in Kubernetes usually work by taking a lock (e.g. in etcd) with a timeout. Only the instance that holds the lock is active. When the lock times out, either the current leader extends the lock's timeout or a new leader is elected. How it works depends on the implementation, but there is a guarantee that there is at most one leader - the active instance.

See e.g. Simple Leader Election with Kubernetes, which also describes how to solve this with a sidecar container.
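
A minimal sketch of this pattern using client-go's leaderelection package with a Lease lock - the lease name, namespace, timing values, and the runApp function are placeholders:

```go
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// runApp is a stand-in for the real workload; it must stop when ctx is cancelled.
func runApp(ctx context.Context) {
	<-ctx.Done()
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	id, _ := os.Hostname() // the pod name is a convenient unique identity

	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "my-app-leader", Namespace: "default"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		ReleaseOnCancel: true,
		LeaseDuration:   15 * time.Second, // non-leaders wait this long before taking over
		RenewDeadline:   10 * time.Second, // the leader steps down if it cannot renew in time
		RetryPeriod:     2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				runApp(ctx) // only the lease holder does real work
			},
			OnStoppedLeading: func() {
				os.Exit(0) // lost the lease: stop immediately so we never run twice
			},
		},
	})
}
```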

If your application is stateless, you should use a Deployment and not a StatefulSet. It can appear that a StatefulSet is a way to get at most one instance during a network partition, but that is mostly intended for stateful, replicated applications like a cache or database cluster, even though it may solve your specific situation as well.

-- Jonas
Source: StackOverflow

12/23/2019
  • StatefulSets are not the recourse for at-most-once semantics - they are typically used for deploying stateful applications like databases, which use the persistent identity of their pods to cluster among themselves

  • We have faced issues similar to what you mentioned - we had implicitly assumed that an old pod would be fully deleted before the new instance was brought up

  • One option is to use the combination of preStop hooks + init containers (see the sketch below)

  • The preStop hook will do the necessary cleanup (say, delete an app-specific etcd key)

  • The init container can wait until the etcd key disappears (with an upper bound on the wait).
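
A rough sketch of that combination with client-go types - the etcd key /my-app/instance, the images, and the shell commands stand in for your app-specific wait and cleanup logic:

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// singletonPod sketches the preStop + init-container idea: the init container
// blocks until the etcd key left by a previous instance disappears, and the
// preStop hook deletes that key on shutdown.
func singletonPod() corev1.Pod {
	return corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "my-app"},
		Spec: corev1.PodSpec{
			InitContainers: []corev1.Container{{
				Name:  "wait-for-previous-instance",
				Image: "bitnami/etcd", // any image that ships etcdctl
				Command: []string{"sh", "-c",
					// Poll for up to ~5 minutes; fail the pod if the key never clears.
					"for i in $(seq 1 60); do " +
						"etcdctl get /my-app/instance --print-value-only | grep -q . || exit 0; " +
						"sleep 5; done; exit 1"},
			}},
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "my-app:latest",
				Lifecycle: &corev1.Lifecycle{
					PreStop: &corev1.LifecycleHandler{
						Exec: &corev1.ExecAction{
							Command: []string{"sh", "-c", "etcdctl del /my-app/instance"},
						},
					},
				},
			}},
		},
	}
}
```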

References:

https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

One alternative is to try anti-affinity settings, but I am not very sure about this one.

-- pr-pal
Source: StackOverflow