I am toying around with Kubernetes and have managed to deploy a statefull application (jenkins instance) to a single node. It uses a PVC to make sure that I can persist my jenkins data (jobs, plugins etc).
Now I would like to experiment with failover.
My cluster has 2 digital ocean droplets.
Currently my jenkins pod is running on just one node. When that goes down, Jenkins becomes unavailable.
I am now looking on how to accomplish failover in a sense that, when the jenkins pod goes down on my node, it will spin up on the other node. (so short downtime during this proces is ok).
Of course it has to use the same PVC, so that my data remains intact.
I believe, when reading, that a StatefulSet kan be used for this?
Any pointers are much appreciated!
Best regards
What you are describing is the default behaviour of Kubernetes for Pods that are managed by a controller, such as a Deployment.
You should deploy any application as a Deployment (or another controller) even if it consists just of a single Pod. You never really deploy Pods directly to Kubernetes. So, in this case, there's nothing special you need to do to get this behaviour.
When one of your nodes dies, the Pod dies too. This is detected by the Deployment controller, which creates a new Pod. This is in turn detected by the scheduler, which assigns the new Pod to a node. Since one of the nodes is down, it will assign the Pod to the other node that is still running. Once the Pod is assigned to this node, the kubelet of this node will run the container(s) of this Pod on this node.
Ok, let me try to anwser my own question here. I think Amit Kumar Gupta
came the closest to what I believe is going on here.
Since I am using a Deployment and my PVC in ReadWriteOnce
, I am basically stuck with one pod, running jenkins, on one node.
weibelds
answer made me realise that I was asking questions to about a concept that Kubernetes performs by default. If my pod goes down (in my case i am shutting down a node on purpose by doing a hard power down to simulate a failure), the cluster (controller?) will detect this and spawn a new pod on another node.
All is fine so far, but then I noticed that my new pod as stuck in ContainerCreating
state.
Running a describe
on my new pod (the one in ContainerCreating
state) showed this
Warning FailedAttachVolume 16m attachdetach-controller Multi-Attach error for volume "pvc-cb772fdb-492b-4ef5-a63e-4e483b8798fd" Volume is already used by pod(s) jenkins-deployment-6ddd796846-dgpnm
Warning FailedMount 70s (x7 over 14m) kubelet, cc-pool-bg6u Unable to mount volumes for pod "jenkins-deployment-6ddd796846-wjbkl_default(93747d74-b208-421c-afa4-8d467e717649)": timeout expired waiting for volumes to attach or mount for pod "default"/"jenkins-deployment-6ddd796846-wjbkl". list of unmounted volumes=[jenkins-home]. list of unattached volumes=[jenkins-home default-token-wd6p7]
Then it started to hit me, this makes sense. It's a pitty, but it makes sense.
Since I did a hard power down on the node, the PV went down with it. So now the controller tries to start a new pod, on a new node but it cant transfer the PV, since the one on the previous pod became unreachable.
As I read more on this, I read that DigitalOcean only supports ReadWriteOnce
, which now leaves me wondering, how the hell can I achieve a simple failover for a stateful application on a Kubernetes Cluster on Digital Ocean that consists of just a couple of simple droplets?
Yes, you would use a Deployment or StatefulSet depending on the use case. For Jenkins, a StatefulSet would be appropriate. If the running pod becomes unavailable, the StatefulSet controller will see that and spawn a new one.
Digital Ocean's Kubernetes service only supports ReadWriteOnce
access modes for PVCs (see here). This means the volume can only be attached to one node at a time.
I came across this blogpost which, while focused on Jenkins on Azure, has the same situation of only supporting ReadWriteOnce
. The author states:
the drawback for me though lies in the fact that the access mode for Azure Disk persistent volumes is ReadWriteOnce. This means that an Azure disk can be attached to only one cluster node at a time. In the event of a node failure or update, it could take anywhere between 1-5 minutes for the Azure disk to get detached and attached to the next available node.
Note, Pod
failure and node failures are different things. Since DO only supports ReadWriteOnce
, there's no benefit to trying anything more sophisticated than what you have right now in terms of tolerance to node failure. Since it's ReadWriteOnce
the volume will need to be unmounted from the failing node and re-mounted to the new node, and then a new Pod
will get scheduled on the new node. Kubernetes will do this for you, and there's not much you can do to optimize it.
For Pod
failure, you could use a Deployment
since you want to read and write the same data, you don't want different PVs attached to the different replicas. There may be very limited benefit to this, you will have multiple replicas of the Pod
all running on the same node, so it depends on how the Jenkins process scales and if it can support that type of scale horizontal out model while all writing to the same volume (as opposed to simply vertically scaling memory or CPU requests).
If you really want to achieve higher availability in the face of node and/or Pod
failures, and the Jenkins workload you're deploying has a hard requirement on local volumes for persistent state, you will need to consider an alternative volume plugin like NFS, or moving to a different cloud provider like GKE.