What does the requirement "There must be at least three nodes." mean when deploying GlusterFS in Kubernetes?

2/3/2019

I am currently testing a Kubernetes+GlusterFS+Elastic setup with VirtualBox Ubuntu VMs. I have two, one master and one node. The guide for GlusterFS here says:

"There must be at least three nodes"

I am not sure what that means. To elaborate a bit on what I don't understand:

  • Are we talking about three Kubernetes pods running GlusterFS within the greater Kubernetes cluster, or three GlusterFS nodes within the GlusterFS cluster that exists within the Kubernetes cluster as a single pod?
  • Can this requirement be met by my current setup? I was thinking of attaching one raw disk of, let's say, 90GB to the node VM. Then I would split that raw space into three partitions, /dev/sda2, /dev/sda3 and /dev/sda4, each 30GB, without formatting them. This VirtualBox VM node is also a Kubernetes node. I am assuming that Kubernetes will auto-mount those three raw partitions in the virtual world. Then, if we are talking about three Kubernetes pods, each one will have its own virtual pod IP. As a result, I would be able to declare IP-/dev/sda# pairs in the topology file mentioned in the guide, in the storage and device fields respectively. In my case I will have only one device per storage IP. Does my logic make sense?
-- Tasos
glusterfs
kubernetes
ubuntu
virtualbox

1 Answer

2/3/2019

Are we talking about three Kubernetes pods running GlusterFS within the greater Kubernetes cluster, or three GlusterFS nodes within the GlusterFS cluster that exists within the Kubernetes cluster as a single pod?

These requirements relate to GlusterFS cluster quorum: they imply separate nodes so that GlusterFS is highly available. In a real Kubernetes cluster with several nodes, you would deploy GlusterFS on each node so that the data is safely replicated across all of them, and you would stay safe if you lose one node.

Here is the GlusterFS reference; I hope it's useful.
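To make the quorum argument concrete, here is a minimal sketch of plain majority-quorum arithmetic. It is a generic illustration and not GlusterFS's own quorum implementation; the function names are hypothetical.

```python
def majority_quorum(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can fail while a strict majority is still up."""
    return total_nodes - majority_quorum(total_nodes)


if __name__ == "__main__":
    # Assumption: simple majority voting, used only to illustrate
    # why three nodes is the smallest useful cluster size.
    for n in (1, 2, 3):
        print(f"nodes={n} quorum={majority_quorum(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under this simple model, two nodes cannot lose either member and keep a majority, while three nodes can lose one.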

Can this requirement be met by my current setup? I was thinking of attaching one raw disk of, let's say, 90GB to the node VM. Then I would split that raw space into three partitions, /dev/sda2, /dev/sda3 and /dev/sda4, each 30GB, without formatting them. This VirtualBox VM node is also a Kubernetes node. I am assuming that Kubernetes will auto-mount those three raw partitions in the virtual world. Then, if we are talking about three Kubernetes pods, each one will have its own virtual pod IP. As a result, I would be able to declare IP-/dev/sda# pairs in the topology file mentioned in the guide, in the storage and device fields respectively. In my case I will have only one device per storage IP. Does my logic make sense?

I think you won't be able to do this. If you look closer at the deployment manifests of GlusterFS:

---
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: glusterfs
  labels:
    glusterfs: daemonset
  annotations:
    description: GlusterFS DaemonSet
    tags: glusterfs
spec:

Ref.

You can see that GlusterFS is deployed as a DaemonSet, which means that you will have one pod per node. In any case, the Best practices section of the GlusterFS reference I shared above says you can deploy a GlusterFS cluster with two nodes. You won't have that third node for disaster recovery, but you will be able to test it.
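As a rough illustration of why three partitions on one VM do not count as three nodes, the sketch below builds a topology structure with one entry per Kubernetes node. The field layout (clusters, nodes, hostnames with manage and storage, zone, devices) is assumed from the sample topology file in the guide, and the hostnames, IPs and helper names are hypothetical.

```python
import json


def build_topology(nodes):
    """Build a topology dict from (hostname, storage_ip, devices) tuples.

    Assumption: one entry per Kubernetes node, because a DaemonSet
    schedules one GlusterFS pod per node.
    """
    return {
        "clusters": [
            {
                "nodes": [
                    {
                        "node": {
                            "hostnames": {"manage": [host], "storage": [ip]},
                            "zone": 1,
                        },
                        "devices": list(devices),
                    }
                    for host, ip, devices in nodes
                ]
            }
        ]
    }


def node_count(topology):
    """Number of GlusterFS nodes declared across all clusters."""
    return sum(len(cluster["nodes"]) for cluster in topology["clusters"])


# Hypothetical single-VM setup from the question: one node, three partitions.
single_vm = build_topology(
    [("node1", "192.168.56.11", ["/dev/sda2", "/dev/sda3", "/dev/sda4"])]
)

if __name__ == "__main__":
    print(json.dumps(single_vm, indent=2))
    print("declared nodes:", node_count(single_vm))
```

In this sketch the single VM still yields one node entry with three devices, so the "at least three nodes" requirement would need three separate Kubernetes nodes.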

Hope this clarifies things for you :)

-- David Barranco
Source: StackOverflow