I was planning to migrate from replica 3 to replica 3 with arbiter 1, but ran into a strange issue on my third node (the one acting as arbiter).
When I mount the new volume endpoint on the node where the Gluster arbiter pod is running, I see strange behavior: some files are fine, but others have zero size. When I mount the same share on another node, all files are fine.
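For anyone trying to reproduce this, the zero-size files can be spotted from the client mount with a simple `find`. The snippet below demonstrates the check on a throwaway local directory (the directory and file names are illustrative); on the affected node you would point `MOUNTPOINT` at the GlusterFS FUSE mount instead:

```shell
# Demonstration on a local scratch directory; on the affected worker,
# set MOUNTPOINT to the GlusterFS FUSE mount path instead.
MOUNTPOINT=$(mktemp -d)
echo "data" > "$MOUNTPOINT/ok.txt"   # healthy file with content
: > "$MOUNTPOINT/broken.txt"         # zero-byte file, like the bad reads

# List regular files that report a size of zero
find "$MOUNTPOINT" -type f -size 0   # prints only the broken file's path
```

Running this against the mount on the arbiter node versus another worker makes the discrepancy easy to compare.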
GlusterFS runs as a Kubernetes DaemonSet, and I use Heketi to manage GlusterFS from Kubernetes automatically.
I'm running GlusterFS 4.1.5 and Kubernetes 1.11.1.
Volume info:
gluster volume info vol_3ffdfde93880e8aa39c4b4abddc392cf
Type: Replicate
Volume ID: e67d2ade-991a-40f9-8f26-572d0982850d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.2.70:/var/lib/heketi/mounts/vg_426b28072d8d0a4c27075930ddcdd740/brick_35389ca30d8f631004d292b76d32a03b/brick
Brick2: 192.168.2.96:/var/lib/heketi/mounts/vg_3a9b2f229b1e13c0f639db6564f0d820/brick_953450ef6bc25bfc1deae661ea04e92d/brick
Brick3: 192.168.2.148:/var/lib/heketi/mounts/vg_7d1e57c2a8a779e69d22af42812dffd7/brick_b27af182cb69e108c1652dc85b04e44a/brick (arbiter)
Options Reconfigured:
user.heketi.id: 3ffdfde93880e8aa39c4b4abddc392cf
user.heketi.arbiter: true
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
Status output:
gluster volume status vol_3ffdfde93880e8aa39c4b4abddc392cf
Status of volume: vol_3ffdfde93880e8aa39c4b4abddc392cf
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.2.70:/var/lib/heketi/mounts/vg_426b28072d8d0a4c27075930ddcdd740/brick_35389ca30d8f631004d292b76d32a03b/brick   49152   0   Y   13896
Brick 192.168.2.96:/var/lib/heketi/mounts/vg_3a9b2f229b1e13c0f639db6564f0d820/brick_953450ef6bc25bfc1deae661ea04e92d/brick   49152   0   Y   12111
Brick 192.168.2.148:/var/lib/heketi/mounts/vg_7d1e57c2a8a779e69d22af42812dffd7/brick_b27af182cb69e108c1652dc85b04e44a/brick   49152   0   Y   25045
Self-heal Daemon on localhost N/A N/A Y 25069
Self-heal Daemon on worker1-aws-va N/A N/A Y 12134
Self-heal Daemon on 192.168.2.70 N/A N/A Y 13919
Task Status of Volume vol_3ffdfde93880e8aa39c4b4abddc392cf
------------------------------------------------------------------------------
There are no active volume tasks
Heal output:
gluster volume heal vol_3ffdfde93880e8aa39c4b4abddc392cf info
Brick 192.168.2.70:/var/lib/heketi/mounts/vg_426b28072d8d0a4c27075930ddcdd740/brick_35389ca30d8f631004d292b76d32a03b/brick
Status: Connected
Number of entries: 0
Brick 192.168.2.96:/var/lib/heketi/mounts/vg_3a9b2f229b1e13c0f639db6564f0d820/brick_953450ef6bc25bfc1deae661ea04e92d/brick
Status: Connected
Number of entries: 0
Brick 192.168.2.148:/var/lib/heketi/mounts/vg_7d1e57c2a8a779e69d22af42812dffd7/brick_b27af182cb69e108c1652dc85b04e44a/brick
Status: Connected
Number of entries: 0
Any ideas on how to resolve this issue?
Update: the issue was fixed after updating the glusterfs-client and glusterfs-common packages on the Kubernetes workers to a more recent version.
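In other words, the client packages on the workers had fallen behind the 4.1.5 servers. A sketch of how one might check and upgrade on Debian/Ubuntu workers follows (the package-manager commands are commented out since they need root and a configured repo; the version comparison at the bottom is a generic illustration using a hypothetical stale client version, not the actual one from my cluster):

```shell
# On each worker, inspect and upgrade the client packages
# (Debian/Ubuntu shown; package names may differ on other distros):
#   dpkg -l glusterfs-client glusterfs-common
#   apt-get update && apt-get install --only-upgrade glusterfs-client glusterfs-common

# Generic sanity check: is the client at least as new as the server?
# sort -V does a version-aware comparison.
server=4.1.5
client=3.13.2   # hypothetical stale client version, for illustration only
older=$(printf '%s\n%s\n' "$client" "$server" | sort -V | head -n1)
if [ "$older" = "$client" ] && [ "$client" != "$server" ]; then
  echo "client $client is older than server $server: upgrade needed"
fi
```

Keeping the FUSE client at the same release as the bricks avoids mixed-version behavior like the zero-size reads described above.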