Why am I getting "Structure needs cleaning" message on Ceph with Kubernetes?

12/5/2018

Sorry if this is a basic question; I am relatively new to Kubernetes and Ceph and have only a rough idea of how they work.

I set up Kubernetes and Ceph using this tutorial (http://tutorial.kubernetes.noverit.com/content/ceph.html).

My cluster is set up like this:

1 Kube-master and 2 worker nodes (each worker acts as a Ceph monitor and has 2 OSDs).
I ran ceph-deploy from the Kube-master to set up the Ceph cluster.

Everything was working fine. I installed my sample web application (a Deployment) with 5 replicas, which creates a file when its REST API is hit. The file was being copied to every node.

But about 10 minutes later I created one more file using the API, and when I list the directory (ls -l) I get the following error:

For node1:

ls: cannot access 'previousFile.txt': Structure needs cleaning
previousFile.txt  newFile.txt

For node2:

previousFile.txt  

On node2 the new file is not created at all.

What might be the issue? I have tried many times, but the same error still pops up.

Any help appreciated.

-- JibinNajeeb
ceph
docker
kubernetes

1 Answer

12/5/2018

This totally looks like your filesystem got corrupted. Things to check:

  • $ kubectl logs <ceph-pod1>
  • $ kubectl logs <ceph-pod2>
  • $ kubectl describe deployment <ceph-deployment> # did any of the pods restart?
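To answer the "did any of the pods restart?" question across several pods at once, something like the following sketch might help. It assumes the Ceph pods carry a common label (the `app=ceph` selector in the usage note is hypothetical, substitute your own), and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: pod_restarts and restarted_pods are hypothetical helper names.

# Print "name<TAB>node<TAB>restart counts" for every pod matching a label selector.
pod_restarts() {
    selector="$1"
    kubectl get pods -l "$selector" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\t"}{.status.containerStatuses[*].restartCount}{"\n"}{end}'
}

# Print only the pods whose containers have restarted at least once.
# A pod with several containers has space-separated counts, so sum them.
restarted_pods() {
    pod_restarts "$1" | awk -F '\t' '{
        n = split($3, counts, " ")
        total = 0
        for (i = 1; i <= n; i++) total += counts[i]
        if (total > 0) print $1
    }'
}
```

For example, `restarted_pods app=ceph` lists the pods worth inspecting, and `kubectl logs --previous <pod>` then shows the log of the container instance that ran before the restart.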

Some info about the error message here.

Depending on what you have, you might need to start from scratch. Or you can take a look at recovering data in Ceph, but that may not work if you don't have a snapshot.
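Before deciding to start from scratch, it may be worth checking what Ceph itself reports. A minimal sketch, assuming the `ceph` CLI is available with admin credentials on the machine where you run it (the function name is hypothetical):

```shell
#!/bin/sh
# Sketch only: ceph_is_healthy is a hypothetical helper name.
# Returns 0 when `ceph health` reports HEALTH_OK, 1 for any other status,
# and 2 when the ceph command itself fails (for example, no cluster access).
ceph_is_healthy() {
    status=$(ceph health 2>/dev/null) || return 2
    case "$status" in
        HEALTH_OK*) return 0 ;;
        *) printf '%s\n' "$status" >&2; return 1 ;;
    esac
}
```

If this returns non-zero, `ceph health detail` and `ceph osd tree` give more detail on which OSDs or placement groups are affected.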

Running Ceph on Kubernetes can be very tricky: if a Ceph pod restarts and comes up on a different Kubernetes node, it might corrupt the data. So you need to make sure that part is pretty solid, possibly by using Node Affinity or by running the Ceph pods on specific Kubernetes nodes selected with labels.
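As a rough illustration of the label approach, the sketch below labels a node and adds a matching `nodeSelector` to a deployment. The label key and value (`ceph-role=storage`), the function names, and the node and deployment names in the usage note are all hypothetical. It also assumes each Ceph daemon runs as its own deployment, which may not match your setup.

```shell
#!/bin/sh
# Sketch only: the label and the helper names are hypothetical.
LABEL_KEY="ceph-role"
LABEL_VALUE="storage"

# Build a merge patch that adds a nodeSelector to a deployment's pod template.
node_selector_patch() {
    printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}' \
        "$LABEL_KEY" "$LABEL_VALUE"
}

# Label a node, then restrict the deployment's pods to nodes with that label.
pin_to_node() {
    node="$1"
    deployment="$2"
    kubectl label nodes "$node" "${LABEL_KEY}=${LABEL_VALUE}" --overwrite
    kubectl patch deployment "$deployment" --type merge -p "$(node_selector_patch)"
}
```

Usage would look like `pin_to_node worker1 ceph-osd-0`. A `nodeSelector` is the simplest form of this; Node Affinity expresses the same constraint with more flexible matching rules.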

-- Rico
Source: StackOverflow