I have a k8s cluster that runs just fine. It has several standalone MongoDB StatefulSets whose volumes are backed by NFS. The problem is, whenever there is a power outage, the MongoDB databases get corrupted:
{"t":{"$date":"2021-10-15T13:10:06.446+00:00"},"s":"W", "c":"STORAGE", "id":22271, "ctx":"initandlisten","msg":"Detected unclean shutdown - Lock file is not empty","attr":{"lockFile":"/data/db/mongod.lock"}}
{"t":{"$date":"2021-10-15T13:10:07.182+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"initandlisten","msg":"WiredTiger error","attr":{"error":0,"message":"[1634303407:182673][1:0x7f9515eb7a80], file:WiredTiger.wt, connection: __wt_block_read_off, 283: WiredTiger.wt: read checksum error for 4096B block at offset 12288: block header checksum of 0xc663f362 doesn't match expected checksum of 0xb8e27418"}}
The pods' status remains at CrashLoopBackOff, so I cannot run kubectl exec -it usersdb-0 -- mongod --repair because the container is not running.
I have tried deleting wiredTiger.lock and mongod.lock, but nothing seems to work. How can I repair these databases?
Well, after several attempts I think I have finally made a breakthrough, so I wanted to leave this here for someone else.
Since mongod is not running in the crashing container, temporarily override the container's startup command by adding
command: ["sleep"]
args: ["infinity"]
to the container spec in the resource file (hoping it is a StatefulSet). This keeps the pod alive without starting mongod, so you can exec into it; see the sketch below.
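For reference, here is a minimal sketch of where the override goes, assuming a StatefulSet named usersdb with a single mongo container. The image tag, labels, and storage size are placeholders, not values from the original manifest; adapt them to your setup.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: usersdb
spec:
  serviceName: usersdb
  replicas: 1
  selector:
    matchLabels:
      app: usersdb
  template:
    metadata:
      labels:
        app: usersdb
    spec:
      containers:
        - name: mongodb
          image: mongo:4.4          # placeholder image tag
          # Temporary override: keep the container alive without starting mongod
          command: ["sleep"]
          args: ["infinity"]
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi           # placeholder size

With the override applied and the pod restarted so it picks it up, repair the database using the command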
kubectl exec -it <NAME-OF-MONGODB-POD> -- mongod --dbpath /data/db --repair
This will repair the standalone MongoDB data files. Then remove the sleep override, re-apply the resource YAML file, and delete the pod so it is recreated afresh.
Now the mongodb pod should be working fine.
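For convenience, here is a rough end-to-end sequence, assuming the StatefulSet manifest is saved as usersdb-statefulset.yaml and the pod is usersdb-0 (both names are illustrative; adapt them to your setup):

# 1. Apply the manifest with the temporary sleep override in place
kubectl apply -f usersdb-statefulset.yaml
# 2. Delete the crash-looping pod so it restarts with the override
kubectl delete pod usersdb-0
# 3. Repair the data files on the persistent volume
kubectl exec -it usersdb-0 -- mongod --dbpath /data/db --repair
# 4. Remove the override from the manifest and re-apply it
kubectl apply -f usersdb-statefulset.yaml
# 5. Delete the pod again so it comes back running mongod normally
kubectl delete pod usersdb-0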