rook-ceph-osd-prepare pod stuck for hours

6/16/2020

I am new to Ceph and I am using Rook to install Ceph in a Kubernetes cluster. I see that the rook-ceph-osd-prepare pod stays in Running status forever, stuck on the line below:

2020-06-15 20:09:02.260379 D | exec: Running command: ceph auth get-or-create-key 
client.bootstrap-osd mon allow profile bootstrap-osd --connect-timeout=15 --cluster=rook-ceph 
--conf=/var/lib/rook/rook-ceph/rook-ceph.config 
--name=client.admin --keyring=/var/lib/rook/rook-ceph/client.admin.keyring 
--format json --out-file /tmp/180401029

When I logged into the container and ran the same command, it hung as well; after pressing ^C it showed this:

Traceback (most recent call last):
  File "/usr/bin/ceph", line 1266, in <module>
    retval = main()
  File "/usr/bin/ceph", line 1197, in main
    verbose)
  File "/usr/bin/ceph", line 622, in new_style_command
    ret, outbuf, outs = do_command(parsed_args, target, cmdargs, sigdict, inbuf, verbose)
  File "/usr/bin/ceph", line 596, in do_command
    return ret, '', ''

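In case it helps with troubleshooting, here is a simpler check that can be run from inside the same container, reusing the conf and keyring paths from the log above; if the mons were reachable, it should return within the 15-second timeout:

ceph status --connect-timeout=15 --cluster=rook-ceph \
    --conf=/var/lib/rook/rook-ceph/rook-ceph.config \
    --name=client.admin --keyring=/var/lib/rook/rook-ceph/client.admin.keyring
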
Below are all my pods:

rook-ceph              csi-cephfsplugin-9k9z2                                3/3     Running            0          9h
rook-ceph              csi-cephfsplugin-mjsbk                                3/3     Running            0          9h
rook-ceph              csi-cephfsplugin-mrqz5                                3/3     Running            0          9h
rook-ceph              csi-cephfsplugin-provisioner-5ffbdf7856-59cf7         5/5     Running            0          9h
rook-ceph              csi-cephfsplugin-provisioner-5ffbdf7856-m4bhr         5/5     Running            0          9h
rook-ceph              csi-cephfsplugin-xgvz4                                3/3     Running            0          9h
rook-ceph              csi-rbdplugin-6k4dk                                   3/3     Running            0          9h
rook-ceph              csi-rbdplugin-klrwp                                   3/3     Running            0          9h
rook-ceph              csi-rbdplugin-provisioner-68d449986d-2z9gr            6/6     Running            0          9h
rook-ceph              csi-rbdplugin-provisioner-68d449986d-mzh9d            6/6     Running            0          9h
rook-ceph              csi-rbdplugin-qcmrj                                   3/3     Running            0          9h
rook-ceph              csi-rbdplugin-zdg8z                                   3/3     Running            0          9h
rook-ceph              rook-ceph-crashcollector-k8snode001-76ffd57d58-slg5q   1/1    Running            0          9h
rook-ceph              rook-ceph-crashcollector-k8snode002-85b6d9d699-s8m8z   1/1    Running            0          9h
rook-ceph              rook-ceph-crashcollector-k8snode004-847bdb4fc5-kk6bd   1/1    Running            0          9h
rook-ceph              rook-ceph-mgr-a-5497fcbb7d-lq6tf                      1/1     Running            0          9h
rook-ceph              rook-ceph-mon-a-6966d857d9-s4wch                      1/1     Running            0          9h
rook-ceph              rook-ceph-mon-b-649c6845f4-z46br                      1/1     Running            0          9h
rook-ceph              rook-ceph-mon-c-67869b76c7-4v6zn                      1/1     Running            0          9h
rook-ceph              rook-ceph-operator-5968d8f7b9-hsfld                   1/1     Running            0          9h
rook-ceph              rook-ceph-osd-prepare-k8snode001-j25xv                 1/1    Running            0          7h48m
rook-ceph              rook-ceph-osd-prepare-k8snode002-6fvlx                 0/1    Completed          0          9h
rook-ceph              rook-ceph-osd-prepare-k8snode003-cqc4g                 0/1    Completed          0          9h
rook-ceph              rook-ceph-osd-prepare-k8snode004-jxxtl                 0/1    Completed          0          9h
rook-ceph              rook-discover-28xj4                                   1/1     Running            0          9h
rook-ceph              rook-discover-4ss66                                   1/1     Running            0          9h
rook-ceph              rook-discover-bt8rd                                   1/1     Running            0          9h
rook-ceph              rook-discover-q8f4x                                   1/1     Running            0          9h
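
In case it is useful, the stuck prepare pod and the operator can be inspected with the following commands (the pod name is taken from the listing above):

kubectl -n rook-ceph logs rook-ceph-osd-prepare-k8snode001-j25xv
kubectl -n rook-ceph describe pod rook-ceph-osd-prepare-k8snode001-j25xv
kubectl -n rook-ceph logs deploy/rook-ceph-operator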

Please let me know if anyone has any hints for resolving or troubleshooting this.

-- raj_arni
ceph
kubernetes
rook-storage
storage

2 Answers

9/15/2020

In my case, the problem was that one of my Kubernetes hosts was not on the same kernel version as the other nodes.
Once I upgraded its kernel to match the rest of the nodes, the issue was resolved.
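
A quick way to compare kernels across nodes is the KERNEL-VERSION column of the command below (or uname -r on each host):

kubectl get nodes -o wide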

-- Yuyanto
Source: StackOverflow

5/26/2021

In my case, the system clock on one of my nodes was not synchronized with the hardware clock, so there was a time gap between the nodes.

You may want to check the output of the timedatectl command on each node.
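
For example, on systemd-based nodes this shows whether the system clock is synchronized and whether the NTP service is active:

timedatectl status

If one node has drifted, re-enabling time synchronization there (timedatectl set-ntp true) may resolve the gap.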

-- IeuD
Source: StackOverflow