I am new to Ceph and am using Rook to install Ceph in a Kubernetes cluster. The rook-ceph-osd-prepare pod stays in Running status forever, stuck on the line below:
2020-06-15 20:09:02.260379 D | exec: Running command: ceph auth get-or-create-key
client.bootstrap-osd mon allow profile bootstrap-osd --connect-timeout=15 --cluster=rook-ceph
--conf=/var/lib/rook/rook-ceph/rook-ceph.config
--name=client.admin --keyring=/var/lib/rook/rook-ceph/client.admin.keyring
--format json --out-file /tmp/180401029
When I logged into the container and ran the same command manually, it hung as well; after pressing ^C it printed this traceback:
Traceback (most recent call last):
File "/usr/bin/ceph", line 1266, in <module>
retval = main()
File "/usr/bin/ceph", line 1197, in main
verbose)
File "/usr/bin/ceph", line 622, in new_style_command
ret, outbuf, outs = do_command(parsed_args, target, cmdargs, sigdict, inbuf, verbose)
File "/usr/bin/ceph", line 596, in do_command
return ret, '', ''
Below are all my pods:
rook-ceph csi-cephfsplugin-9k9z2 3/3 Running 0 9h
rook-ceph csi-cephfsplugin-mjsbk 3/3 Running 0 9h
rook-ceph csi-cephfsplugin-mrqz5 3/3 Running 0 9h
rook-ceph csi-cephfsplugin-provisioner-5ffbdf7856-59cf7 5/5 Running 0 9h
rook-ceph csi-cephfsplugin-provisioner-5ffbdf7856-m4bhr 5/5 Running 0 9h
rook-ceph csi-cephfsplugin-xgvz4 3/3 Running 0 9h
rook-ceph csi-rbdplugin-6k4dk 3/3 Running 0 9h
rook-ceph csi-rbdplugin-klrwp 3/3 Running 0 9h
rook-ceph csi-rbdplugin-provisioner-68d449986d-2z9gr 6/6 Running 0 9h
rook-ceph csi-rbdplugin-provisioner-68d449986d-mzh9d 6/6 Running 0 9h
rook-ceph csi-rbdplugin-qcmrj 3/3 Running 0 9h
rook-ceph csi-rbdplugin-zdg8z 3/3 Running 0 9h
rook-ceph rook-ceph-crashcollector-k8snode001-76ffd57d58-slg5q 1/1 Running 0 9h
rook-ceph rook-ceph-crashcollector-k8snode002-85b6d9d699-s8m8z 1/1 Running 0 9h
rook-ceph rook-ceph-crashcollector-k8snode004-847bdb4fc5-kk6bd 1/1 Running 0 9h
rook-ceph rook-ceph-mgr-a-5497fcbb7d-lq6tf 1/1 Running 0 9h
rook-ceph rook-ceph-mon-a-6966d857d9-s4wch 1/1 Running 0 9h
rook-ceph rook-ceph-mon-b-649c6845f4-z46br 1/1 Running 0 9h
rook-ceph rook-ceph-mon-c-67869b76c7-4v6zn 1/1 Running 0 9h
rook-ceph rook-ceph-operator-5968d8f7b9-hsfld 1/1 Running 0 9h
rook-ceph rook-ceph-osd-prepare-k8snode001-j25xv 1/1 Running 0 7h48m
rook-ceph rook-ceph-osd-prepare-k8snode002-6fvlx 0/1 Completed 0 9h
rook-ceph rook-ceph-osd-prepare-k8snode003-cqc4g 0/1 Completed 0 9h
rook-ceph rook-ceph-osd-prepare-k8snode004-jxxtl 0/1 Completed 0 9h
rook-ceph rook-discover-28xj4 1/1 Running 0 9h
rook-ceph rook-discover-4ss66 1/1 Running 0 9h
rook-ceph rook-discover-bt8rd 1/1 Running 0 9h
rook-ceph rook-discover-q8f4x 1/1 Running 0 9h
Does anyone have any hints on how to resolve or troubleshoot this?
In my case, the problem was that one of my Kubernetes nodes was not running the same kernel version as the others. Once I upgraded the kernel to match the other nodes, the issue was resolved.
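A quick way to compare kernel versions across nodes (this assumes you have kubectl access to the cluster; the kubelet reports each node's kernel in .status.nodeInfo.kernelVersion):

```shell
# Print each node's kernel version as reported by its kubelet;
# all nodes should show the same version.
kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion
```

kubectl get nodes -o wide also includes a KERNEL-VERSION column, alongside the OS image and container runtime.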
In my case, the system clock on one of my nodes was not synchronized, so there was a time gap between the nodes. Check the output of the timedatectl command on each node.
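For example, running timedatectl on each node shows whether its clock is NTP-synchronized (this assumes a systemd-based host; on other systems check your NTP client's status instead):

```shell
# Check whether this node's clock is synchronized over NTP.
timedatectl status
# Look for "System clock synchronized: yes". If it says "no",
# enable NTP synchronization (systemd-timesyncd):
sudo timedatectl set-ntp true
```

Ceph monitors are sensitive to clock skew between nodes (mon_clock_drift_allowed defaults to 0.05 s), so even a small drift can trigger health warnings or odd behavior.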