After kubectl apply -f cluster.yaml (the example YAML file from the Rook GitHub repository), I have only one pod, rook-ceph-mon-a-***, running, even after waiting an hour. How can I investigate this problem?
NAME                              READY   STATUS    RESTARTS   AGE
rook-ceph-mon-a-7ff4fd545-qc2wl   1/1     Running   0          20m
And below are the logs of the single running pod:
$ kubectl logs rook-ceph-mon-a-7ff4fd545-qc2wl -n rook-ceph
2019-01-14 17:23:40.578 7f725478c140 0 ceph version 13.2.2
***
No filesystems configured
2019-01-14 17:23:40.643 7f723a050700 1 mon.a@0(leader).paxosservice(auth 0..0) refresh upgraded, format 3 -> 0
2019-01-14 17:23:40.643 7f723a050700 0 log_channel(cluster) log [DBG] : fsmap
2019-01-14 17:23:40.645 7f723a050700 0 mon.a@0(leader).osd e1 crush map has features 288514050185494528, adjusting msgr requires
2019-01-14 17:23:40.645 7f723a050700 0 mon.a@0(leader).osd e1 crush map has features 288514050185494528, adjusting msgr requires
2019-01-14 17:23:40.645 7f723a050700 0 mon.a@0(leader).osd e1 crush map has features 1009089990564790272, adjusting msgr requires
2019-01-14 17:23:40.645 7f723a050700 0 mon.a@0(leader).osd e1 crush map has features 288514050185494528, adjusting msgr requires
2019-01-14 17:23:40.643443 mon.a unknown.0 - 0 : [INF] mkfs cb8db53e-2d36-42eb-ab25-2a0918602655
2019-01-14 17:23:40.645 7f723a050700 1 mon.a@0(leader).paxosservice(auth 1..1) refresh upgraded, format 0 -> 3
2019-01-14 17:23:40.647 7f723a050700 0 log_channel(cluster) log [DBG] : osdmap e1: 0 total, 0 up, 0 in
2019-01-14 17:23:40.648 7f723a050700 0 log_channel(cluster) log [DBG] : mgrmap e1: no daemons active
2019-01-14 17:23:40.635473 mon.a mon.0 10.32.0.43:6790/0 1 : cluster [INF] mon.a is new leader, mons a in quorum (ranks 0)
2019-01-14 17:23:40.641926 mon.a mon.0 10.32.0.43:6790/0 2 : cluster [INF] mon.a is new leader, mons a in quorum (ranks 0)
Maybe your old data (/var/lib/rook) is not empty. I encountered the same error; after deleting those files, it worked.
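A minimal cleanup sketch, assuming the default dataDirHostPath of /var/lib/rook from the example cluster.yaml (adjust the path if you changed it). Delete the old cluster resources first, then wipe the directory on every node that previously ran Rook daemons before re-applying the manifest:

kubectl delete -f cluster.yaml   # remove the old cluster objects first
# then, on each node that hosted Rook daemons:
ls /var/lib/rook                 # check for stale monitor/OSD data from the previous attempt
sudo rm -rf /var/lib/rook        # wipe it, then run kubectl apply -f cluster.yaml again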
Assuming you have followed the official ceph-quickstart guide from Rook's GitHub page here, please first check for problematic pods with the command:
kubectl -n rook-ceph get pod
and retrieve their logs with:
kubectl -n rook-ceph logs <pod_name>
Please update your original question to include these command outputs.
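If the mon pod itself looks healthy but the other daemons (mgr, osd) never appear, the Rook operator log usually shows why provisioning stalled. A rough sketch of the commands, assuming a Rook v0.9-style install where the operator runs in the rook-ceph-system namespace with the label app=rook-ceph-operator (adjust the namespace if your operator.yaml differs):

kubectl -n rook-ceph-system get pod
kubectl -n rook-ceph-system logs -l app=rook-ceph-operator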