How to defer execution of commands within postStart lifecycle hook of kubernetes

5/20/2020

I am trying to defer execution of certain commands (Mongo replica set initialization) on pod-0. Basically, I want to wait for the other pods to start. The postStart hook I am using looks like this:

- /bin/bash
- -c
- >
  if [ ! -f /xyz/mongo/data/init.lock ];
  then
    if [ "$HOSTNAME" = "mongod-0" ];
    then
      MONGO_CMD="rs.initiate({ _id : \"my_mdb_rs0\", members: [ { _id: 0, host : \"mongod-0.mongodb-service.default.svc.cluster.local:27017\" }, { _id: 1, host : \"mongod-1.mongodb-service.default.svc.cluster.local:27017\" } ] });";

      /xyz/mongo/config/doinit.sh \""$MONGO_CMD\"" 120 \& ;

    fi;
  else
    echo "Replicaset appears to be initialized" >> /xyz/mongo/log/init-log.log;
  fi;

I try to fork a child process for "doinit.sh", which internally sleeps for 120 seconds before executing the initialization command.

However, pod-0 stays in "ContainerCreating" for those 120 seconds, and only then does the next pod get scheduled.

NAME       READY   STATUS              RESTARTS   AGE
mongod-0   0/1     ContainerCreating   0          2m16s

NAME       READY   STATUS              RESTARTS   AGE
mongod-0   1/1     Running             0          2m49s
mongod-1   0/1     ContainerCreating   0          5s

Is there a way to unblock the scheduler on pod-0, so that all pods are scheduled immediately and "doinit.sh" on the first pod waits in the background?

-- Rajesh
kubernetes
mongodb

1 Answer

5/20/2020

For a similar use case, we create a Kubernetes Job which will:

  • wait for the 3 Mongo replicas to start running. This can be done by calling the Kubernetes API, or by simply waiting 120s and then trying to run the command. If that fails, the Job will create another pod to run the command. The number of retries for the Job is configurable.
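A minimal sketch of such a Job, using the host names and replica set ID from the question. The Job name, the `mongo:4.2` image (assumed to ship the legacy `mongo` shell), the ping-based wait loop, and the `backoffLimit` value are illustrative assumptions, not something taken from our actual setup:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: mongo-rs-init            # hypothetical name
spec:
  backoffLimit: 5                # number of retries before the Job is marked failed
  template:
    spec:
      restartPolicy: Never       # a failed attempt gets a fresh pod
      containers:
        - name: rs-init
          image: mongo:4.2       # assumption: any image that provides the mongo shell
          command:
            - /bin/bash
            - -c
            - |
              DOMAIN=mongodb-service.default.svc.cluster.local
              # Wait until every member answers a ping.
              for host in mongod-0 mongod-1; do
                until mongo --host "$host.$DOMAIN" --quiet --eval 'db.adminCommand("ping")'; do
                  sleep 5
                done
              done
              # Initiate the replica set; exit non-zero on failure so the Job retries.
              mongo --host "mongod-0.$DOMAIN" --quiet --eval '
                var r = rs.initiate({ _id: "my_mdb_rs0", members: [
                  { _id: 0, host: "mongod-0.mongodb-service.default.svc.cluster.local:27017" },
                  { _id: 1, host: "mongod-1.mongodb-service.default.svc.cluster.local:27017" }
                ] });
                printjson(r);
                if (!r.ok) { quit(1); }'
```

Because the wait happens inside the Job's own pod, the Mongo pods themselves start without any delay from the initialization step.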

If you are not using Helm, you can put the Mongo StatefulSet and the init Job in the same YAML file, so that both are applied with a single command (kubectl apply -f mongo-db.yaml).

If you are using Helm, you can set this up as a post-install job.
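With Helm, a Job becomes a post-install hook through annotations on its metadata. The fragment below is a sketch only: the Job name, image, and placeholder command are assumptions, while the `helm.sh/hook` annotations are the standard Helm hook keys. Note that a post-install hook runs once the chart's resources have been submitted to the cluster, which does not by itself guarantee the Mongo pods are ready, so the wait logic is probably still needed inside the Job:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: mongo-rs-init                              # hypothetical name
  annotations:
    "helm.sh/hook": post-install                   # run after the release's resources are created
    "helm.sh/hook-delete-policy": hook-succeeded   # remove the Job once it has succeeded
spec:
  backoffLimit: 5
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: rs-init
          image: mongo:4.2                         # assumption
          command:
            - /bin/bash
            - -c
            - echo "wait for the members, then run rs.initiate here"   # placeholder
```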

PostStart vs Job:

  • Retries are possible with a Job.
  • You can easily inspect the logs of the pod created by the Job to see what went wrong.
  • If the postStart hook takes too long to run or hangs, the container cannot reach the Running state. Granted, you are trying to run the command in the background, but a Job is independent of the pod's lifecycle altogether.
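
If you would rather stay with the postStart hook, one thing to check in the question's snippet: the `\&` is escaped, so bash passes a literal `&` to doinit.sh as an argument and runs the script in the foreground, which would explain the 120-second block. A hedged sketch of the hook with a real background operator follows. The argument quoting is kept exactly as in the question, since what doinit.sh expects is unknown. Redirecting stdin, stdout, and stderr is a precaution, on the assumption that some runtimes keep the exec hook open while a child still holds its output streams:

```yaml
lifecycle:
  postStart:
    exec:
      command:
        - /bin/bash
        - -c
        - |
          if [ ! -f /xyz/mongo/data/init.lock ]; then
            if [ "$HOSTNAME" = "mongod-0" ]; then
              MONGO_CMD='rs.initiate({ _id : "my_mdb_rs0", members: [ { _id: 0, host : "mongod-0.mongodb-service.default.svc.cluster.local:27017" }, { _id: 1, host : "mongod-1.mongodb-service.default.svc.cluster.local:27017" } ] });'
              # Unescaped & puts doinit.sh in the background; nohup and the
              # redirections detach it from the hook's stdio.
              nohup /xyz/mongo/config/doinit.sh \""$MONGO_CMD\"" 120 >> /xyz/mongo/log/init-log.log 2>&1 < /dev/null &
            fi
          else
            echo "Replicaset appears to be initialized" >> /xyz/mongo/log/init-log.log
          fi
```

Even if this unblocks the hook, the background script still lives and dies with the container, so the retry and logging advantages of a Job listed above do not apply.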
-- Dinesh Balasubramanian
Source: StackOverflow