Helm chart stuck in PodInitializing state indefinitely

3/28/2019

I'm running a microk8s cluster on an Ubuntu server at home, connected to a local NAS for persistent storage. I've been using it as my personal proving ground for learning Kubernetes, but I seem to hit problem after problem at just about every step of the way.

I've got the NFS Client Provisioner Helm chart installed, and I've confirmed it works: it dynamically provisions volumes on my NAS server to satisfy PVCs. Later I was able to install the Postgres Helm chart successfully, or so I thought: after creating it, I was able to connect with a SQL client, and I was feeling good.

Then, a couple of days later, I noticed the pod was showing 0/1 containers ready, although interestingly the nfs-client-provisioner pod was still showing 1/1. Long story short: I deleted/purged the Postgres Helm release and attempted to reinstall it, but now it no longer works. In fact, nothing new that I try to deploy works. Everything looks as though it's going to work, but then just hangs in either Init or ContainerCreating forever.
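
For reference, by "deleted/purged" I mean Helm 2's purge flow (this cluster is running Tiller, per the heritage=Tiller label below):

helm delete --purge postgres

followed by a fresh install.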

With Postgres in particular, the command I've been running is this:

helm install --name postgres stable/postgresql -f postgres.yaml

And my postgres.yaml file looks like this:

persistence:
    storageClass: nfs-client
    accessMode: ReadWriteMany
    size: 2Gi
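
For reference, nfs-client here is the storage class that the NFS Client Provisioner chart set up; it shows up in the output of:

kubectl get storageclass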

But if I do a kubectl get pods, I still see this:

NAME                    READY  STATUS    RESTARTS  AGE
nfs-client-provisioner  1/1    Running   1         11d
postgres-postgresql-0   0/1    Init:0/1  0         3h51m

If I do a kubectl describe pod postgres-postgresql-0, this is the output:

Name:               postgres-postgresql-0
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               stjohn/192.168.1.217
Start Time:         Thu, 28 Mar 2019 12:51:02 -0500
Labels:             app=postgresql
                    chart=postgresql-3.11.7
                    controller-revision-hash=postgres-postgresql-5bfb9cc56d
                    heritage=Tiller
                    release=postgres
                    role=master
                    statefulset.kubernetes.io/pod-name=postgres-postgresql-0
Annotations:        <none>
Status:             Pending
IP:                 
Controlled By:      StatefulSet/postgres-postgresql
Init Containers:
  init-chmod-data:
    Container ID:  
    Image:         docker.io/bitnami/minideb:latest
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      chown -R 1001:1001 /bitnami
      if [ -d /bitnami/postgresql/data ]; then
        chmod  0700 /bitnami/postgresql/data;
      fi

    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        250m
      memory:     256Mi
    Environment:  <none>
    Mounts:
      /bitnami/postgresql from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-h4gph (ro)
Containers:
  postgres-postgresql:
    Container ID:   
    Image:          docker.io/bitnami/postgresql:10.7.0
    Image ID:       
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      250m
      memory:   256Mi
    Liveness:   exec [sh -c exec pg_isready -U "postgres" -h localhost] delay=30s timeout=5s period=10s #success=1 #failure=6
    Readiness:  exec [sh -c exec pg_isready -U "postgres" -h localhost] delay=5s timeout=5s period=10s #success=1 #failure=6
    Environment:
      PGDATA:             /bitnami/postgresql
      POSTGRES_USER:      postgres
      POSTGRES_PASSWORD:  <set to the key 'postgresql-password' in secret 'postgres-postgresql'>  Optional: false
    Mounts:
      /bitnami/postgresql from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-h4gph (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-postgres-postgresql-0
    ReadOnly:   false
  default-token-h4gph:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-h4gph
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

And if I do a kubectl get pod postgres-postgresql-0 -o yaml, this is the output:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2019-03-28T17:51:02Z"
  generateName: postgres-postgresql-
  labels:
    app: postgresql
    chart: postgresql-3.11.7
    controller-revision-hash: postgres-postgresql-5bfb9cc56d
    heritage: Tiller
    release: postgres
    role: master
    statefulset.kubernetes.io/pod-name: postgres-postgresql-0
  name: postgres-postgresql-0
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: StatefulSet
    name: postgres-postgresql
    uid: 0d3ef673-5182-11e9-bf14-b8975a0ca30c
  resourceVersion: "1953329"
  selfLink: /api/v1/namespaces/default/pods/postgres-postgresql-0
  uid: 0d4dfb56-5182-11e9-bf14-b8975a0ca30c
spec:
  containers:
  - env:
    - name: PGDATA
      value: /bitnami/postgresql
    - name: POSTGRES_USER
      value: postgres
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          key: postgresql-password
          name: postgres-postgresql
    image: docker.io/bitnami/postgresql:10.7.0
    imagePullPolicy: Always
    livenessProbe:
      exec:
        command:
        - sh
        - -c
        - exec pg_isready -U "postgres" -h localhost
      failureThreshold: 6
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: postgres-postgresql
    ports:
    - containerPort: 5432
      name: postgresql
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - exec pg_isready -U "postgres" -h localhost
      failureThreshold: 6
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
    securityContext:
      procMount: Default
      runAsUser: 1001
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /bitnami/postgresql
      name: data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-h4gph
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: postgres-postgresql-0
  initContainers:
  - command:
    - sh
    - -c
    - |
      chown -R 1001:1001 /bitnami
      if [ -d /bitnami/postgresql/data ]; then
        chmod  0700 /bitnami/postgresql/data;
      fi
    image: docker.io/bitnami/minideb:latest
    imagePullPolicy: Always
    name: init-chmod-data
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
    securityContext:
      procMount: Default
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /bitnami/postgresql
      name: data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-h4gph
      readOnly: true
  nodeName: stjohn
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1001
  serviceAccount: default
  serviceAccountName: default
  subdomain: postgres-postgresql-headless
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-postgres-postgresql-0
  - name: default-token-h4gph
    secret:
      defaultMode: 420
      secretName: default-token-h4gph
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-03-28T17:51:02Z"
    message: 'containers with incomplete status: [init-chmod-data]'
    reason: ContainersNotInitialized
    status: "False"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-03-28T17:51:02Z"
    message: 'containers with unready status: [postgres-postgresql]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2019-03-28T17:51:02Z"
    message: 'containers with unready status: [postgres-postgresql]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2019-03-28T17:51:02Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: docker.io/bitnami/postgresql:10.7.0
    imageID: ""
    lastState: {}
    name: postgres-postgresql
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  hostIP: 192.168.1.217
  initContainerStatuses:
  - image: docker.io/bitnami/minideb:latest
    imageID: ""
    lastState: {}
    name: init-chmod-data
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  phase: Pending
  qosClass: Burstable
  startTime: "2019-03-28T17:51:02Z"

I don't see anything obvious in any of this that would pinpoint what's going on, and I've already rebooted the server just to see if that might help. Any thoughts? Why won't my containers start?

-- soapergem
kubernetes
kubernetes-deployment
kubernetes-helm
kubernetes-pod
kubernetes-service

2 Answers

4/3/2019

You can use kubectl's get event command. This will show you the events for your pod.

To filter for a specific pod, you can use a field selector:

kubectl get event --namespace abc-namespace --field-selector involvedObject.name=my-pod-zl6m6
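
Applied to the pod in this question (namespace and name taken from the describe output above), that would be:

kubectl get event --namespace default --field-selector involvedObject.name=postgres-postgresql-0

Note that events are only kept for an hour by default, so if the pod has been stuck for a long time you may need to delete and recreate it to see fresh events.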

-- Abhishek Soni
Source: StackOverflow

3/29/2019

It looks like your initContainer is stuck in the PodInitializing state. The most likely scenario is that your PVCs are not ready. I recommend you describe your data-postgres-postgresql-0 PVC to make sure that the volume has actually been provisioned and the claim is Bound. Your NFS provisioner may be working in general, but that specific PV/PVC may not have been created due to an error. I have run into a similar phenomenon with the EFS provisioner on AWS.
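
For example, using the claim name from the describe output in the question (adjust the namespace if the chart was installed elsewhere):

kubectl get pvc data-postgres-postgresql-0 --namespace default
kubectl describe pvc data-postgres-postgresql-0 --namespace default
kubectl get pv

If the claim is stuck in Pending rather than Bound, the Events section of the describe output should say why the provisioner couldn't satisfy it.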

Hope this helps!

-- Frank Yucheng Gu
Source: StackOverflow