Pod shows as "Terminating" after node is shut down

6/2/2020

There was a pod named n404-neo4j-core-1 running on k8s-slave2. After k8s-slave2 was turned off, the pod got stuck in the Terminating state.

I was expecting the pod to be deleted and a new pod to be created on another node. If this problem is not resolved, the Neo4j cluster cannot stay highly available.

kubectl get pods -o wide
    NAME                                     READY   STATUS        RESTARTS   AGE     IP          NODE         NOMINATED NODE   READINESS GATES
    n404-neo4j-core-0                        1/1     Running       0          3d19h   ***   k8s-node1    <none>           <none>
    n404-neo4j-core-1                        1/1     Terminating   0          78m     ***   k8s-slave2   <none>           <none>

kubectl describe pod n404-neo4j-core-1

Name:                      n404-neo4j-core-1
Namespace:                 default
Priority:                  0
PriorityClassName:         <none>
Node:                      k8s-slave2/10.176.6.67
Start Time:                Mon, 01 Jun 2020 23:53:13 -0700
Labels:                    app.kubernetes.io/component=core
                           app.kubernetes.io/instance=n404
                           app.kubernetes.io/managed-by=Helm
                           app.kubernetes.io/name=neo4j
                           controller-revision-hash=n404-neo4j-core-67484bd88
                           helm.sh/chart=neo4j-4.0.4-1
                           statefulset.kubernetes.io/pod-name=n404-neo4j-core-1
Annotations:               <none>
Status:                    Terminating (lasts 21m)
Termination Grace Period:  30s
IP:                        10.36.0.1
Controlled By:             StatefulSet/n404-neo4j-core
Containers:
  n404-neo4j:
    Container ID:  docker://a045d7747678ca62734800d153d01f634b9972b527289541d357cbc27456bf7b
    Image:         neo4j:4.0.4-enterprise
    Image ID:      docker-pullable://neo4j@sha256:714d83e56a5db61eb44d65c114720f8cb94b06cd044669e16957aac1bd1b5c34
    Ports:         5000/TCP, 7000/TCP, 6000/TCP, 7474/TCP, 7687/TCP, 3637/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      /bin/bash
      -c
      export core_idx=$(hostname | sed 's|.*-||')
      
      # Processes key configuration elements and exports env vars we need.
      . /helm-init/init.sh
      
      # We advertise the discovery-lb addresses (see discovery-lb.yaml) because
      # it is for internal cluster comms and is limited to private ports.
      export DISCOVERY_HOST="discovery-n404-neo4j-${core_idx}.default.svc.cluster.local"
      export NEO4J_causal__clustering_discovery__advertised__address="$DISCOVERY_HOST:5000"
      export NEO4J_causal__clustering_transaction__advertised__address="$DISCOVERY_HOST:6000"
      export NEO4J_causal__clustering_raft__advertised__address="$DISCOVERY_HOST:7000"
      
      echo "Starting Neo4j CORE $core_idx on $HOST"
      exec /docker-entrypoint.sh "neo4j"
      
    State:          Running
      Started:      Mon, 01 Jun 2020 23:53:14 -0700
    Ready:          True
    Restart Count:  0
    Liveness:       tcp-socket :7687 delay=300s timeout=2s period=10s #success=1 #failure=3
    Readiness:      tcp-socket :7687 delay=120s timeout=2s period=10s #success=1 #failure=3
    Environment Variables from:
      n404-neo4j-common-config  ConfigMap  Optional: false
      n404-neo4j-core-config    ConfigMap  Optional: false
    Environment:
      NEO4J_SECRETS_PASSWORD:  <set to the key 'neo4j-password' in secret 'n404-neo4j-secrets'>  Optional: false
    Mounts:
      /data from datadir (rw)
      /helm-init from init-script (rw)
      /plugins from plugins (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from n404-neo4j-sa-token-jp7g9 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  datadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  datadir-n404-neo4j-core-1
    ReadOnly:   false
  init-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      n404-init-script
    Optional:  false
  plugins:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  n404-neo4j-sa-token-jp7g9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  n404-neo4j-sa-token-jp7g9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  svc=neo4j
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
-- Helen Yao
kubernetes
neo4j

3 Answers

6/2/2020

You should not take down a Kubernetes node all of a sudden. If you do, you'll end up with strange scenarios like this one.

First, cordon the node. This tells the scheduler that the node is no longer available for scheduling.

kubectl cordon <node>

Then, drain the node. This evicts the running pods so that their controllers recreate them on other nodes.

kubectl drain <node>
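
Depending on what is running on the node, drain may refuse to proceed unless you allow it to skip DaemonSet-managed pods and to delete data in emptyDir volumes (such as the plugins volume in this chart). For example (on newer kubectl versions the last flag is spelled --delete-emptydir-data):

kubectl drain <node> --ignore-daemonsets --delete-local-data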

Now, you're safe to remove the node from the cluster.
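
Removing the node from the cluster means deleting its Node object, for example:

kubectl delete node <node>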

-- hariK
Source: StackOverflow

6/3/2020

This is the so-called 'at most one' semantics in Kubernetes; please check this link: https://v1-16.docs.kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/

Copied from the link: StatefulSet ensures that, at any time, there is at most one Pod with a given identity running in a cluster. This is referred to as at most one semantics provided by a StatefulSet.
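
That page also shows how to force-delete a stuck StatefulSet pod once you have confirmed the node is really down (doing it while the node might come back risks two pods running with the same identity). For the pod in the question it would be something like:

kubectl delete pod n404-neo4j-core-1 --grace-period=0 --force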

-- xinchen
Source: StackOverflow

6/2/2020

From the docs here

Kubernetes (versions 1.5 or newer) will not delete Pods just because a Node is unreachable. The Pods running on an unreachable Node enter the ‘Terminating’ or ‘Unknown’ state after a timeout. Pods may also enter these states when the user attempts graceful deletion of a Pod on an unreachable Node. The only ways in which a Pod in such a state can be removed from the apiserver are as follows:

  1. The Node object is deleted (either by you, or by the Node Controller).
  2. The kubelet on the unresponsive Node starts responding, kills the Pod and removes the entry from the apiserver.
  3. Force deletion of the Pod by the user.

The recommended best practice is to use the first or second approach. If a Node is confirmed to be dead (e.g. permanently disconnected from the network, powered down, etc), then delete the Node object. If the Node is suffering from a network partition, then try to resolve this or wait for it to resolve. When the partition heals, the kubelet will complete the deletion of the Pod and free up its name in the apiserver. Normally, the system completes the deletion once the Pod is no longer running on a Node, or the Node is deleted by an administrator. You may override this by force deleting the Pod.
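
Applied to this question, options 1 and 3 would look roughly like this, using the node and pod names from the question:

# Option 1: the node is confirmed dead, so delete its Node object
kubectl delete node k8s-slave2

# Option 3: force-delete the stuck pod
kubectl delete pod n404-neo4j-core-1 --grace-period=0 --force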

-- Arghya Sadhu
Source: StackOverflow