Is there a way to know why Kubernetes is terminating pods? If I go to Logging in the Google console, the only message I can find related to this event is:

shutting down, got signal: Terminated

Also, pods in the `Terminating` status are never actually terminated; a few of them have been in this status for more than 24 hours now. I'm not using `livenessProbe` or `readinessProbe`, and I am using `terminationGracePeriodSeconds: 30`.

EDIT: added the output of `kubectl describe pod <podname>` for a pod that has been in the `Terminating` status for 9 hours as of now:
Name:           storeassets-5383k
Namespace:      default
Node:           gke-recommendation-engin-default-pool-c9b136a8-0qms/10.132.0.85
Start Time:     Sat, 11 Mar 2017 06:27:32 +0000
Labels:         app=storeassets
                deployment=ab08dc44070ffbbceb69ff6a5d99ae61
                version=v1
Status:         Terminating (expires Tue, 14 Mar 2017 01:30:48 +0000)
Termination Grace Period:       30s
Reason:         NodeLost
Message:        Node gke-recommendation-engin-default-pool-c9b136a8-0qms which was running pod storeassets-5383k is unresponsive
IP:             10.60.3.7
Controllers:    ReplicationController/storeassets
Containers:
  storeassets:
    Container ID:       docker://7b38f1de0321de4a5f2b484f5e2263164a32e9019b275d25d8823de93fb52c30
    Image:              eu.gcr.io/<project-name>/recommendation-content-realtime
    Image ID:           docker://sha256:9e8cf1b743f94f365745a011702a4ae1c2e636ceaaec4dd8d36fef6f787aefe7
    Port:
    Command:
      python
      -m
      realtimecontent.storeassets
    Requests:
      cpu:              100m
    State:              Running
      Started:          Sat, 11 Mar 2017 06:27:33 +0000
    Ready:              True
    Restart Count:      0
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qwfs4 (ro)
    Environment Variables:
      RECOMMENDATION_PROJECT:           <project-name>
      RECOMMENDATION_BIGTABLE_ID:       recommendation-engine
      GOOGLE_APPLICATION_CREDENTIALS:   recommendation-engine-credentials.json
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  default-token-qwfs4:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-qwfs4
QoS Class:      Burstable
Tolerations:    <none>
No events.
As for why the pods are getting terminated, it is most likely because your image/container is exiting with a successful status. Try tailing the pod's logs until it exits; you may be able to see the reason there.
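The log-tailing suggestion above can be done with kubectl directly. A minimal sketch, using the pod name from the `kubectl describe` output in the question (adjust to whichever pod you are investigating); these commands only read state, so they are safe to run against a live cluster:

```shell
# Stream the container's logs until the process exits
kubectl logs -f storeassets-5383k

# Once the container has exited, show the logs from that previous
# (terminated) container instance
kubectl logs --previous storeassets-5383k

# Inspect the last recorded container state, including the exit code
# and termination reason, straight from the pod's status
kubectl get pod storeassets-5383k \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
```

If the `jsonpath` query shows `exitCode: 0`, the process finished normally and Kubernetes is simply replacing it; a non-zero exit code points at a crash inside the container instead.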