I've got microservices deployed on GKE with Helm v3; all apps/Helm releases have been running fine for months, but yesterday, for some reason, the pods were re-created:
kubectl get pods -l app=myapp
NAME                     READY   STATUS    RESTARTS   AGE
myapp-75cb966746-grjkj   1/1     Running   1          14h
myapp-75cb966746-gz7g7   1/1     Running   0          14h
myapp-75cb966746-nmzzx   1/1     Running   1          14h
helm3 history myapp shows the release was last updated 2 days ago (40+ hours), not yesterday, so I can rule out someone simply running helm3 upgrade ... It looks more like someone ran kubectl rollout restart deployment/myapp, but I'm not sure how to verify it. Any thoughts on how I can check why the pods were restarted? PS: the logs from kubectl logs deployment/myapp only go back 3 hours.
Just for reference, I'm not asking about kubectl logs -p myapp-75cb966746-grjkj; there is no problem with that. I want to know what happened to the 3 pods that were there 14 hours ago and were simply deleted/replaced, and how to check that.
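The only check I could think of for the rollout restart theory: as far as I know, kubectl rollout restart just stamps a kubectl.kubernetes.io/restartedAt annotation on the pod template, so its value should show when the command was last used (a sketch, untested):
kubectl get deployment myapp -o jsonpath='{.spec.template.metadata.annotations.kubectl\.kubernetes\.io/restartedAt}'
# or simply
kubectl get deployment myapp -o yaml | grep restartedAt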
Also, there are no events in the cluster:
MacBook-Pro% kubectl get events
No resources found in myns namespace.
As for describing the deployment, all it shows is that the deployment was created a few months ago:
CreationTimestamp: Thu, 22 Oct 2020 09:19:39 +0200
and that the last update was >40 hours ago:
lastUpdate: 2021-04-07 07:10:09.715630534 +0200 CEST m=+1.867748121
Here is the full describe output, in case it helps:
MacBook-Pro% kubectl describe deployment myapp
Name:                   myapp
Namespace:              myns
CreationTimestamp:      Thu, 22 Oct 2020 09:19:39 +0200
Labels:                 app=myapp
Annotations:            deployment.kubernetes.io/revision: 42
                        lastUpdate: 2021-04-07 07:10:09.715630534 +0200 CEST m=+1.867748121
                        meta.helm.sh/release-name: myapp
                        meta.helm.sh/release-namespace: myns
Selector:               app=myapp,env=myns
Replicas:               3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        5
RollingUpdateStrategy:  25% max unavailable, 1 max surge
Pod Template:
  Labels:       app=myapp
  Annotations:  kubectl.kubernetes.io/restartedAt: 2020-10-23T11:21:11+02:00
  Containers:
   myapp:
    Image:      xxx
    Port:       8080/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     1
      memory:  1G
    Requests:
      cpu:     1
      memory:  1G
    Liveness:   http-get http://:myappport/status delay=45s timeout=5s period=10s #success=1 #failure=3
    Readiness:  http-get http://:myappport/status delay=45s timeout=5s period=10s #success=1 #failure=3
    Environment Variables from:
      myapp-myns  Secret  Optional: false
    Environment:
      myenv:  myval
    Mounts:
      /some/path from myvol (ro)
  Volumes:
   myvol:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      myvol
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   myapp-75cb966746 (3/3 replicas created)
Events:          <none>
You can use
kubectl describe pod your_pod_name
where, under Containers.your_container_name.lastState, you get the time and reason why the container was last terminated (for example, due to an application error or due to being OOMKilled).
Doc reference:
kubectl explain pod.status.containerStatuses.lastState
KIND:     Pod
VERSION:  v1

RESOURCE: lastState <Object>

DESCRIPTION:
     Details about the container's last termination condition.

     ContainerState holds a possible state of container. Only one of its members
     may be specified. If none of them is specified, the default one is
     ContainerStateWaiting.

FIELDS:
   running      <Object>
     Details about a running container

   terminated   <Object>
     Details about a terminated container

   waiting      <Object>
     Details about a waiting container
Example from one of my containers, which terminated due to an error in the application:
Containers:
  my_container:
    Last State:  Terminated
      Reason:    Error
      Exit Code: 137
      Started:   Tue, 06 Apr 2021 16:28:57 +0300
      Finished:  Tue, 06 Apr 2021 16:32:07 +0300
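If you want the same lastState info for every pod of the deployment at once, a jsonpath query along these lines should work (the label selector is taken from the question; adjust as needed):
# print pod name and lastState of its containers, one pod per line
kubectl get pods -l app=myapp -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState}{"\n"}{end}'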
To get the previous logs of your container (from before the restart), you can use the --previous flag, like this:
kubectl logs your_pod_name --previous
I suggest you run kubectl describe deployment <deployment-name> and kubectl describe pod <pod-name>.
In addition, kubectl get events will show the events in the current namespace (add --all-namespaces for the whole cluster) and may help you understand what happened.
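For example, something like this lists recent events and filters for the objects in the question (standard kubectl flags; keep in mind events are only retained for a limited time, typically about an hour, so older ones may already be gone):
# all namespaces, newest last
kubectl get events --all-namespaces --sort-by=.lastTimestamp
# only events whose involved object is the deployment itself (names taken from the question)
kubectl get events -n myns --field-selector involvedObject.name=myapp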
First things first, I would check the nodes on which the Pods were running (see the commands at the end of this answer).
The most common reason for Pods to be re-created automatically is that the node or nodes the Pods were running on had a problem. If a node becomes NotReady, even for a short amount of time, Kubernetes will schedule new Pods on other nodes in order to match the desired state (number of replicas and so on). Old Pods on a NotReady node go into Terminating state and are forcibly terminated as soon as the node becomes Ready again (if they are still up and running).
This is described in detail in the documentation ( https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-lifetime ):
If a Node dies, the Pods scheduled to that node are scheduled for deletion after a timeout period. Pods do not, by themselves, self-heal. If a Pod is scheduled to a node that then fails, the Pod is deleted; likewise, a Pod won't survive an eviction due to a lack of resources or Node maintenance. Kubernetes uses a higher-level abstraction, called a controller, that handles the work of managing the relatively disposable Pod instances.
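A few commands to check that, sketched under the assumption that this is a GKE cluster as in the question (node names are placeholders and the exact log filter is an assumption; adjust to your setup):
# how long each node has been up and where it runs
kubectl get nodes -o wide
# node conditions and recent events (may be empty if the events already expired)
kubectl describe node <node-name>
# on GKE, node auto-repair / auto-upgrade show up as cluster operations;
# look for recent UPGRADE_NODES or AUTO_REPAIR_NODES entries
gcloud container operations list
# the Kubernetes API audit logs in Cloud Logging can show what deleted the old pods
gcloud logging read 'resource.type="k8s_cluster" AND protoPayload.methodName:"pods.delete"' --freshness=2d --limit=20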