I have a container running nginx that listens on port 443 of the pod IP. It runs fine by itself; however, if I specify a liveness probe, the probe fails with
5m54s Warning Unhealthy Pod Liveness probe failed: Get https://192.168.2.243:443/: EOF
Can someone please point out what I've done wrong? Thanks.
When it is running without the liveness probe:
root@ip-192-168-2-243:/etc/nginx# netstat -tupln | grep 443
tcp 0 0 192.168.2.243:1443 0.0.0.0:* LISTEN -
tcp 0 0 192.168.2.243:443 0.0.0.0:* LISTEN 7/nginx: master pro
root@ip-192-168-2-243:/# telnet 192.168.2.243 443
Trying 192.168.2.243...
Connected to 192.168.2.243.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
root@ip-192-168-2-243:/# curl https://192.168.2.243
curl: (77) error setting certificate verify locations:
CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
Probe declaration:
livenessProbe:
  initialDelaySeconds: 10
  timeoutSeconds: 4
  failureThreshold: 3
  httpGet:
    scheme: HTTPS
    port: 443
Nginx split client declaration:
split_clients "${remote_addr}AAA" $localips {
* 192.168.2.243;
}
Events:
skwok-mbp:kubernetes skwok$ kubectl get event -w
LAST SEEN TYPE REASON OBJECT MESSAGE
7s Normal SuccessfulDelete statefulset/mnsvr delete Pod mnsvr-0 in StatefulSet mnsvr successful
0s Normal Killing pod/mnsvr-0 Killing container with id docker://mnsvr-proxy:Need to kill Pod
0s Normal Killing pod/mnsvr-0 Killing container with id docker://mnsvr-node0:Need to kill Pod
0s Normal Killing pod/mnsvr-0 Killing container with id docker://mnsvr-node1:Need to kill Pod
0s Normal SuccessfulCreate statefulset/mnsvr create Pod mnsvr-0 in StatefulSet mnsvr successful
0s Normal Scheduled pod/mnsvr-0 Successfully assigned staging/mnsvr-0 to ip-192-168-2-243.us-west-2.compute.internal
0s Normal Pulled pod/mnsvr-0 Container image "171421899218.dkr.ecr.us-west-2.amazonaws.com/mnsvr-proxy:0.96" already present on machine
0s Normal Created pod/mnsvr-0 Created container
0s Normal Started pod/mnsvr-0 Started container
0s Normal Pulled pod/mnsvr-0 Container image "171421899218.dkr.ecr.us-west-2.amazonaws.com/mnsvr:1.1" already present on machine
0s Normal Created pod/mnsvr-0 Created container
0s Normal Started pod/mnsvr-0 Started container
0s Normal Pulled pod/mnsvr-0 Container image "171421899218.dkr.ecr.us-west-2.amazonaws.com/mnsvr:1.1" already present on machine
0s Normal Created pod/mnsvr-0 Created container
0s Normal Started pod/mnsvr-0 Started container
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Normal Killing pod/mnsvr-0 Killing container with id docker://mnsvr-proxy:Container failed liveness probe.. Container will be killed and recreated.
0s Normal Pulled pod/mnsvr-0 Container image "171421899218.dkr.ecr.us-west-2.amazonaws.com/mnsvr-proxy:0.96" already present on machine
0s Normal Created pod/mnsvr-0 Created container
0s Normal Started pod/mnsvr-0 Started container
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Normal Killing pod/mnsvr-0 Killing container with id docker://mnsvr-proxy:Container failed liveness probe.. Container will be killed and recreated.
0s Normal Pulled pod/mnsvr-0 Container image "171421899218.dkr.ecr.us-west-2.amazonaws.com/mnsvr-proxy:0.96" already present on machine
0s Normal Created pod/mnsvr-0 Created container
0s Normal Started pod/mnsvr-0 Started container
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Warning Unhealthy pod/mnsvr-0 Liveness probe failed: Get https://192.168.2.243:443/: EOF
0s Warning BackOff pod/mnsvr-0 Back-off restarting failed container
Kubernetes has two separate ways to track the health of a pod: one during deployment and one after. A liveness probe is what causes Kubernetes to replace a failed pod with a new one, but it has no effect during deployment of the app. Readiness probes, on the other hand, are what Kubernetes uses to determine whether the pod started successfully.
So in your case, since the container works fine on its own, you should also define a readinessProbe.
Sometimes, applications are temporarily unable to serve traffic. For example, an application might need to load large data or configuration files during startup, or depend on external services after startup. In such cases, you don’t want to kill the application, but you don’t want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.
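As a minimal sketch (assuming the same HTTPS endpoint as your livenessProbe; the timing values below are placeholders, not recommendations), a readinessProbe sits next to the livenessProbe in the same container spec:
readinessProbe:
  httpGet:
    scheme: HTTPS
    port: 443
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3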
The official Kubernetes documentation describing probes: kubernetes-probes.
Here is a useful article: kubernetes-liveness-and-readiness-probes.
I think the EOF is a symptom of a TLS handshake issue. I'm currently seeing the same.
Some versions of curl can produce a similar result. A workaround for curl seems to be to use --tls-max 1.2.
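As a quick check from inside the container (a sketch reusing the pod IP from the question; -k skips certificate verification because the CA bundle is missing in the image, and --tls-max requires curl 7.54 or newer):
# force the handshake down to TLS 1.2 and ignore the missing CA bundle
curl -vk --tls-max 1.2 https://192.168.2.243/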
My current suspicion is that the client (the probe) is trying to negotiate TLS 1.3 with the server but fails (probably due to ciphers). I'm trying to see if we can configure the k8s probes to use TLS 1.2 instead. Alternatively, we could turn off TLS 1.3 on the server side. In your case that's nginx; in my case, I have a Jetty 9.4 server with JDK 11.0.6.
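If you go the server-side route, disabling TLS 1.3 in nginx is a one-line change in the ssl server block (a sketch only; your real server block will have its own listen address and certificate directives):
server {
    listen 443 ssl;
    # restrict the handshake to TLS 1.2 so the probe never attempts TLS 1.3
    ssl_protocols TLSv1.2;
    # certificates, ciphers and the rest of the existing config stay as they are
}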
Another option might be to upgrade k8s. We seem to see this with a k8s v1.15 cluster but not with a k8s v1.16.2 cluster, though I'm not sure whether that's due to the k8s version or the underlying OS libraries (in my case CentOS 7).