Kubernetes probes fail on Tomcat

4/11/2018

I am running a Java webapp on Tomcat in a Docker image on a Kubernetes cluster. The service runs fine, and I am now trying to set up the liveness and readiness probes.

I haven't found documentation on best practices for Tomcat on Kubernetes specifically, but following the Kubernetes documentation, I considered an HTTP GET probe like this to be appropriate:

    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 20
      periodSeconds: 20
      successThreshold: 1
      timeoutSeconds: 3

When I access the Tomcat base URL, it returns a welcome page with status code 200 (OK). However, the liveness probe fails. This is from the pod description:

    Events:
      Type     Reason                 Age               From               Message
      ----     ------                 ----              ----               -------
      Normal   Scheduled              3m                default-scheduler  Successfully assigned xxxxx-service-7f8f76988-lkxdf to kube-03
      Normal   SuccessfulMountVolume  3m                kubelet, kube-03   MountVolume.SetUp succeeded for volume "default-token-b6tps"
      Normal   Created                1m (x3 over 3m)   kubelet, kube-03   Created container
      Normal   Started                1m (x3 over 3m)   kubelet, kube-03   Started container
      Warning  Unhealthy              42s (x7 over 3m)  kubelet, kube-03   Liveness probe failed: Get http://10.233.96.19:8080/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
      Normal   Pulling                40s (x4 over 3m)  kubelet, kube-03   pulling image "xxxxx:4999/xxxxx/xxxxxservice:v1.3.0"
      Normal   Pulled                 40s (x4 over 3m)  kubelet, kube-03   Successfully pulled image "xxxxx:4999/xxxxx/xxxxxservice:v1.3.0"
      Normal   Killing                40s (x3 over 2m)  kubelet, kube-03   Killing container with id docker://xxxxx-service:Container failed liveness probe.. Container will be killed and recreated.

The same happens with the readiness probe when it is set up in the same way. However, when I deactivate the probes, the service runs fine: I can access both the Tomcat welcome page at / and the actual webapp.

My question is hence: how should I correctly set up Kubernetes liveness/readiness probes for a Tomcat webapp? Why does the simple HTTP GET approach fail?

Related issues (as reported in other questions) seem to be caused by startup times longer than the initialDelaySeconds value, which is also what the error message suggests. However, in this case Tomcat and the webapp are accessible within a few seconds, so startup time is not the issue.

Here are the deployment specs:

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: xxxxx-service
      namespace: xxxxx
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: xxxxx-service
        spec:
          imagePullSecrets:
          - name: regsecret
          containers:
          - image: xxxxxservice:v1.3.0
            imagePullPolicy: Always
            name: xxxxx-service
            ports:
            - containerPort: 8080
              protocol: TCP
            resources:
              limits:
                cpu: "0.2"
                memory: 4Gi

-- Carsten
kubernetes
tomcat

1 Answer

4/11/2018

Your configuration looks good and should work, but in the events I see:

Get http://10.233.96.19:8080/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

That means kubelet connected to your pod, but the request was cancelled because the timeout expired while it was still waiting for the response headers.

So I think the problem is in your application. Possible causes:

  1. Incorrect default routing rules. By default, kubelet addresses the request to the pod IP and does not send your application's hostname in the Host header, so the application may not know what to do with the request. Try setting a Host header like this:

    livenessProbe:
      httpGet:
        httpHeaders:
        - name: Host
          value: <desired.host.of.application.com>

  2. On startup, the application may respond slowly, in which case a timeout of 3 seconds may not be enough. Try increasing the timeoutSeconds value.
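
As a rough sketch of the second suggestion, the probe from the question could be given a longer timeout. The value of 10 seconds is only an assumption for illustration, not a measured figure; tune it against the response times you actually observe from inside the cluster.

    livenessProbe:
      httpGet:
        path: /
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 20
      periodSeconds: 20
      # Assumed value for illustration; the question used 3.
      timeoutSeconds: 10
      successThreshold: 1
      failureThreshold: 3

If the probe passes with the longer timeout, slow responses rather than routing are the likely cause.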

-- Anton Kostenko
Source: StackOverflow