I have a k8s cluster, and our service is queue-based: each pod subscribes to an event queue, fetches events, and runs tasks. For this kind of service, how should I define the k8s liveness probe and readiness probe?
Following is a very brief introduction to these probes:
The liveness probe lets Kubernetes know whether a workload is healthy. It can be a shell command executed in your container, or a simple TCP/HTTP request that should respond positively.
If the liveness check keeps failing (how many failures are tolerated and how long each check may take are set in the pod config), Kubernetes restarts the container.
So, if your workload runs time-consuming processes, give the liveness probe enough time so that your pod is not restarted unduly.
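As a rough sketch of what "enough time" can look like, the fragment below sets the timing fields on a liveness probe. The check itself (`cat /tmp/healthy`) and all the numbers are placeholders to adapt, not recommendations:

```yaml
# Fragment that goes under a container in the pod spec.
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy          # placeholder check; use whatever fits your worker
  initialDelaySeconds: 30   # wait this long after the container starts before the first check
  periodSeconds: 30         # run the check every 30 seconds
  timeoutSeconds: 5         # a single check counts as failed after 5 seconds (default is 1)
  failureThreshold: 4       # restart only after 4 consecutive failures (default is 3)
```

With these values the container is restarted only after about two minutes of consecutive failures, rather than on the first slow check.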
The readiness probe lets Kubernetes decide whether your workload is ready to receive traffic. Traffic is sent to your pod only while the readiness probe responds positively. So, if your workload needs more time to process a single request and needs other requests diverted to other replicas in the meantime, you might want to give the workloads a slightly longer readiness interval.
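A readiness probe with a longer interval might look roughly like the fragment below. The marker file `/tmp/ready` is hypothetical: it assumes your worker creates it once it is connected and able to take work, and removes it when it is not.

```yaml
# Fragment that goes under a container in the pod spec.
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/ready            # hypothetical marker file written by your worker
  periodSeconds: 20         # check every 20 seconds instead of the default 10
  failureThreshold: 3       # mark the pod not ready after 3 consecutive failures
  successThreshold: 1       # one success marks it ready again
```

Unlike a failed liveness probe, a failed readiness probe does not restart the container; the pod is only marked not ready.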
These probe parameters, combined with the number of replicas, help keep your application fast and healthy. It is important to understand what each probe covers and which parameters you can use to tune it.
Here are some reads:
I assume the problem is that your pod is a processing worker consuming queue messages, so it doesn't expose any port to check.
In that case, you can define livenessProbe and readinessProbe with a custom command. The following is an example from the documentation:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
Also, keep in mind how long your process takes to become live and ready, and adjust initialDelaySeconds and periodSeconds so that the pod is not killed before it is fully loaded.
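For a queue worker, one possible variation on the file-based check is a heartbeat file. This is only a sketch under some assumptions: your worker updates `/tmp/heartbeat` (a hypothetical path) on every pass of its polling loop, even when the queue is empty, and the image provides `sh`, `date`, and a `stat` that supports `-c %Y` (busybox and GNU coreutils do). The 120-second limit is a placeholder that should exceed your longest expected task.

```yaml
# Fragment that goes under a container in the pod spec.
livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    # Succeeds only if /tmp/heartbeat was modified less than 120 seconds ago.
    # A missing file makes the command fail, so the probe fails too.
    - 'test $(( $(date +%s) - $(stat -c %Y /tmp/heartbeat) )) -lt 120'
  initialDelaySeconds: 60   # give the worker time to start and write its first heartbeat
  periodSeconds: 30
  failureThreshold: 3
```

Compared with a plain `cat`, this should also catch a worker that is still running but stuck, since a stuck worker stops refreshing the file.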