I have a site hosted in Kubernetes that always returns an HTTP 200 response, even when it fails to pull its configuration values from a configuration repository hosted elsewhere. The problem occurs after OS patching: the nodes hosting the containers reboot while the configuration repo nodes are still rebooting. The container nodes come up first and the containers start, but they fail to get the configuration values, and the site returns 200 with a blank page. As a result, a liveness probe using an HTTP GET doesn't detect a problem and the container is never restarted, so it never picks up the config values even after the config repo node is back up. Is there a custom liveness probe I can write that keeps restarting the container until it successfully gets the config once the config repo node comes back online?
I tried setting up a readiness probe, but it behaves the same way, because the site continues to respond with a 200 code even when it can't launch due to the config being absent.
Yes, you can define a command-based liveness probe, but you have to implement the check yourself.
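As a rough sketch of what such a command could look like, the helper below treats a blank response body as a failure, since the question says the broken site answers 200 with a blank page. The function name `body_not_blank` is made up for illustration, and it assumes a POSIX shell with `grep` is available inside the container image.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if stdin contains at least one
# non-whitespace character, i.e. the page body is not blank.
body_not_blank() {
    grep -q '[^[:space:]]'
}
```

A probe command could then pipe the page into it, for example `curl -fsS http://localhost:8080/ | body_not_blank`, where the URL and port are placeholders for wherever the site listens inside the container. If `curl` fails, the body is empty and the check fails as well.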
As @Akın Özer already mentioned, you can use a liveness command. For example, you can cat the configuration file that you are loading, which might look like the following:
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/config-file-repo
      initialDelaySeconds: 5
      periodSeconds: 5
The periodSeconds field specifies that the kubelet should perform a liveness probe every 5 seconds. The initialDelaySeconds field tells the kubelet that it should wait 5 seconds before performing the first probe.
You can use this with Lifecycle Hooks. To be more exact:
PostStart
This hook executes immediately after a container is created. However, there is no guarantee that the hook will execute before the container ENTRYPOINT. No parameters are passed to the handler.
In the hook, you can check whether the container is able to reach the configuration repo and, if it is, create an empty file /tmp/config-file-repo. This way your liveness probe will know whether the container should be restarted or not.
An example for postStart might be:
    lifecycle:
      postStart:
        exec:
          command:
            - "sh"
            - "-c"
            - >
              if curl --fail -X GET http://configuration_repo_nodes ;then
                touch /tmp/config-file-repo;
              else
                sleep 60;
              fi
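This checks whether configuration_repo_nodes is accessible and, if so, creates the file /tmp/config-file-repo. If it is inaccessible, the hook sleeps for 60 seconds and exits without creating the file, so the cat in the liveness probe fails and the kubelet restarts the container. You can replace the sleep with whatever handling suits you.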
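One possible replacement for the single check is a small retry loop, so a repo that is only a few seconds late does not cost a full container restart. This is only a sketch: the function name `mark_config_ready` and its argument order are invented here, and the check command is passed in so that the same helper works with curl or anything else.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kubernetes):
#   mark_config_ready MARKER ATTEMPTS DELAY CHECK_CMD [ARGS...]
# Runs CHECK_CMD up to ATTEMPTS times, sleeping DELAY seconds between
# failed tries, and creates MARKER as soon as one try succeeds.
mark_config_ready() {
    marker=$1
    attempts=$2
    delay=$3
    shift 3
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            touch "$marker"
            return 0
        fi
        # No need to wait after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    # Make sure no stale marker survives, so the probe's `cat` fails.
    rm -f "$marker"
    return 1
}
```

Inside the postStart script it could be called as `mark_config_ready /tmp/config-file-repo 5 10 curl --fail http://configuration_repo_nodes`, with the attempt count and delay chosen to taste. Keep in mind that a postStart handler that exits non-zero causes the container to be killed, so append `|| true` to the call if you would rather leave the restart decision to the liveness probe.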