I have a scenario where I need to 'prepare' Kubernetes for terminating a container, while still allowing that container to serve some requests until the shutdown happens.
For example, let's assume that there are three methods: StartAction, ProcessAction, EndAction. I want to prevent clients from invoking StartAction when a container is about to be shut down. However, they should still be able to use ProcessAction and EndAction on that same container (after all Actions have been completed, the container will shut down).
I was thinking that this is some sort of 'dual' readiness probe, where I basically want to indicate a 'not ready' status but continue to serve requests for already started Actions.
I know that there is a PreStop hook, but I am not confident that it serves this need: according to the documentation, I suspect that by the time PreStop runs the pod has already been taken off the load balancer:
- (simultaneous with 3) Pod is removed from endpoints list for service, and are no longer considered part of the set of running Pods for replication controllers. Pods that shutdown slowly cannot continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.
(https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods).
Assuming that I must rely on stickiness and must continue serving requests for Actions on containers where those actions were started, is there some recommended practice?
I think you can just implement two endpoints in your application:
- a custom readiness probe
- a shutdown preparation endpoint
So, for a graceful shutdown, I think you should first call the "shutdown preparation endpoint", which makes the "custom readiness probe" return an error. Kubernetes will then take the Pod out of the Service load balancer (no new clients will arrive), but existing TCP connections will be kept (existing clients can continue to operate). Once you see in some custom metrics (which your service should provide) that all Actions for those clients are done, you should shut down the containers using standard Kubernetes actions. All of these steps should probably be automated using the Kubernetes and application APIs.
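As a rough illustration of the two endpoints, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the example rather than anything Kubernetes prescribes: the paths `/ready`, `/prepare-shutdown`, `/start`, `/end` and `/active`, the `DrainState` class, and port 8080 are all hypothetical names. The Pod's readiness probe would be configured as an HTTP GET against `/ready`.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DrainState:
    """Tracks whether this instance is draining and how many Actions are open."""

    def __init__(self):
        self._lock = threading.Lock()
        self._draining = False
        self._active = 0

    def prepare_shutdown(self):
        # After this call the readiness check fails and StartAction is refused.
        with self._lock:
            self._draining = True

    def is_ready(self):
        with self._lock:
            return not self._draining

    def start_action(self):
        """Return True if a new Action was accepted, False while draining."""
        with self._lock:
            if self._draining:
                return False
            self._active += 1
            return True

    def end_action(self):
        with self._lock:
            self._active = max(0, self._active - 1)

    def active_actions(self):
        with self._lock:
            return self._active

    def safe_to_stop(self):
        # True only once draining has begun and every started Action has ended.
        with self._lock:
            return self._draining and self._active == 0


STATE = DrainState()


class Handler(BaseHTTPRequestHandler):
    def _reply(self, status, body):
        data = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        if self.path == "/ready":  # target of the readiness probe
            ok = STATE.is_ready()
            self._reply(200 if ok else 503, "ready\n" if ok else "draining\n")
        elif self.path == "/active":  # simple custom metric: open Actions
            self._reply(200, f"{STATE.active_actions()}\n")
        else:
            self._reply(404, "not found\n")

    def do_POST(self):
        if self.path == "/prepare-shutdown":
            STATE.prepare_shutdown()
            self._reply(200, "draining\n")
        elif self.path == "/start":  # stands in for StartAction
            ok = STATE.start_action()
            self._reply(200 if ok else 503, "started\n" if ok else "draining\n")
        elif self.path == "/end":  # stands in for EndAction
            STATE.end_action()
            self._reply(200, "ended\n")
        else:
            self._reply(404, "not found\n")


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for this sketch.
    ThreadingHTTPServer(("", 8080), Handler).serve_forever()
```

With something like this in place, whatever automates the shutdown could POST to `/prepare-shutdown`, poll `/active` until it reports 0, and only then delete the Pod. Note that once the readiness probe fails, new connections made through the Service will no longer be routed to this Pod, so the approach relies on clients keeping their existing connections open for ProcessAction and EndAction.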