I was running Kafka Connect on two EC2 machines. Regardless of the number of tasks, both machines were always up and running, so they were underused. I recently migrated Kafka Connect to Kubernetes and got much better CPU/memory efficiency.
The problem arises when Kubernetes scales down: the pods are not shut down gracefully.
E.g., suppose there are 2 pods, p1 and p2. p1 is running 3 tasks (t1, t2, t3) and p2 is running 2 tasks (t4, t5). Here t5 is a task for a source connector that brings data from Postgres to Kafka.
When a pod vanishes during downscaling, the tasks that were running on it are rebalanced onto the other pods. Suppose pod p2 vanishes.
After task rebalancing, the new state of the cluster is: p1 is running 5 tasks (t1, t2, t3, t4_new, t5_new).
But the logs for my source connector say that some other task (presumably the old t5 from the removed pod) is still running and accessing the Postgres data.
How can I make sure that whenever a pod is scaled down, it happens gracefully, in the sense that all tasks running on that pod are stopped first?
Just increase the shutdown timeout "task.shutdown.graceful.timeout.ms" in the Kafka Connect worker config. Also use a preStop hook on the pod and run a sleep command in it, so that the kubelet waits for the hook to complete and only sends SIGTERM after that.
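As a rough sketch of how those two settings could fit together in a Deployment's pod template: the container name, image, and all numeric values below are hypothetical, and the `CONNECT_TASK_SHUTDOWN_GRACEFUL_TIMEOUT_MS` variable assumes an image that maps `CONNECT_*` environment variables to worker properties (as the Confluent images do). With a different image, set `task.shutdown.graceful.timeout.ms` in the worker properties file instead.

```yaml
# Pod template fragment; all names and values are illustrative assumptions.
spec:
  template:
    spec:
      # Upper bound for the whole shutdown: the preStop hook and the
      # time after SIGTERM both count against this period.
      terminationGracePeriodSeconds: 120
      containers:
        - name: kafka-connect                        # hypothetical name
          image: my-registry/kafka-connect:latest    # hypothetical image
          env:
            # Assumes the image translates CONNECT_* variables into
            # worker properties (task.shutdown.graceful.timeout.ms).
            - name: CONNECT_TASK_SHUTDOWN_GRACEFUL_TIMEOUT_MS
              value: "60000"
          lifecycle:
            preStop:
              exec:
                # The kubelet runs this to completion before sending SIGTERM.
                command: ["sh", "-c", "sleep 20"]
```

The grace period is chosen here to be longer than the sleep plus the task shutdown timeout, since the kubelet kills the container once the grace period expires regardless of what is still running.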
It could be that the default grace period is not enough for your application to finish its tasks after it receives the SIGTERM signal.
The SIGTERM signal is sent to the main process in the container. Once the signal is received, the container should start a graceful shutdown of the running application and exit.
There is a very good explanation of the flow in the official Kubernetes documentation, under Termination of Pods.
You could try extending terminationGracePeriodSeconds in your Deployment to see if that helps (the default is 30 seconds):
    spec:
      replicas:
      template:
        spec:
          containers:
            - name: test
              image: ...
          terminationGracePeriodSeconds: 60
The other way is to use a preStop hook. A preStop hook is executed immediately before a container is terminated: when the container needs to be terminated, the kubelet runs the preStop hook and only then sends SIGTERM to the process. This can be used to initiate a graceful shutdown of the container.
It can also be used to perform other operations before shutdown without having to implement them in the app itself.
This is a simple example of how it works (an HTTP GET request will be sent to `http:///shutdown`):
    lifecycle:
      preStop:
        httpGet:
          port: 80
          path: shutdown
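To show where the hook sits in a manifest, here is a minimal sketch that places the `lifecycle` block under the container and `terminationGracePeriodSeconds` at the pod level. The image name is hypothetical, and it assumes the application actually serves a shutdown endpoint on port 80; if it does not, an `exec` hook would be needed instead.

```yaml
# Illustrative Deployment fragment; image and values are assumptions.
spec:
  template:
    spec:
      # Pod-level setting, sibling of "containers".
      terminationGracePeriodSeconds: 60
      containers:
        - name: test
          image: my-registry/my-app:latest   # hypothetical image
          # Container-level setting: each container has its own hooks.
          lifecycle:
            preStop:
              httpGet:
                port: 80
                path: shutdown
```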
See also the Kubernetes documentation on Container hooks. Let me know if this was helpful.