I have a CronJob that runs every 30 minutes and deploys a StatefulSet with 10 replicas. Once the work is done, the StatefulSet is deleted in 5 minutes, which deletes all of its pods. However, sometimes pods remain in the Terminating state and cannot be deleted, possibly because the node they are scheduled on goes into the "NotReady" state. That's OK.
The strange behaviour I observed is this: say I have a StatefulSet named `abc` with `10` replicas. If the pod `abc-3` is stuck in the Terminating state and I deploy the StatefulSet again, only the pods `abc-0`, `abc-1`, and `abc-2` are deployed, and nothing beyond that. I see this whenever a pod with index `x` is stuck in the Terminating state: successive deployments of the StatefulSet do not deploy the pods from index `x+1` up to the end for as long as that pod remains in the Terminating state.
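To make the pattern concrete, here is a small Python sketch of the symptom. It is only a toy model of what I observe, not the StatefulSet controller's actual logic; the function name `pods_deployed` and its parameters are made up for illustration.

```python
from typing import List, Optional


def pods_deployed(name: str, replicas: int, stuck_index: Optional[int]) -> List[str]:
    """Return the pod names that come up on redeploy, per the observed behaviour.

    stuck_index is the ordinal of the old pod left in the Terminating state
    (None if no pod is stuck). Hypothetical helper, for illustration only.
    """
    # Observed: pods appear in ordinal order, and creation stops once it
    # reaches the ordinal whose old pod is still Terminating.
    limit = replicas if stuck_index is None else min(stuck_index, replicas)
    return [f"{name}-{i}" for i in range(limit)]


print(pods_deployed("abc", 10, 3))  # ['abc-0', 'abc-1', 'abc-2']
```

With no stuck pod (`stuck_index=None`) the sketch returns all 10 names, which matches a normal deployment.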
Steps to reproduce:
1. Deploy a StatefulSet with 10 replicas.
2. Delete the StatefulSet.
3. Let one pod remain in the Terminating state (for example, by taking one node down, or any other way). Note the pod's index number, say `x`.
4. Deploy the StatefulSet again.
5. Check whether it deploys the pods beyond `x`.
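The check in the last step can be scripted. The sketch below is a hypothetical helper (`missing_ordinals` is not part of any tool) that takes the text of a pod listing and reports which ordinals have no Running pod. It assumes the default `kubectl get pods` column layout, with the pod name in the first column and the status in the third; the sample listing is made up to mirror the `abc-3` example and is not real output.

```python
import re
from typing import List


def missing_ordinals(listing: str, name: str, replicas: int) -> List[int]:
    """Return ordinals in [0, replicas) that have no Running pod in the listing."""
    pattern = re.compile(rf"^{re.escape(name)}-(\d+)$")
    running = set()
    for line in listing.splitlines():
        cols = line.split()
        if len(cols) < 3:
            continue  # skip blank or malformed lines
        match = pattern.match(cols[0])  # header line "NAME ..." does not match
        if match and cols[2] == "Running":
            running.add(int(match.group(1)))
    return [i for i in range(replicas) if i not in running]


# Made-up sample in the assumed column layout: NAME READY STATUS RESTARTS AGE
sample = """\
NAME    READY   STATUS        RESTARTS   AGE
abc-0   1/1     Running       0          2m
abc-1   1/1     Running       0          2m
abc-2   1/1     Running       0          1m
abc-3   1/1     Terminating   0          40m
"""

print(missing_ordinals(sample, "abc", 10))  # [3, 4, 5, 6, 7, 8, 9]
```

If the list of missing ordinals starts at the stuck index and runs to the last replica, the behaviour described above has been reproduced.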