This sounds weird, and we have no idea what is happening here. We use the Bernard queue to poll SQS for messages; as soon as a message arrives on the queue, a worker picks up the job and spawns a child process. This worked fine while we were on ECS. After moving to Kubernetes, the workers hang at random after 12-24 hours.
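To make the setup concrete, here is a minimal sketch of the poll-then-spawn pattern described above. It is not our actual worker code: it is written in Python, an in-memory `queue.Queue` stands in for SQS and Bernard, and `run_job` and `drain` are hypothetical names used only for illustration.

```python
import queue
import subprocess
import sys


def run_job(payload: str) -> int:
    """Spawn one child process for a job and return its exit code."""
    # The child here is a trivial Python one-liner that echoes the payload;
    # it is a placeholder for whatever the real job does.
    proc = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", payload],
        capture_output=True,
        text=True,
    )
    return proc.returncode


def drain(jobs: "queue.Queue[str]") -> list:
    """Take every waiting message off the queue and run one child per message."""
    exit_codes = []
    while True:
        try:
            payload = jobs.get_nowait()  # stand-in for polling SQS
        except queue.Empty:
            break
        exit_codes.append(run_job(payload))
    return exit_codes
```

In the real system the loop never ends; it keeps polling, and the hang shows up somewhere in that poll, spawn, wait cycle.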
We tried allocating 1 GB of memory to the workers, with no luck.
We also suspected that the child processes were turning into zombies, so we tried https://github.com/krallin/tini as the init process.
Still no luck.
What is the best way to tackle this?