kafka-2-1-4 stopped working on deployment server

10/7/2019

I am trying to understand why an existing application that uses kafka-2-1-4, is deployed on AWS, and was working well has stopped sending Kafka messages. While analysing the issue, I came across the error message below on the Kubernetes dashboard. From the error I understand that one of the three Kafka brokers has stopped/failed and the container is failing to restart [please correct me if my understanding is wrong, as I am new to Kafka]. I am trying to figure out why the restart of the stopped container failed and, in the first place, why it failed at all. What needs to be fixed?
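If it helps, this is roughly how I was planning to inspect the failed broker container next (the pod, namespace, and container names are taken from the error message below; I'm not sure this is the right way to dig deeper):

# Show recent events and the last termination state of the broker container
kubectl describe pod kafka-2-1-4-0 -n kafka

# Logs from the previous (crashed) instance of the broker container
kubectl logs kafka-2-1-4-0 -n kafka -c broker --previous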

I already tried googling the error but couldn't find anything specific. Restarting the Kafka server fixed it, but I would like to understand the underlying issue. Please feel free to ask if I have missed any required information.

Here is the error from the Kubernetes dashboard:

Liveness probe failed: CRITICAL: 48 partition(s) have the number of replicas in sync that is lower than the specified min ISR.
Exec lifecycle hook ([bash -c kill -s TERM 1; while $(kill -0 1 2>/dev/null); do sleep 1; done]) for Container "broker" in Pod "kafka-2-1-4-0_kafka(a2e66409-e3c9-11e9-9584-02169d70a228)" failed - error: command 'bash -c kill -s TERM 1; while $(kill -0 1 2>/dev/null); do sleep 1; done' exited with 137:, message: ""
Back-off restarting failed container
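My understanding of the "min ISR" part of the liveness probe is that some partitions currently have fewer in-sync replicas than the configured min.insync.replicas. Once the brokers are back, I assume something like the following would list any partitions that are still under-replicated (I'm guessing that the kafka-topics script is available inside the broker container; the exact script name and the ZooKeeper address are placeholders that may differ in this setup):

# List partitions whose in-sync replica count is below the full replica count
kubectl exec -n kafka kafka-2-1-4-0 -c broker -- \
  kafka-topics --describe --under-replicated-partitions --zookeeper <zookeeper-host>:2181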
-- Debi
apache-kafka
kubernetes

0 Answers