I have a Kafka consumer with minimal customizations to the default Spring Kafka configuration. During load tests the consumer saturates itself so completely that it cannot answer the Kubernetes liveness probe within the probe's 10-second timeout.
Kubernetes then assumes the pod is dead (it's actually working fine and processing messages) and begins an endless loop of killing and restarting pods until the batch of records is consumed. It's so bad that Kubernetes doesn't even attempt to scale up the number of pods, even though it could scale to about 20x its minimum.
What are my options? Most of the research I have done on this subject turns up seemingly contradictory information. Spring recommends keeping the management endpoints on the same web server (which makes complete sense). If I look up rate limiting between my consumer and Kafka, most of the answers seem puzzled about why you would want to do that, and there is not a lot of built-in support for this situation, which concerns me.
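To make concrete what I mean by rate limiting, one form it could take is capping how much each poll pulls in via standard Kafka consumer properties. This is only a minimal sketch: the property keys are regular Kafka consumer settings, but the class name `ThrottledConsumerProps` and every value are hypothetical and not taken from my actual setup, and I don't know whether this alone would keep the probe responsive.

```java
import java.util.Properties;

public class ThrottledConsumerProps {

    // Illustrative values only; these are assumptions, not my real configuration.
    public static Properties build() {
        Properties props = new Properties();
        // Cap how many records a single poll() returns (Kafka default: 500).
        props.setProperty("max.poll.records", "50");
        // Cap the bytes fetched per partition per request (Kafka default: 1048576).
        props.setProperty("max.partition.fetch.bytes", "262144");
        // Allow more time between polls before the consumer is considered
        // failed and its partitions are rebalanced (Kafka default: 300000).
        props.setProperty("max.poll.interval.ms", "600000");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting settings for inspection.
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These entries would be merged into whatever consumer properties the Spring Kafka consumer factory already receives.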
Am I missing a key point somewhere?