We have an application deployed on Kubernetes. One of the services is an app running in the JVM.
Our application is faulty: it consumes too much memory. We are hitting the memory limit set in the ReplicationController's pod spec, which causes the pod to be restarted.
Is it a good idea to use the ReplicationController for this? Or would it be better to limit the memory on the JVM itself (setting it to something below the ReplicationController limit) and use something else inside the pod to restart our application?
If the JVM itself stopped with an OutOfMemoryError, I could analyze the heap dumps written by the JVM. Now I'm blind as to what occupied the memory.
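For reference, this is roughly what I had in mind for getting dumps before the container is killed (heap size, dump path and jar name are just placeholders for our setup):

```
# Hypothetical startup flags: cap the heap below the container limit and
# write a heap dump if the JVM itself runs out of memory.
java -Xmx512m \
     -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/dumps \
     -jar app.jar
```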
Thank you for your reply!
In fact this is not the responsibility of the Deployment / ReplicationController; restarts are handled within the Pod itself, by the kubelet according to the container's restart policy. As for handling memory, this is not a trivial issue and you should control it on both levels (heap sizes etc.) so that your app avoids hitting the limit. But if it does blow past that, the limit on the pod will handle it, and that is perfectly OK (especially if you run more than one pod, so you have HA).
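As a rough sketch of what "both levels" can look like (container name, image, sizes and the JAVA_OPTS convention are all made up; adjust to your app and entrypoint), the heap is kept well below the container limit so that GC overhead, metaspace and thread stacks still fit:

```yaml
# Hypothetical pod template fragment (not your actual manifest)
containers:
  - name: my-jvm-app                 # placeholder name
    image: example/my-jvm-app:latest # placeholder image
    resources:
      requests:
        memory: "512Mi"
      limits:
        memory: "768Mi"              # hard limit enforced by Kubernetes
    env:
      - name: JAVA_OPTS              # assumes the image's entrypoint reads JAVA_OPTS
        value: "-Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps"
```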
The thing here is that it is a bit tricky to tune memory limits well; it is probably best done with trial and error plus monitoring of pod/container metrics (Prometheus/Grafana to the rescue :) ).
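For example, assuming Prometheus is scraping the standard cAdvisor/kubelet metrics and using a made-up pod name, a query along these lines shows how close the container gets to its limit over time:

```
# Working-set memory of the app's containers; compare this curve against the
# configured memory limit to see how much headroom the JVM actually has.
container_memory_working_set_bytes{namespace="default", pod=~"my-jvm-app-.*", container!=""}
```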