I have a Kubernetes cluster where one service (a Java application) connects to another service (Elasticsearch) to write data.
When Elasticsearch (service and replication controller) is restarted or redeployed, the Java application loses its connection, which can only be recovered by restarting the Java application (its rc). This is not the desired behaviour and should be solved.
Using curl from the application's Kubernetes pod to query Elasticsearch works fine after the restart, so it is probably something Java is doing.
It does work when only the replication controller for Elasticsearch is touched and the service is left as it is. But then why does curl work when the service is restarted as well? In any case, this should not be the solution.
Using the same configuration in a local Docker setup without Kubernetes does not lead to problems either.
Promising solutions that did not work:
Setting networkaddress.cache.ttl or networkaddress.cache.negative.ttl to zero (or other small positive values).
The /etc/nsswitch.conf change described in https://stackoverflow.com/a/32550032/363281
I'm using Kubernetes 1.1.3 and OpenJDK 8u66; the service's Dockerfile is derived from java:8.
Try java.security.Security.setProperty("networkaddress.cache.ttl", "60");
This sets the DNS cache TTL to sixty seconds; adapt the value to your needs.
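For context, here is a minimal sketch of where such a call could go. The class name DnsCacheSettings and the chosen values are hypothetical; the two property names are the standard JVM ones mentioned in the question. As far as I know the JVM reads these properties when its address caching policy is first initialized, so the call should run before the first hostname lookup. It also only affects the JVM's DNS cache, not connections that are already open.

```java
import java.security.Security;

// Hypothetical helper: sets the JVM-wide DNS cache TTLs in one place.
final class DnsCacheSettings {

    // Call this early, before any hostname is resolved by the JVM.
    static void apply(int positiveTtlSeconds, int negativeTtlSeconds) {
        // How long successful lookups are cached.
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        // How long failed lookups are cached.
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        apply(60, 10); // example values only
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```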
One solution is not to restart your Service: a Service resolves the Pods by IPs and watches the Pods by selectors, so you don't need to restart the Service when you restart your Pods.
Now likely what is happening is that your application resolves the Service at startup and then caches the IP. When you restart the Service it likely gets a new IP, which breaks your application's connection. You need to check how you can reset this cache, or trigger some sort of restart of the app when the pods/services change.
If you don't restart the Service, the IP won't change, but it will still proxy to the Pods that are restarted.
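If the Service does have to be recreated, one way to approximate "resetting the cache" inside the application is to throw away the client and build a new one when a call fails, so that the service name is resolved again. The sketch below is generic and makes assumptions: ReconnectingClient is a hypothetical wrapper, not part of any Elasticsearch API, and the factory you pass in is assumed to create a fresh client (and therefore a fresh lookup) each time it is invoked.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical wrapper: rebuilds the underlying client once after a failure.
final class ReconnectingClient<C> {
    private final Supplier<C> factory; // assumed to create a brand-new client
    private C client;

    ReconnectingClient(Supplier<C> factory) {
        this.factory = factory;
    }

    // Runs op against the current client; on failure, rebuilds and retries once.
    synchronized <R> R call(Function<C, R> op) {
        if (client == null) {
            client = factory.get();
        }
        try {
            return op.apply(client);
        } catch (RuntimeException staleConnection) {
            // Assumption: the failure means the cached address is stale.
            // Real code should catch only the client's connection errors.
            client = factory.get();
            return op.apply(client);
        }
    }
}
```

A second failure is not caught, so it still reaches the caller instead of looping forever.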