Java/Kubernetes in Google Container Engine - UnknownHostException

1/8/2017

I'm working on a service that consumes from Kafka and writes to BigQuery. The service will run in Kubernetes on Google Container Engine.

For the most part this process works, but about 20% of my launched containers fail when attempting to authenticate with Google. The error message:

java.net.UnknownHostException: accounts.google.com
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:283)
at com.google.api.client.auth.oauth2.TokenRequest.execute(TokenRequest.java:307)
at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.executeRefreshToken(GoogleCredential.java:384)
at com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:489)
at com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:868)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)

I've tried pausing for a few seconds and retrying (looping 3-4 times), but this doesn't seem to make any difference.

I don't think it's an issue with my containers, as 70-80% of them work fine. Only a few seem to fail to initialize properly or get into some unrecoverable state.

Is there anything to be done about this? Is there some way to 'fail' the container from inside so that Kubernetes will restart it?


Edit:

To 'fail' a Kubernetes container, exit the container's main process with a non-zero status. In Java, that is System.exit(2);.
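As a rough sketch of how the retry loop and the non-zero exit could fit together at startup: the class name DnsStartupCheck, the Resolver interface, the attempt count and the pause length below are all made up for illustration, and only the host name comes from the stack trace. The 15-second pause is an assumption based on the JVM caching failed lookups for a short time (the networkaddress.cache.negative.ttl security property, 10 seconds by default), which might be one reason a retry after only a few seconds sees the same failure.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsStartupCheck {

    /** Hypothetical lookup abstraction so the check can be exercised without a network. */
    interface Resolver {
        void resolve(String host) throws UnknownHostException;
    }

    /**
     * Tries to resolve the host up to 'attempts' times, sleeping 'pauseMillis'
     * between failed attempts. Returns true on the first successful lookup.
     */
    static boolean canResolve(Resolver resolver, String host, int attempts, long pauseMillis) {
        for (int i = 1; i <= attempts; i++) {
            try {
                resolver.resolve(host);
                return true;
            } catch (UnknownHostException e) {
                System.err.println("DNS lookup " + i + "/" + attempts + " failed for " + host);
            }
            if (i < attempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed values: 4 attempts, 15 s apart (longer than the default negative DNS cache TTL).
        if (!canResolve(InetAddress::getByName, "accounts.google.com", 4, 15_000L)) {
            // Non-zero exit marks the container as failed.
            System.exit(2);
        }
        // ... start the Kafka consumer / BigQuery writer here ...
    }
}
```

Whether Kubernetes restarts the container after the exit depends on the pod's restartPolicy (Always or OnFailure). The restart happens on the same node, so it may not help if the node itself is the problem.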


Edit:

The problem seems to be strongly tied to individual Kubernetes nodes (the host machines) and not necessarily to the containers. I found this reference that seems related, but I can't figure out how to restart the Docker process on these systems.

-- ethrbunny
google-kubernetes-engine
java
kubernetes

1 Answer

1/24/2017

I've had errors like that on my localhost as well, and usually running gcloud container clusters get-credentials again or rebooting fixes the problem.

I still haven't figured out what the root cause is, but maybe you can try rebooting your nodes.

Do the pods that fail all run on the same node(s)?

-- Christiaan
Source: StackOverflow