GKE unexpectedly dropping connections in eu-north

5/20/2019

This is my first experience with k8s and I'm a bit disappointed.

I have a problem with networking inside GKE. For example, I created a Postgres pod, and from time to time my client (Node.js with TypeORM, but that doesn't matter) logs errors about lost connections.

And that happens every 1-10 minutes.
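To check whether the drops are periodic or random, the gaps between error timestamps can be computed. The following is only a minimal sketch, assuming timestamps in the nginx error-log format `YYYY/MM/DD HH:MM:SS`; the helper names `parseLogTime` and `gapsInSeconds` are hypothetical and not part of any library.

```typescript
// Convert a timestamp such as "2019/05/20 10:02:51" to epoch milliseconds.
// Assumption: the timestamp is treated as UTC; only differences matter here.
function parseLogTime(ts: string): number {
  const m = ts.match(/^(\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2}):(\d{2})$/);
  if (!m) {
    throw new Error("unrecognized timestamp: " + ts);
  }
  return Date.UTC(
    Number(m[1]), Number(m[2]) - 1, Number(m[3]),
    Number(m[4]), Number(m[5]), Number(m[6])
  );
}

// Return the gaps, in seconds, between consecutive events.
// The input is sorted first, so log lines may arrive out of order.
function gapsInSeconds(timestamps: string[]): number[] {
  const times = timestamps.map(parseLogTime).sort((a, b) => a - b);
  const gaps: number[] = [];
  for (let i = 1; i < times.length; i++) {
    gaps.push((times[i] - times[i - 1]) / 1000);
  }
  return gaps;
}
```

Roughly constant gaps would hint at some kind of timeout, while irregular gaps would point elsewhere; either way, this only describes the symptom.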

I also created a plain Compute Engine instance running PostgreSQL. I don't have any issues when the API running inside GKE connects to that instance.

The same thing happens with Ingress.

I'm using a TCP load balancer (along with nginx-ingress) on GKE version 1.13.5-gke.10.

What I have already tried:

  • I re-created the cluster in different zones: europe-north1-a and europe-north1-c. I haven't tested other regions.
  • I tried both clustered and non-clustered Postgres charts. The problem affects all communication, not only Postgres.
  • I checked the kube-system pods: they show no errors and are running without any restarts. I didn't find any specific cause for the networking issues.

Here's the kind of log entry I receive (this one is from nginx):

2019/05/20 10:02:51 [error] 612#612: *15687 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 10.0.0.23, server: domain.io, request: "POST / HTTP/2.0", upstream: "http://10.0.0.19:4000/", host: "domain.io:443"
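To see whether the resets come from one backend pod or from all of them, lines like the one above can be parsed and counted per upstream address. This is a rough sketch, assuming every line follows the same shape as the entry shown; `parseUpstreamError` and `countByUpstream` are hypothetical helpers, not part of nginx or any library.

```typescript
// Fields extracted from one nginx upstream error line.
interface UpstreamError {
  timestamp: string;
  errno: number;
  reason: string;
  client: string;
  upstream: string;
}

// Parse a single nginx error-log line.
// Returns null when the line does not have the expected shape.
function parseUpstreamError(line: string): UpstreamError | null {
  const m = line.match(
    /^(\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) \[error\] .*? failed \((\d+): ([^)]+)\).*?client: ([^,]+),.*?upstream: "([^"]+)"/
  );
  if (!m) {
    return null;
  }
  return {
    timestamp: m[1],
    errno: Number(m[2]),
    reason: m[3],
    client: m[4],
    upstream: m[5],
  };
}

// Count parsed errors per upstream URL; unparseable lines are skipped.
function countByUpstream(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const e = parseUpstreamError(line);
    if (e) {
      counts.set(e.upstream, (counts.get(e.upstream) || 0) + 1);
    }
  }
  return counts;
}
```

If the counts concentrate on a single upstream address, that pod is worth a closer look; an even spread would suggest the problem sits somewhere shared.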

What can I do? I'm a bit desperate.

UPDATE: I'm not sure about this, but once I scaled the deployment down to 1 replica, the issues stopped. I'll keep watching to see whether that really fixed it.

-- blits
gke-networking
google-cloud-platform
google-kubernetes-engine
kubernetes
kubernetes-ingress

0 Answers