Kubernetes services cannot reach each other anymore

12/7/2018

I’m running Kubernetes on GKE. This setup was working before, but about two days ago something changed, even though I don’t think I changed anything in my configuration. None of my services can talk to each other anymore. When I SSH into a running pod, I cannot ping the other services by their service name or by their internal IP addresses. The external IP of the load balancer is not reachable either. Here is an example of how I define a deployment:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  labels:
    ksonnet.io/component: app-name
  name: app-name
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: app-name

And here the service:

apiVersion: v1
kind: Service
metadata:
  labels:
    ksonnet.io/component: app-name
  name: app-name
spec:
  loadBalancerIP: x.x.x.x
  ports:
  - port: 4999
    targetPort: 5000
  selector:
    app: app-name
  type: LoadBalancer

I am fairly new to Kubernetes and networking and I have no clue where to look or how to debug this issue.

EDIT:

Here is the relevant output of kubectl get services -n test:

NAME            TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
dashboard       ClusterIP      10.47.242.176   <none>        5000/TCP         1h
app-name        LoadBalancer   10.47.246.63    x.xxx.xx.xx   4999:31439/TCP   1h

And here is the output of kubectl describe service app-name -n test:

Name:                     app-name
Namespace:                test
Labels:                   app.kubernetes.io/deploy-manager=ksonnet
                          ksonnet.io/component=app-name
Annotations:              ksonnet.io/managed: {pristine...}
Selector:                 app=app-name
Type:                     LoadBalancer
IP:                       10.47.246.63
IP:                       xx.xxx.xx.x
LoadBalancer Ingress:     xx.xxx.xx.x
Port:                     <unset>  4999/TCP
TargetPort:               5000/TCP
NodePort:                 <unset>  31439/TCP
Endpoints:                10.44.1.141:5000
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

EDIT 2: I tried the curl command on the default port and it timed out:

curl: (7) Failed to connect to app-name port 80: Connection timed out

When I tried it on the service port (4999), I got a connection refused error:

curl: (7) Failed to connect to app-name port 4999: Connection refused
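The two curl failures above are different symptoms: a timeout generally means no reply came back at all, while a refusal means something answered and actively rejected the connection. As a rough way to tell them apart from inside a pod, here is a minimal Python sketch. The function name probe and the three-second timeout are assumptions for illustration and are not part of curl or kubectl.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and classify the outcome.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a proxy rule in front of it) rejected the connection.
        return "refused"
    except socket.timeout:
        # No reply arrived within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, for example a DNS lookup failure.
        return f"error: {exc}"


# Example calls, using the service name and ports from the question:
#   probe("app-name", 80)
#   probe("app-name", 4999)
```

Running the same probe against the pod address and target port listed under Endpoints (10.44.1.141, port 5000) may help show whether the container itself accepts connections.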

When looking at the deployment I get the following pod template:

Pod Template:
  Labels:  app=app-name
  Containers:
   model-manager:
    Image:      gcr.io/ns-delay/app-name:0.1
    Port:       5000/TCP
    Host Port:  0/TCP
-- Jan van der Vegt
google-kubernetes-engine
kubernetes
networking

1 Answer

12/7/2018

As far as I can see, the selector in your Service does not match the labels in your Deployment. Change the labels in your Deployment to

metadata:
  labels:
    app: app-name

and it should work.
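To make the label check concrete, here is a minimal Python sketch of how an equality-based selector is compared against a set of labels. The function name selector_matches is hypothetical and is not a Kubernetes API; the label values are copied from the manifests in the question. Note that a Service selects Pods, so the labels that count are the ones on the pod template (spec.template.metadata.labels) rather than the ones on the Deployment object itself.

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    """Return True if every key/value pair in the selector is present in labels.

    Hypothetical helper: models a non-empty, equality-based selector only.
    """
    return all(labels.get(key) == value for key, value in selector.items())


# Values copied from the question.
service_selector = {"app": "app-name"}
deployment_metadata_labels = {"ksonnet.io/component": "app-name"}
pod_template_labels = {"app": "app-name"}

# Compare the Service selector against both sets of labels.
print(selector_matches(service_selector, deployment_metadata_labels))
print(selector_matches(service_selector, pod_template_labels))
```

The Endpoints line in the kubectl describe output is a quick way to check the same thing on a live cluster: if it lists a pod address, the selector is matching at least one pod.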

-- fatcook
Source: StackOverflow