Unhealthy backends using GCP network endpoint groups for container-native Load Balancing

11/24/2018

We're testing out Google's new container-native load balancing feature. We followed this tutorial successfully, and we're trying to roll it out to our three services on GKE.

As far as I can tell, the only difference between the NEG feature and the legacy GCLB ingress object is the annotation in each service, so the URL mapping should work the same.

We've updated all of the services to use this annotation, but two out of three are unhealthy while one is considered healthy. The only differences in the service YAMLs are the name and selector.

All of the deployments have health checks and are healthy when we inspect them manually, but the LB says the backends are unhealthy.

What are we missing?

Ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "neg-ip"
spec:
  backend:
    serviceName: frontend-svc
    servicePort: 8080
  rules:
  - host: testneg.test.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: frontend-svc # Healthy service
          servicePort: 8080
      - path: /backend/*
        backend:
          serviceName: backend-svc # Unhealthy service
          servicePort: 8080
      - path: /notifications/*
        backend:
          serviceName: notifications-svc # Unhealthy service
          servicePort: 8080

--

frontend-svc.yaml (backend and notifications are the same except for name and selector)

apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
  annotations:
    cloud.google.com/neg: '{"ingress": true}' # Creates an NEG after an Ingress is created
spec:
  selector:
    app: frontend
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
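
For reference, a sketch of what backend-svc.yaml would look like under that description. The `app: backend` selector is an assumption (the actual selector was not posted); whatever value is used has to match the labels on the backend Pods.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-svc # Must be the same string as serviceName in the Ingress
  annotations:
    cloud.google.com/neg: '{"ingress": true}' # Creates an NEG after an Ingress is created
spec:
  selector:
    app: backend # Assumed label; not shown in the original post
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
```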

--

backend-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  minReadySeconds: 60
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    spec:
      containers:
      - image: us.gcr.io/<OUR_DJANGO_IMAGE>
        imagePullPolicy: Always
        name: backend
        ports:
        - containerPort: 8080
          protocol: TCP
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 3
        livenessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 3
      terminationGracePeriodSeconds: 60
-- Mike
gke-networking
google-cloud-platform
google-kubernetes-engine

1 Answer

11/25/2018

Your Ingress YAML file shows different services:

      - path: /*
        backend:
          serviceName: frontend-svc # Healthy service
          servicePort: 8080
      - path: /backend/*
        backend:
          serviceName: backend-svc # Unhealthy service
          servicePort: 8080
      - path: /notifications/*
        backend:
          serviceName: notifications-svc # Unhealthy service
          servicePort: 8080

Your frontend-svc.yaml has a different service name, "li-frontend-svc", which is not in your Ingress.

The spec.backend.serviceName in your Ingress should match your Service name. When it does not, an unhealthy backend service is expected.
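
A minimal sketch of that matching, using the names and ports from the question. It only illustrates which fields have to agree; the `app: frontend` selector is copied from the posted Service.

```yaml
# Service: metadata.name is the name the Ingress must reference
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: frontend
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
---
# Ingress: serviceName and servicePort refer to the Service above
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress
spec:
  backend:
    serviceName: frontend-svc # Same string as the Service's metadata.name
    servicePort: 8080         # Same as the Service's port
```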

Last edit:

In your Ingress you specify the service frontend-svc twice. You should use an Ingress spec like the following:

spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: first-service # Name of the Service targeted by the Ingress
          servicePort: 8080 # Should match the port used by the Service
        path: <first-service-path>/*
      - backend:
          serviceName: second-service # Name of the Service targeted by the Ingress
          servicePort: 8080 # Should match the port used by the Service
        path: <second-service-path>/*
      - backend:
          serviceName: third-service # Name of the Service targeted by the Ingress
          servicePort: 8080 # Should match the port used by the Service
        path: <third-service-path>/*
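
Applied to the three services from the question, that template might look like the sketch below. Dropping the `spec.backend` and `host` entries simply follows the template above; it is an illustration, not a confirmed fix.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "neg-ip"
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: frontend-svc # Referenced once only
          servicePort: 8080
        path: /*
      - backend:
          serviceName: backend-svc
          servicePort: 8080
        path: /backend/*
      - backend:
          serviceName: notifications-svc
          servicePort: 8080
        path: /notifications/*
```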

Here is my reproduction:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: neg-hello-1 # Label for the Deployment
  name: neg-hello-1 # Name of Deployment
spec: # Deployment's specification
  minReadySeconds: 60 # Number of seconds to wait after a Pod is created and its status is Ready
  selector:
    matchLabels:
      run: neg-hello-1
  template: # Pod template
    metadata:
      labels:
        run: neg-hello-1 # Labels Pods from this Deployment
    spec: # Pod specification; each Pod created by this Deployment has this specification
      containers:
      - image: gcr.io/google-samples/hello-app:1.0 # Application to run in Deployment's Pods
        name: neg-hello-1 # Container name
        ports:
        - containerPort: 8080 # Port used by containers running in these Pods
          protocol: TCP
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
      terminationGracePeriodSeconds: 60 # Number of seconds to wait for connections to terminate before shutting down Pods

--

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: neg-hello-2 # Label for the Deployment
  name: neg-hello-2 # Name of Deployment
spec: # Deployment's specification
  minReadySeconds: 60 # Number of seconds to wait after a Pod is created and its status is Ready
  selector:
    matchLabels:
      run: neg-hello-2
  template: # Pod template
    metadata:
      labels:
        run: neg-hello-2 # Labels Pods from this Deployment
    spec: # Pod specification; each Pod created by this Deployment has this specification
      containers:
      - image: gcr.io/google-samples/hello-app:2.0 # Application to run in Deployment's Pods
        name: neg-hello-2 # Container name
        ports:
        - containerPort: 8080 # Port used by containers running in these Pods
          protocol: TCP
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
      terminationGracePeriodSeconds: 60 # Number of seconds to wait for connections to terminate before shutting down Pods

--

apiVersion: v1
kind: Service
metadata:
  name: neg-hello-1 # Name of Service
  annotations:
    cloud.google.com/neg: '{"ingress": true}' # Creates an NEG after an Ingress is created
spec: # Service's specification
  selector:
    run: neg-hello-1 # Selects Pods labelled run: neg-hello-1
  ports:
  - port: 80 # Service's port
    protocol: TCP
    targetPort: 8080

--

apiVersion: v1
kind: Service
metadata:
  name: neg-hello-2 # Name of Service
  annotations:
    cloud.google.com/neg: '{"ingress": true}' # Creates an NEG after an Ingress is created
spec: # Service's specification
  selector:
    run: neg-hello-2 # Selects Pods labelled run: neg-hello-2
  ports:
  - port: 80 # Service's port
    protocol: TCP
    targetPort: 8080

--

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: neg-ingress
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: neg-hello-1 # Name of the Service targeted by the Ingress
          servicePort: 80 # Should match the port used by the Service
        path: /*
      - backend:
          serviceName: neg-hello-2 # Name of the Service targeted by the Ingress
          servicePort: 80 # Should match the port used by the Service
        path: /v2/*
-- Alioua
Source: StackOverflow