Strange behavior of Kubernetes + AWS ELB

6/16/2018

I'm currently seeing some strange behavior with Kubernetes and AWS ELB. I'm running some tests in a (non-production) EKS cluster and something weird happens:

I have a Traefik Ingress Controller deployed as a simple Deployment, and a Service of type LoadBalancer with the appropriate selector (pointing to the Ingress Controller). The Load Balancer is indeed created in AWS (and I can successfully reach the Traefik dashboard through its public DNS).
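
(For reference, the ELB hostname the Service received can be read back from the Service status with plain kubectl; the Service itself is defined further down:)

kubectl -n kube-system get svc traefik-ingress-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{"\n"}'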

The problem is that I have two EC2 instances (both of them worker nodes) in the Kubernetes cluster and only one of them is healthy (the other one is marked as OutOfService), but if I curl that instance on the health check port, it returns 200.
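
(In case it's useful, this is roughly how that check can be reproduced; <node-private-ip> and <nodeport> below are placeholders for each worker node's private IP and the allocated port:)

# Look up the NodePort allocated for the http port of the Service
kubectl -n kube-system get svc traefik-ingress-service \
  -o jsonpath='{.spec.ports[?(@.name=="http")].nodePort}{"\n"}'
# From inside the VPC, hit that port on each worker node and print only the status code
curl -s -o /dev/null -w '%{http_code}\n' http://<node-private-ip>:<nodeport>/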

Now, if I scale the Traefik Ingress Controller Deployment to 2 replicas, both instances appear as InService. The strange part is that when there was only one replica, the pod was scheduled on the very instance that was OutOfService.
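
(Scaling and checking where the pods land is just plain kubectl:)

kubectl -n kube-system scale deployment traefik-ingress-controller --replicas=2
# -o wide shows on which node each Traefik pod ended up
kubectl -n kube-system get pods -l k8s-app=traefik-ingress-lb -o wide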

I know that in a production cluster I would deploy my Ingress Controller as a DaemonSet, but I think it should also work as a Deployment, since NodePorts are allocated on all nodes regardless of whether a pod is scheduled on that node or not. Also, the healthy node was the one that wasn't running the pod, so I don't think that was the problem either.
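
(A quick way to double-check that assumption against the live Service; the field names are standard Kubernetes, the grep is just for readability:)

kubectl -n kube-system get svc traefik-ingress-service -o yaml \
  | grep -E 'externalTrafficPolicy|nodePort|healthCheckNodePort'
# With externalTrafficPolicy: Cluster every node proxies the allocated NodePorts, so the
# ELB health check should pass on both instances; with Local only nodes that actually run
# a Traefik pod answer on the dedicated healthCheckNodePort.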

If I scale the Deployment back down to 1, the instance goes to OutOfService immediately.

My Service configuration (in case it helps) is as follows:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-ingress-controller
  namespace: kube-system
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: traefik-ingress-controller
  namespace: kube-system
  labels:
    k8s-app: traefik-ingress-lb
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: traefik-ingress-lb
  template:
    metadata:
      labels:
        k8s-app: traefik-ingress-lb
        name: traefik-ingress-lb
    spec:
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 60
      containers:
      - image: traefik
        name: traefik-ingress-lb
        readinessProbe:
          httpGet:
            path: /
            port: 8080
        ports:
        - name: http
          containerPort: 80
        args:
        - --api
        - --kubernetes
        - --logLevel=INFO
---
kind: Service
apiVersion: v1
metadata:
  name: traefik-ingress-service
  namespace: kube-system
  annotations:
    external-dns.alpha.kubernetes.io/hostname: test.xxxxxxx.com.
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:xxxxxxx" 
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
spec:
  selector:
    k8s-app: traefik-ingress-lb
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
    - name: https
      port: 443
      protocol: TCP
      targetPort: http
  type: LoadBalancer

Thanks in advance!

-- Santiago Ignacio Poli
amazon-ec2
amazon-web-services
aws-load-balancer
kubernetes

0 Answers