Kubernetes: external service is not available from all minions on Azure cloud

4/4/2015

I've got the following cluster of 3 Ubuntu machines on Azure cloud:

172.16.0.7 (master)
172.16.0.4 (kube-01)
172.16.0.5 (kube-02)

On 172.16.0.4 (kube-01) I've got a pod called publisher with port 8080 exposed. To make it available to the world I defined the following service:

{
  "id": "publisher-service",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 8181,
  "containerPort": 8080,
  "publicIPs": ["172.16.0.4", "172.16.0.5"],
  "selector": {
    "group": "abc",
    "component": "publisher"
  },
  "labels": {
    "group": "abc"
  }
}
  • 172.16.0.4 and 172.16.0.5 are the internal IP addresses (in Azure terms) of kube-01 and kube-02, respectively

  • On 172.16.0.4 (kube-01) I've got an Azure endpoint defined with public port set to 8181 and private port set to 8181

  • On 172.16.0.5 (kube-02) I've got an Azure endpoint defined with public port set to 8182 and private port set to 8181

With this setup I can successfully access publisher-service using my VM's public virtual IP (VIP) address and port 8181.

However, I would also expect to be able to reach publisher-service at the same VIP address on port 8182 (since it is mapped to port 8181 on kube-02). Instead, curl reports Recv failure: Connection reset by peer.

Am I doing anything wrong here? Maybe my understanding of Kubernetes External Services is incorrect (and hence my expectation is wrong)?

I also noticed the following entries in /var/log/upstart/kube-proxy:

E0404 17:36:33.371889    1661 proxier.go:82] Dial failed: dial tcp 10.0.86.26:8080: i/o timeout
E0404 17:36:33.371951    1661 proxier.go:110] Failed to connect to balancer: failed to connect to an endpoint.

Here is part of the iptables -L -t nat output captured on 172.16.0.5 (kube-02):

Chain KUBE-PORTALS-CONTAINER (1 references)
target     prot opt source               destination
REDIRECT   tcp  --  anywhere             11.1.1.2             /* kubernetes */ tcp dpt:https redir ports 45717
REDIRECT   tcp  --  anywhere             11.1.1.1             /* kubernetes-ro */ tcp dpt:http redir ports 34122
REDIRECT   tcp  --  anywhere             11.1.1.221           /* publisher-service */ tcp dpt:8181 redir ports 48046
REDIRECT   tcp  --  anywhere             172.16.0.4           /* publisher-service */ tcp dpt:8181 redir ports 48046
REDIRECT   tcp  --  anywhere             172.16.0.5           /* publisher-service */ tcp dpt:8181 redir ports 48046

Chain KUBE-PORTALS-HOST (1 references)
target     prot opt source               destination
DNAT       tcp  --  anywhere             11.1.1.2             /* kubernetes */ tcp dpt:https to:172.16.0.5:45717
DNAT       tcp  --  anywhere             11.1.1.1             /* kubernetes-ro */ tcp dpt:http to:172.16.0.5:34122
DNAT       tcp  --  anywhere             11.1.1.221           /* publisher-service */ tcp dpt:8181 to:172.16.0.5:48046
DNAT       tcp  --  anywhere             172.16.0.4           /* publisher-service */ tcp dpt:8181 to:172.16.0.5:48046
DNAT       tcp  --  anywhere             172.16.0.5           /* publisher-service */ tcp dpt:8181 to:172.16.0.5:48046
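To make the rules above easier to compare across nodes, here is a minimal Python sketch that extracts the destination, service name, destination port, and forwarding target from each line. It assumes the exact column layout shown in this output. PortalRule and parse_rule are hypothetical names for illustration, not part of Kubernetes or iptables.

```python
import re
from typing import NamedTuple, Optional


class PortalRule(NamedTuple):
    target: str        # REDIRECT or DNAT
    destination: str   # portal IP or public IP the rule matches
    service: str       # service name taken from the rule comment
    dport: str         # destination port as printed (may be a name such as "https")
    forward_to: str    # local proxy port (REDIRECT) or host:port (DNAT)


# Matches rule lines in the layout printed by `iptables -L -t nat` above.
_RULE = re.compile(
    r"^(?P<target>REDIRECT|DNAT)\s+\S+\s+--\s+\S+\s+(?P<destination>\S+)\s+"
    r"/\* (?P<service>.+?) \*/\s+\S+ dpt:(?P<dport>\S+)\s+"
    r"(?:redir ports (?P<redir>\d+)|to:(?P<to>\S+))\s*$"
)


def parse_rule(line: str) -> Optional[PortalRule]:
    """Parse one rule line; return None for chain headers and column headers."""
    m = _RULE.match(line.strip())
    if m is None:
        return None
    return PortalRule(
        m["target"], m["destination"], m["service"], m["dport"],
        m["redir"] or m["to"],
    )
```

Applied to the last KUBE-PORTALS-CONTAINER line, this gives destination 172.16.0.5, service publisher-service, port 8181, and forwarding target 48046, i.e. traffic to 172.16.0.5:8181 is handed to the local kube-proxy port 48046.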

I am using Kubernetes v0.12.0. I followed this guide to set up my cluster (i.e. I'm using flannel).


UPDATE #1: added publisher pod status info.

apiVersion: v1beta1
creationTimestamp: 2015-04-04T13:24:47Z
currentState:
  Condition:
  - kind: Ready
    status: Full
  host: 172.16.0.4
  hostIP: 172.16.0.4
  info:
    publisher:
      containerID: docker://6eabf71d507ad0086b37940931aa739534ef681906994a6aae6d97b8b213
      image: xxxxx.cloudapp.net/publisher:0.0.2
      imageID: docker://5a76329ae2d0dce05fae6f7b1216e346cef2e5aa49899cd829a5dc1f6e70
      ready: true
      restartCount: 5
      state:
        running:
          startedAt: 2015-04-04T13:26:24Z
  manifest:
    containers: null
    id: ""
    restartPolicy: {}
    version: ""
    volumes: null
  podIP: 10.0.86.26
  status: Running
desiredState:
  manifest:
    containers:
    - capabilities: {}
      command:
      - sh
      - -c
      - java -jar publisher.jar -b $KAFKA_SERVICE_HOST:$KAFKA_SERVICE_PORT
      image: xxxxx.cloudapp.net/publisher:0.0.2
      imagePullPolicy: PullIfNotPresent
      name: publisher
      ports:
      - containerPort: 8080
        hostPort: 8080
        protocol: TCP
      resources: {}
      terminationMessagePath: /dev/termination-log
    dnsPolicy: ClusterFirst
    id: ""
    restartPolicy:
      always: {}
    version: v1beta2
    volumes: null
generateName: rc-publisher-
id: rc-publisher-ls6k1
kind: Pod
labels:
  group: abc
namespace: default
resourceVersion: 22853
selfLink: /api/v1beta1/pods/rc-publisher-ls6k1?namespace=default
uid: f746555d-dacd-11e4-8ae7-000d3a101fda
-- begie
azure
kubernetes
load-balancing

2 Answers

5/26/2015

Once I reinstalled my cluster using k8s v0.14.2, everything started to work as expected. I followed Brendan Burns's Docker guide.

-- begie
Source: StackOverflow

4/4/2015

The external networking actually appears to be working fine: the errors you see in the logs show that kube-proxy did receive the request you sent to it.

The reason it failed, though, is that kube-proxy couldn't talk to your pod. Either flannel is failing to route to your pod's IP properly, or the pod isn't healthy. Since requests sent to 172.16.0.4 (kube-01, where the pod runs) succeed, it's likely that something is wrong with your flannel setup. You can confirm this by trying to curl 10.0.86.26:8080 from kube-02.
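If curl isn't handy on the node, the same check can be sketched in a few lines of Python. This is only an illustration, assuming Python 3 is installed on the node. can_connect is a hypothetical helper name, and the 3-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable routes, and timeouts.
        return False
```

Calling can_connect("10.0.86.26", 8080) on each node should tell you where the pod IP is reachable. If it returns True on the node hosting the pod but False on the other one, that points at routing between nodes rather than at the pod itself.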

In case something is wrong with the health of the pod, you can check its detailed state by running kubectl.sh get pod $POD_NAME --output=yaml.
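The fields worth looking at in that output can be pulled out with a small sketch like the one below. It assumes the pod object has already been loaded into a Python dict (for example from JSON output instead of YAML, which is an assumption about your kubectl version) and that it uses the v1beta1 field names visible in the pod status from the question. summarize_pod is a hypothetical helper, not a Kubernetes API.

```python
from typing import Any, Dict


def summarize_pod(pod: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the fields most relevant to a reachability problem."""
    state = pod.get("currentState", {})
    containers = state.get("info") or {}
    return {
        "status": state.get("status"),
        "host": state.get("host"),
        "podIP": state.get("podIP"),
        # Per-container readiness and restart count, keyed by container name.
        "containers": {
            name: {
                "ready": info.get("ready"),
                "restartCount": info.get("restartCount"),
            }
            for name, info in containers.items()
        },
    }
```

For the pod in the question, this would surface status Running, podIP 10.0.86.26, and a restartCount of 5 for the publisher container.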

Sorry for the difficulties!

-- Alex Robinson
Source: StackOverflow