Why do certain kube-system Pods such as kube-proxy have the same Pod IP as the node that they are on?

2/19/2019

Today I noticed something unexpected in one of my clusters that I can't find an explanation for. The Pod IPs of many kube-system Pods are the same as the IP of the node they run on. I want to understand why that is, but I can't find any documentation or discussion on the matter. Here's what I see:

k get nodes -o wide
NAME                       STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-agentpool-14855512-0   Ready    agent   47m   v1.12.5   10.240.0.66   <none>        Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-14855512-1   Ready    agent   47m   v1.12.5   10.240.0.4    <none>        Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-14855512-2   Ready    agent   47m   v1.12.5   10.240.0.35   <none>        Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
k get po -n kube-system -o wide | grep '10.240.0.4 '
azure-cni-networkmonitor-rqs8q       1/1     Running   0          48m   10.240.0.4    aks-agentpool-14855512-1   <none>
azure-ip-masq-agent-dj8w5            1/1     Running   0          48m   10.240.0.4    aks-agentpool-14855512-1   <none>
kube-proxy-jpjjc                     1/1     Running   0          48m   10.240.0.4    aks-agentpool-14855512-1   <none>
kube-svc-redirect-bfvlk              2/2     Running   0          48m   10.240.0.4    aks-agentpool-14855512-1   <none>

My understanding is that a Pod should have a different IP than the node it runs on, and that a Service should be used to expose the Pod. However, I don't see any Services for these Pods:

k get svc -n kube-system
NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
heapster               ClusterIP   10.0.0.57     <none>        80/TCP          55m
kube-dns               ClusterIP   10.0.0.10     <none>        53/UDP,53/TCP   55m
kubernetes-dashboard   ClusterIP   10.0.105.92   <none>        80/TCP          55m
metrics-server         ClusterIP   10.0.179.25   <none>        443/TCP         55m

At first I thought this behavior was specific to AKS; however, I see the same results on GKE.

I'm afraid I may be missing a very fundamental concept that is preventing me from understanding this better. Any assistance would be appreciated.

Update: This is because hostNetwork: true is set in the Pod spec.

-- jnat
kubernetes

1 Answer

2/19/2019

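This is because hostNetwork: true is set in the Pod spec. A Pod with hostNetwork: true runs in the node's network namespace instead of getting its own, so its podIP is the node's IP (compare hostIP and podIP in the status section of the output).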

You can observe this by running:

k get po kube-proxy-jpjjc  -n kube-system -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    aks.microsoft.com/release-time: 'seconds:1550597164 nanos:675278758 '
  creationTimestamp: "2019-02-19T17:29:15Z"
  generateName: kube-proxy-
  labels:
    component: kube-proxy
    controller-revision-hash: 68c8cf5db6
    pod-template-generation: "1"
    tier: node
  name: kube-proxy-jpjjc
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: kube-proxy
    uid: 75df85c8-346b-11e9-a1db-667e55a73bba
  resourceVersion: "693"
  selfLink: /api/v1/namespaces/kube-system/pods/kube-proxy-jpjjc
  uid: e1004b3e-346b-11e9-a1db-667e55a73bba
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - aks-agentpool-14855512-1
  containers:
  - command:
    - /hyperkube
    - proxy
    - --kubeconfig=/var/lib/kubelet/kubeconfig
    - --cluster-cidr=10.240.0.0/16
    - --feature-gates=ExperimentalCriticalPodAnnotation=true
    env:
    - name: KUBERNETES_PORT_443_TCP_ADDR
      value: nodeport-test-cni-87e6d01c.hcp.westus2.azmk8s.io
    - name: KUBERNETES_PORT
      value: tcp://nodeport-test-cni-87e6d01c.hcp.westus2.azmk8s.io:443
    - name: KUBERNETES_PORT_443_TCP
      value: tcp://nodeport-test-cni-87e6d01c.hcp.westus2.azmk8s.io:443
    - name: KUBERNETES_SERVICE_HOST
      value: nodeport-test-cni-87e6d01c.hcp.westus2.azmk8s.io
    image: k8s.gcr.io/hyperkube-amd64:v1.12.5
    imagePullPolicy: IfNotPresent
    name: kube-proxy
    resources:
      requests:
        cpu: 100m
    securityContext:
      privileged: true
      procMount: Default
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/kubelet
      name: kubeconfig
      readOnly: true
    - mountPath: /etc/kubernetes/certs
      name: certificates
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-7m959
      readOnly: true
  dnsPolicy: ClusterFirst
  hostNetwork: true
  nodeName: aks-agentpool-14855512-1
  nodeSelector:
    beta.kubernetes.io/os: linux
  priority: 1000000
  priorityClassName: high-priority
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Equal
    value: "true"
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/network-unavailable
    operator: Exists
  volumes:
  - hostPath:
      path: /var/lib/kubelet
      type: ""
    name: kubeconfig
  - hostPath:
      path: /etc/kubernetes/certs
      type: ""
    name: certificates
  - name: default-token-7m959
    secret:
      defaultMode: 420
      secretName: default-token-7m959
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-02-19T17:29:18Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-02-19T17:29:29Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2019-02-19T17:29:29Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2019-02-19T17:29:15Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://8934a2ec756bf77ad34b352ab78f70f41c7a52f126e511b235378b65c708ff15
    image: k8s.gcr.io/hyperkube-amd64:v1.12.5
    imageID: docker-pullable://k8s.gcr.io/hyperkube-amd64@sha256:82add6703e6e28b50f2457b3a3e4eec573a2603437cb9df1af5670dd7e640e75
    lastState: {}
    name: kube-proxy
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: "2019-02-19T17:29:28Z"
  hostIP: 10.240.0.4
  phase: Running
  podIP: 10.240.0.4
  qosClass: Burstable
  startTime: "2019-02-19T17:29:18Z"
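
To check the same thing across a whole namespace without reading the full YAML of each Pod, something like the sketch below may help. The function names (pod_host_network, list_host_network, same_ip_as_node) are made up for illustration; the kubectl flags used (-o jsonpath and -o custom-columns) are standard ones. I am assuming here that Pods which do not set hostNetwork typically show <none> in that column, since the field is omitted when it is false.

```shell
#!/bin/sh
# Hypothetical helper: print .spec.hostNetwork for a single Pod.
# Prints "true" for host-network Pods and usually nothing otherwise.
pod_host_network() {
  # $1 = Pod name, $2 = namespace (defaults to kube-system)
  kubectl get pod "$1" -n "${2:-kube-system}" -o jsonpath='{.spec.hostNetwork}'
}

# Hypothetical helper: one row per Pod with its hostNetwork flag,
# Pod IP and node (host) IP.
list_host_network() {
  # $1 = namespace (defaults to kube-system)
  kubectl get pods -n "${1:-kube-system}" \
    -o custom-columns='NAME:.metadata.name,HOST_NETWORK:.spec.hostNetwork,POD_IP:.status.podIP,HOST_IP:.status.hostIP'
}

# Hypothetical helper: keep the header plus every row whose Pod IP
# (column 3) equals the host IP (column 4).
same_ip_as_node() {
  awk 'NR == 1 || $3 == $4'
}

# Usage (requires access to a cluster):
#   pod_host_network kube-proxy-jpjjc kube-system
#   list_host_network kube-system | same_ip_as_node
```

Comparing the two IP columns exactly avoids the trailing-space trick in the grep from the question, which is only there so that 10.240.0.4 does not also match addresses such as 10.240.0.40.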
-- jnat
Source: StackOverflow