How does a Kubernetes pod get the IP instead of the container, given that the CNI plugin works at the container level

2/24/2019

How does a Kubernetes pod get the IP instead of the container, given that the CNI plugin works at the container level?

How do all containers of the same pod share the same network stack?

-- Bharath Thiruveedula
docker
flannel
flanneld
kubernetes
project-calico

4 Answers

2/24/2019

It's the proxy machinery that makes everything work: each pod gets a single IP, all of its containers share that IP, and every exposed port is translated over it, while kube-proxy (one per node, not per pod) routes Service traffic to it. Only in specific cases do you actually want multiple containers in the same pod; it's not preferred, but it's possible, and this is why such containers are called "tightly" coupled. Please refer to: https://kubernetes.io/docs/concepts/cluster-administration/proxies/
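
To make the shared-IP point concrete, here is a minimal Go sketch (the port 8080 and the "second container" framing are illustrative assumptions, nothing Kubernetes-specific): two listeners in one network namespace collide on a port exactly the way two containers in one pod would.

    // Sketch: containers in a pod share one network namespace, so they
    // must coordinate ports just like two processes on a single host.
    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        first, err := net.Listen("tcp", ":8080")
        if err != nil {
            fmt.Println("first listener failed:", err)
            return
        }
        defer first.Close()
        fmt.Println("first listener bound to :8080")

        // A second bind to the same port fails, as it would for a
        // sibling container sharing the pod's network namespace.
        if _, err := net.Listen("tcp", ":8080"); err != nil {
            fmt.Println("second listener failed:", err)
        }
    }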

-- jcuypers
Source: StackOverflow

2/24/2019

There is a special container, called the 'pause' container, that holds the network namespace for the pod. It does not do anything; its process simply goes to sleep.

Kubernetes creates one pause container for each pod to acquire that pod's IP address and set up the network namespace for all the other containers that belong to the pod. All containers in a pod can then reach each other using localhost.

This means that your 'application' container can die and come back to life, and all of the network setup will still be intact.
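
The real pause binary ships with Kubernetes itself; the following is only a minimal Go sketch of the idea, i.e. a process whose sole job is to exist and keep the pod's namespaces alive until it is told to exit.

    // Sketch of a pause-style process: hold the pod's namespaces open
    // by doing nothing until the kubelet sends a termination signal.
    package main

    import (
        "fmt"
        "os"
        "os/signal"
        "syscall"
    )

    func main() {
        sigs := make(chan os.Signal, 1)
        signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
        fmt.Println("pause: sleeping, keeping the network namespace alive")
        fmt.Println("pause: got", <-sigs, "- exiting") // block until signaled
    }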

-- P Ekambaram
Source: StackOverflow

2/24/2019

First, let's dig deeper into the CNI aspect. In production systems, workload/pod network isolation is a first-class security requirement (a workload can be thought of as one or more containerized applications that fulfill a certain function). Moreover, depending on how the infrastructure is set up, the routing plane might also need to be an attribute of either the workload (kubectl proxy), the host-level proxy (kube proxy), or the central routing plane (apiserver proxy) for which the host-level proxy exposes a gateway.

For both service discovery and actually sending requests from a workload/pod, you don't want individual application developers talking to the apiserver proxy directly, since that may incur overhead. Instead, you want them to communicate with other applications via either the kubectl or kube proxy, with those layers being responsible for knowing when and how to talk to the apiserver plane.

Therefore, when spinning up a new workload, the kubelet can be passed --network-plugin=cni and a path to a configuration file telling it how to set up the virtual network interface for this workload/pod.
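
What that configuration looks like depends on the plugin. As a hedged illustration, here is a Go sketch that emits a minimal bridge-type CNI config of the kind kubelet would read from /etc/cni/net.d/ (the network name "mynet", bridge "cni0", and subnet are made-up example values, not anything your cluster will have).

    // Sketch: emit a minimal bridge-type CNI network configuration of
    // the kind kubelet reads from /etc/cni/net.d/. All concrete values
    // (name, bridge, subnet) are illustrative only.
    package main

    import (
        "encoding/json"
        "fmt"
    )

    type ipamConf struct {
        Type   string `json:"type"`
        Subnet string `json:"subnet"`
    }

    type netConf struct {
        CNIVersion string   `json:"cniVersion"`
        Name       string   `json:"name"`
        Type       string   `json:"type"`
        Bridge     string   `json:"bridge"`
        IsGateway  bool     `json:"isGateway"`
        IPAM       ipamConf `json:"ipam"`
    }

    func main() {
        conf := netConf{
            CNIVersion: "0.3.1",
            Name:       "mynet",
            Type:       "bridge",
            Bridge:     "cni0",
            IsGateway:  true,
            IPAM:       ipamConf{Type: "host-local", Subnet: "10.22.0.0/16"},
        }
        out, _ := json.MarshalIndent(conf, "", "  ")
        fmt.Println(string(out))
    }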

For example, if you don't want the application containers in a pod to be able to talk to the host-level kube proxy directly, because you want to do some infrastructure-specific monitoring, your CNI and workload configuration would be:

  • monitoring at the outermost container
  • the outermost container creates a virtual network interface for every other container in the pod
  • the outermost container sits on a bridge interface (also a private virtual network interface) that can talk to the kube proxy on the host

The IP that the pod gets exists so that other workloads can send bytes to this pod via its bridge interface - since, fundamentally, other parties should be talking to the pod, not to individual work units inside it.

-- Debosmit Ray
Source: StackOverflow

2/24/2019

Containers use a kernel feature called a virtual network interface. A virtual network interface (let's name it veth0) is created and then assigned to a namespace. When a container is created, it is also assigned to a namespace; when multiple containers are created within the same namespace, only a single network interface, veth0, is created.
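
You can watch a namespace being born with a short, Linux-only Go sketch (it needs root, and the printed labels are just for illustration): after syscall.Unshare(CLONE_NEWNET), the calling thread sees a fresh, nearly empty network stack, which is exactly what a new container starts with before a veth is added.

    // Linux-only sketch (run as root): a new network namespace starts
    // with its own, nearly empty interface list.
    package main

    import (
        "fmt"
        "net"
        "runtime"
        "syscall"
    )

    func listInterfaces(label string) {
        ifaces, err := net.Interfaces()
        if err != nil {
            fmt.Println(label, "error:", err)
            return
        }
        fmt.Printf("%s: %d interface(s)\n", label, len(ifaces))
        for _, i := range ifaces {
            fmt.Println("  -", i.Name)
        }
    }

    func main() {
        runtime.LockOSThread() // namespaces are per-thread on Linux
        listInterfaces("host namespace")

        // Unshare the network namespace: this thread now has its own
        // network stack, just like a freshly created container.
        if err := syscall.Unshare(syscall.CLONE_NEWNET); err != nil {
            fmt.Println("unshare failed (need root on Linux):", err)
            return
        }
        listInterfaces("new namespace") // typically only a loopback device
    }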

A pod is just the term used for a set of resources and features, one of which is the namespace and the containers running in it.

When you say the pod gets an IP, what actually gets the IP is the veth0 interface. Container apps see veth0 the same way applications outside a container see a single physical network card on a server.
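
A short Go program makes this visible; run inside a pod it typically prints lo plus eth0 carrying the pod IP, while on a bare server it prints the physical NICs instead (nothing here is Kubernetes-specific).

    // List every network interface and its addresses, the same way an
    // application would discover them inside or outside a container.
    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        ifaces, err := net.Interfaces()
        if err != nil {
            panic(err)
        }
        for _, iface := range ifaces {
            addrs, _ := iface.Addrs()
            fmt.Println(iface.Name, addrs)
        }
    }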

CNI is just the technical specification of how this should work, so that multiple network plugins can function without changes to the platform. The process above should be the same for all network plugins.
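
For a sense of how small that contract is, here is a hedged skeleton of a CNI plugin in Go: the runtime invokes the binary with the command in the CNI_COMMAND environment variable and the network configuration on stdin, and expects a JSON result on stdout. The IP 10.22.0.7/16 below is a made-up placeholder; a real plugin would actually create the veth pair, move it into CNI_NETNS, and assign the address.

    // Sketch of the CNI plugin contract: command via CNI_COMMAND,
    // network config on stdin, JSON result on stdout.
    package main

    import (
        "encoding/json"
        "fmt"
        "io"
        "os"
    )

    type netConf struct {
        CNIVersion string `json:"cniVersion"`
        Name       string `json:"name"`
        Type       string `json:"type"`
    }

    func main() {
        cmd := os.Getenv("CNI_COMMAND") // ADD, DEL, CHECK, or VERSION
        raw, _ := io.ReadAll(os.Stdin)

        var conf netConf
        json.Unmarshal(raw, &conf)

        switch cmd {
        case "ADD":
            // A real plugin would create a veth pair here, move one end
            // into CNI_NETNS, assign an IP, and report it back.
            fmt.Printf(`{"cniVersion":%q,"ips":[{"address":"10.22.0.7/16"}]}`, conf.CNIVersion)
        case "DEL":
            // Tear down whatever ADD created; output is ignored on success.
        case "VERSION":
            fmt.Print(`{"cniVersion":"0.3.1","supportedVersions":["0.3.1"]}`)
        }
    }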

There is a nice explanation in this blog post.

-- Diego Mendes
Source: StackOverflow