Why Do I Need a NodePort in My Local Kubernetes Cluster?

7/18/2021

Excuse my relative networking ignorance, but I've read a lot of docs and still have trouble understanding this (perhaps due to lack of background in networks).

Given this Dockerfile:

FROM node:lts-slim
RUN mkdir /code
COPY package.json /code/
WORKDIR /code
RUN npm install
COPY server.js /code/
EXPOSE 3000 
CMD ["node", "server.js"]

...this deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-pod
  template:
    metadata:
      labels:
        app: web-pod
    spec:
      containers:
      - name: web
        image: kahunacohen/hello-k8s
        ports:
        - containerPort: 3000
          protocol: TCP

and this service:

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: NodePort
  selector:
    app: web-pod
  ports:
    - port: 80
      targetPort: 3000
      protocol: TCP
      name: http

My understanding is that:

  1. The app in my container is exposing itself to the outside world on port 3000.
  2. My deployment YAML is saying, "the container is listening on 3000."
  3. My service is saying: map 3000 internally to port 80, which is the default port, so you don't have to add the port to the host.
  4. I'm using the NodePort type because on local clusters like Docker Desktop it works out of the box instead of LoadBalancer. It opens up a random port between 30000 and 32767 on every node (pod?) in the cluster to the outside. That node port is how I access my app from outside, e.g. localhost:30543.

Are my assumptions correct? I am unclear why I can't access my app at localhost:80, or just localhost, if the service makes the mapping between the container port and the outside world? What's the point of the mapping between 3000 and 80 in the service?

In short, why do I need NodePort?

-- Aaron
kubernetes

1 Answer

7/18/2021

There are two networking layers, which we could call "inside the cluster" and "outside the cluster". The Pod and the Service each have their own IP address, but these addresses are reachable only from inside the cluster. You need the NodePort to forward a request from outside the cluster to inside the cluster.

In a "real" Kubernetes cluster, you'd make a request...

  1. ...to http://any-kubernetes-node.example.com:31245/, where the node has a "normal" IP address of the kind you'd expect a physical system to have and 31245 is the NodePort port, which forwards...
  2. ...to http://web-service.default.svc.cluster.local:80/, the Service's cluster-internal IP address and service port; the Service looks at the pods it selects and forwards...
  3. ...to http://10.20.30.40:3000/, using the cluster-internal IP address of any one of the matching pods and the target port from the service.
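To tie the three hops back to the manifest, here is a sketch of the question's Service with the node port written out explicitly. The value 31245 is just the example number from step 1, not something the cluster would necessarily choose; if nodePort is left out, Kubernetes picks one itself from the node port range (30000-32767 by default).

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: NodePort
  selector:
    app: web-pod          # must match the pod labels in the Deployment
  ports:
    - name: http
      protocol: TCP
      nodePort: 31245     # step 1: opened on every node, reachable from outside the cluster
      port: 80            # step 2: the Service's own port on its cluster-internal IP
      targetPort: 3000    # step 3: the port the container listens on inside each pod
```

With this in place, port 80 only exists on the Service's cluster-internal address, which is why localhost:80 does not reach the app from the host.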

The containerPort: in the pod spec is mostly informational and isn't strictly required (but if you give it name: http then you can have the service specify targetPort: http without knowing the specific port number). EXPOSE in the Dockerfile means pretty much nothing in this sequence.
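The named-port variant mentioned above could look roughly like this. It is a sketch based on the manifests in the question; the only changes are the name: http on the container port and the targetPort: http in the Service.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-pod
  template:
    metadata:
      labels:
        app: web-pod
    spec:
      containers:
        - name: web
          image: kahunacohen/hello-k8s
          ports:
            - name: http            # the Service refers to the port by this name
              containerPort: 3000
              protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: NodePort
  selector:
    app: web-pod
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: http              # resolved to 3000 via the pod's named port
```

If the container later moves to a different port, only the containerPort line needs to change; the Service keeps pointing at http.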

This sequence also gives you some flexibility in not needing to know where things are running. Say you have 100 nodes and 3 replicas of your pod; the initial connection can be to any node, and the service will forward to one of the target pods, without the caller needing to know any of these details.

(For completeness, a LoadBalancer type service requests that a load balancer be created outside the cluster; for example, an AWS ELB. This forwards to any of the cluster nodes as in step 1 above. If you're not in a cloud environment and the cluster doesn't know how to create the external load balancer automatically, the service behaves the same as a NodePort service.)
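For comparison, the LoadBalancer form of the question's Service differs only in its type. This is a sketch, and whether an external load balancer actually appears depends on the environment the cluster runs in.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: LoadBalancer      # asks the environment to provision an external load balancer
  selector:
    app: web-pod
  ports:
    - name: http
      protocol: TCP
      port: 80            # port the load balancer and the Service listen on
      targetPort: 3000    # port the container listens on
      # A node port is still allocated by default, so the NodePort path keeps working.
```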

If we reduce this to a local Kubernetes installation (Docker Desktop, minikube, kind) the only real difference is that there's only one node; the underlying infrastructure is still built as though it were a multi-node distributed cluster. How exactly you access a service differs across these installations. In Docker Desktop, from the host system, you can use localhost as the "normal" "external" node IP address in the first step.

-- David Maze
Source: StackOverflow