Excuse my relative networking ignorance, but I've read a lot of docs and still have trouble understanding this (perhaps due to lack of background in networks).
Given this Dockerfile:
from node:lts-slim
RUN mkdir /code
COPY package.json /code/
WORKDIR /code
RUN npm install
COPY server.js /code/
EXPOSE 3000
CMD ["node", "server.js"]
...this deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-deployment
spec:
replicas: 2
selector:
matchLabels:
app: web-pod
template:
metadata:
labels:
app: web-pod
spec:
containers:
- name: web
image: kahunacohen/hello-k8s
ports:
- containerPort: 3000
protocol: TCP
and this service:
apiVersion: v1
kind: Service
metadata:
name: web-service
spec:
type: NodePort
selector:
app: web-pod
ports:
- port: 80
targetPort: 3000
protocol: TCP
name: http
My understanding is that:
Are my assumptions correct? I am unclear why I can't access my app at localhost:80, or just localhost, if the service makes the mapping between the container port and the outside world? What's the point of the mapping between 3000 and 80 in the service?
In short, why do I need NodePort?
There are two networking layers, which we could call "inside the cluster" and "outside the cluster". The Pod and the Service each have their own IP address, but these are only inside the cluster. You need the NodePort to forward a request from outside the cluster to inside the cluster.
In a "real" Kubernetes cluster, you'd make a request...
http://any-kubernetes-node.example.com:31245/
, with a "normal" IP address in the way you'd expect a physical system to have, connecting to the NodePort port, which forwards...http://web-service.default.svc.cluster.local:80/
, with a cluster-internal IP address and the service port, which looks at the pods it selects and forwards...http://10.20.30.40:3000/
, using the cluster-internal IP address of any of the matching pods and the target port from the service.The containerPort:
in the pod spec isn't strictly required (but if you give it name: http
then you can have the service specify targetPort: http
without knowing the specific port number). EXPOSE
in the Dockerfile means pretty much nothing in this sequence.
This sequence also gives you some flexibility in not needing to know where things are running. Say you have 100 nodes and 3 replicas of your pod; the initial connection can be to any node, and the service will forward to all of the target pods, without you needing to know any of these details from the caller.
(For completeness, a LoadBalancer type service requests that a load balancer be created outside the cluster; for example, an AWS ELB. This forwards to any of the cluster nodes as in step 1 above. If you're not in a cloud environment and the cluster doesn't know how to create the external load balancer automatically, it's the same as NodePort.)
If we reduce this to a local Kubernetes installation (Docker Desktop, minikube, kind) the only real difference is that there's only one node; the underlying infrastructure is still built as though it were a multi-node distributed cluster. How exactly you access a service differs across these installations. In Docker Desktop, from the host system, you can use localhost
as the "normal" "external" node IP address in the first step.