Resolving a pod's IP by name

10/9/2018

First off, a disclaimer: I have only been using Azure's Kubernetes service (AKS) for a short while, so my apologies if this turns out to be an easy question.

I have two Kubernetes services running in AKS. I want these services to be able to discover each other by service name. The pods associated with these services are each given an IP from the subnet I've assigned to my cluster:

$ kubectl get pods -o wide
NAME      READY   STATUS    RESTARTS   AGE   IP         ...
tom       1/1     Running   0          69m   10.0.2.10  ...
jerry     1/1     Running   5          67m   10.0.2.21  ...

If I make REST calls between these services using their pod IPs directly, the calls work as expected. Of course I don't want to use hard-coded IPs. In reading up on kube-dns, my understanding is that it creates DNS entries for registered services. The tests I've done confirm this, but the IP addresses assigned to the DNS entries are not the IP addresses of the pods. For example:

$ kubectl exec jerry -- ping -c 1 tom.default
PING tom.default (10.1.0.246): 56 data bytes
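
As I understand it, the name tom.default works because kube-dns publishes each service as <service>.<namespace>.svc.cluster.local, and the pod's DNS search path lets the shorter forms resolve too. From a pod in the same namespace, all of these should resolve to the same address:

$ kubectl exec jerry -- nslookup tom
$ kubectl exec jerry -- nslookup tom.default
$ kubectl exec jerry -- nslookup tom.default.svc.cluster.local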

The IP address associated with the service tom is its so-called "cluster IP":

$ kubectl get services
NAME   TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
tom    ClusterIP   10.1.0.246   <none>        6010/TCP   21m
jerry  ClusterIP   10.1.0.247   <none>        6040/TCP   20m

The same is true of the service jerry. The problem is that REST calls to these cluster IPs do not work; even a simple ping times out. So my question is: how can I associate the kube-dns entry that's created for a service with the pod IP instead of the cluster IP?
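
An aside from reading the Kubernetes docs: a "headless" service, i.e. one with clusterIP: None, appears to be the direct way to make DNS return pod IPs instead of a virtual IP. Something like this, though I haven't tried it on my cluster:

apiVersion: v1
kind: Service
metadata:
  name: tom-headless    # hypothetical name, for illustration only
spec:
  clusterIP: None       # "headless": DNS resolves to the backing pod IPs
  ports:
  - port: 6010
  selector:
    app: tom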

Based on the posted answer, I updated my yml file for "tom" as follows:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: tom
spec:
  template:
    metadata:
      labels:
        app: tom
    spec:
      containers:
      - name: tom
        image: myregistry.azurecr.io/tom:latest
        imagePullPolicy: Always
        ports:
          - containerPort: 6010
---
apiVersion: v1
kind: Service
metadata:
  name: tom
spec:
  ports:
  - port: 6010
    name: "6010"
  selector:
    app: tom

and then reapplied it. When I resolve tom.default, though, I still get the cluster IP, not the pod IP. I'm still missing part of the puzzle.
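
From what I've read, one quick way to check whether a service is actually selecting a pod is to list its endpoints:

$ kubectl get endpoints tom

If the ENDPOINTS column shows <none>, the service's selector doesn't match the pod's labels.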

Update: As requested, here's the describe output for tom:

$ kubectl describe service tom
Name:              tom
Namespace:         default
Labels:            <none>
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"tom","namespace":"default"},"spec":{"ports":[{"name":"6010","po...
Selector:          app=tom
Type:              ClusterIP
IP:                10.1.0.139
Port:              6010  6010/TCP
TargetPort:        6010/TCP
Endpoints:         10.0.2.10:6010

The output is similar for the service jerry. As you can see, the endpoint is what I'd expect: 10.0.2.10 is the IP assigned to the pod backing the service tom. Kube-dns, though, resolves the name "tom" to the cluster IP, not the pod IP:

$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE   IP ...
tom-b4ccbfb97-wfmjp     1/1     Running   0          15h   10.0.2.10
jerry-dd8fbf98f-8jgw7   1/1     Running   0          14h   10.0.2.20

$ kubectl exec jerry-dd8fbf98f-8jgw7 -- nslookup tom
Name:      tom
Address 1: 10.1.0.139 tom.default.svc.cluster.local

This doesn't really matter, of course, as long as REST calls are routed to the expected pod (which makes sense as a design: pod IPs change whenever a pod is rescheduled, while the cluster IP is stable). I've had some success with this today:

$ kubectl exec jerry-5554b956b-9kpj7 -- wget -O - http://tom:6010/actuator/health
{"status":"UP"}

This shows that even though the name "tom" resolves to the cluster IP, there is routing in place that makes sure the call gets to the pod. I've tried the same call from service tom to service jerry, and that also works. Curiously, a loopback call, from tom to tom, times out:

$ kubectl exec tom-5c68d66cf9-dxlmf -- wget -O - http://tom:6010/actuator/health
Connecting to tom:6010 (10.1.0.139:6010)
wget: can't connect to remote host (10.1.0.139): Operation timed out
command terminated with exit code 1

If I use the pod IP explicitly, the call works:

$ kubectl exec tom-5c68d66cf9-dxlmf -- wget -O - http://10.0.2.10:6010/actuator/health
{"status":"UP"}

So for some reason the routing doesn't work in the loopback case. I can probably get by with that since I don't think we'll need to make calls back to the same service. It is puzzling though.
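
I suspect the "routing in place" is kube-proxy, which (as I understand it) programs iptables DNAT rules on each node to translate the cluster IP into one of the service's endpoint IPs. A pod reaching itself through its own service VIP is apparently a special case that requires hairpin NAT on the node, which might explain the loopback timeout. If we ever do need a self-call, hitting localhost should sidestep the cluster IP entirely (assuming the app listens on all interfaces):

$ kubectl exec tom-5c68d66cf9-dxlmf -- wget -O - http://localhost:6010/actuator/health

For anyone curious, the DNAT rules should be visible on a node (assuming you can get a shell on one) with something like:

$ sudo iptables-save -t nat | grep 10.1.0.139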

Peter

-- user3280383
azure
azure-aks
kubernetes

1 Answer

10/9/2018

This means you didn't publish ports through your service (or used the wrong labels). What you are trying to achieve is exactly what services are for; you just need to fix your service definition so that it works properly.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: xxx-name
spec:
  template:
    metadata:
      labels:
        app: xxx-label
    spec:
      containers:
      - name: xxx-container
        image: kmrcr.azurecr.io/image:0.7
        imagePullPolicy: Always
        ports:
          - containerPort: 7003
          - containerPort: 443

---
apiVersion: v1
kind: Service
metadata:
  name: xxx-service
spec:
  ports:
  - port: 7003
    name: "7003"
  - port: 443
    name: "443"
  selector:
    app: xxx-label   # must match your pod label
  type: LoadBalancer

Notice how this exposes the same ports the container is listening on and uses the same label as the selector to determine which pods the traffic must go to.
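
If the service port ever needs to differ from the container port, set targetPort explicitly (when omitted it defaults to port). A sketch, using made-up numbers:

ports:
- port: 80          # the port clients hit via the service name
  targetPort: 7003  # the port the container actually listens on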

-- 4c74356b41
Source: StackOverflow