How to Configure Kubernetes in Hairpin Mode

3/30/2020

I'm trying to enable hairpin connections on my Kubernetes service, on GKE.

I've tried to follow the instructions here: https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/ to reconfigure the kubelet and enable hairpin mode, but it looks like my configuration is never saved, even though the edit command returns without error.

Here is what I try to set when I edit the node:

spec:
  podCIDR: 10.4.1.0/24
  providerID: gce://staging/us-east4-b/gke-cluster-staging-highmem-f36fb529-cfnv
  configSource:
    configMap:
      name: my-node-config-4kbd7d944d
      namespace: kube-system
      kubeletConfigKey: kubelet

Here is my node ConfigMap when I describe it:

Name:         my-node-config-4kbd7d944d
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
kubelet_config:
----
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "hairpinMode": "hairpin-veth"
}
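
For reference, the reconfiguration tutorial creates such a ConfigMap with a command along these lines (a sketch; the local file name kubelet-config.json is an assumption, and --from-file=kubelet=... is what sets the data key to kubelet):

kubectl -n kube-system create configmap my-node-config \
  --from-file=kubelet=kubelet-config.json \
  --append-hash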

I've tried both "edit node" and "patch", with the same result: nothing is saved. Patch returns "no changes made."

Here is the patch command from the tutorial:

kubectl patch node ${NODE_NAME} -p "{\"spec\":{\"configSource\":{\"configMap\":{\"name\":\"${CONFIG_MAP_NAME}\",\"namespace\":\"kube-system\",\"kubeletConfigKey\":\"kubelet\"}}}}"
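
For what it's worth, one way to check whether the patch actually landed (a sketch; NODE_NAME as in the tutorial) is to read the field back:

kubectl get node ${NODE_NAME} -o jsonpath='{.spec.configSource}'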

I also can't find any documentation on where the "hairpinMode" attribute is supposed to be set.

Any help is appreciated!

------------------- edit ----------------

Here is why I think hairpinning isn't working.

root@668cb9686f-dzcx8:/app# nslookup tasks-staging.[my-domain].com
Server:     10.0.32.10
Address:    10.0.32.10#53

Non-authoritative answer:
Name:   tasks-staging.[my-domain].com
Address: 34.102.170.43

root@668cb9686f-dzcx8:/app# curl https://[my-domain].com/python/healthz
hello
root@668cb9686f-dzcx8:/app# nslookup my-service.default
Server:     10.0.32.10
Address:    10.0.32.10#53

Name:   my-service.default.svc.cluster.local
Address: 10.0.38.76

root@668cb9686f-dzcx8:/app# curl https://my-service.default.svc.cluster.local/python/healthz
curl: (7) Failed to connect to my-service.default.svc.cluster.local port 443: Connection timed out

Also, if I issue a request to localhost from my service (not curl), it gets a "connection refused." Issuing requests to the external domain, which should get routed to the same pod, works fine though.

I only have one service, one node, one pod, and two listening ports at the moment.
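
To separate DNS from connectivity, the same checks can be run against the raw IPs from the lookups above (a sketch; -k skips certificate verification, since the serving certificate presumably only matches the external domain):

curl -k https://10.0.38.76/python/healthz
curl -k https://34.102.170.43/python/healthz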

--------------------- including deployment yaml -----------------

Deployment:

spec:
  replicas: 1
  template:
    spec:
      containers:
      - name: my-app
        ports:
        - containerPort: 8080
        - containerPort: 50001
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTPS

Ingress:

apiVersion: extensions/v1beta1
kind: Ingress
spec:
  backend:
    serviceName: my-service
    servicePort: 60000
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: my-service
          servicePort: 60000
      - path: /python/*
        backend:
          serviceName: my-service
          servicePort: 60001

Service:

---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - name: port
    port: 60000
    targetPort: 8080
  - name: python-port
    port: 60001
    targetPort: 50001
  type: NodePort

I'm trying to set up a multi-port application where the main program triggers a script by issuing a request to the local machine on a different port. (I need to run something in Python, but the main app is in Golang.)

It's a simple script, and I'd like to avoid exposing the Python endpoints on the external domain, so I don't have to worry about authentication, etc.
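
For illustration, the in-pod trigger amounts to something like this minimal Go sketch (the plain-HTTP scheme on port 50001 is an assumption; the port and path come from the manifests and logs here):

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Trigger the local Python listener directly, bypassing the Service.
	resp, err := http.Get("http://localhost:50001/healthz")
	if err != nil {
		// This is where the "connection refused" in the log below surfaces.
		fmt.Println("trigger failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("python responded:", resp.Status)
}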

-------------- requests sent from my-service in golang -------------

https://[my-domain]/health: success
https://[my-domain]/python/healthz: success
http://my-service.default:60000/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host 
http://my-service.default/python/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host 
http://my-service.default:60001/python/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host 
http://localhost:50001/healthz: dial tcp 127.0.0.1:50001: connect: connection refused 
http://localhost:50001/python/healthz: dial tcp 127.0.0.1:50001: connect: connection refused 
-- Jason Chen
google-kubernetes-engine
kubernetes

1 Answer

4/2/2020

Kubelet reconfiguration in GKE


You should not reconfigure the kubelet in cloud-managed Kubernetes clusters like GKE. It's not supported, and it can lead to errors and failures.

Hairpinning in GKE


Hairpinning is enabled by default in GKE clusters. You can check whether it's enabled by invoking the command below on one of the GKE nodes:

ifconfig cbr0 | grep PROMISC

The output should look like this:

UP BROADCAST RUNNING PROMISC MULTICAST MTU:1460 Metric:1

The PROMISC flag indicates that hairpinning is enabled.
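
On node images where ifconfig isn't available, an equivalent check (a sketch) is:

ip link show cbr0

where PROMISC should appear in the angle-bracketed flag list, e.g. <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP>.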

Please refer to the official documentation about debugging Services: Kubernetes.io: Debug Services: a Pod fails to reach itself via the Service IP.

Workload


Based only on the Service definition you provided, you should have access to your Python application (served on port 50001) from within the pod hosting it via the addresses below (see the curl sketch after the list):

  • localhost:50001
  • ClusterIP:60001
  • my-service:60001
  • NodeIP:NodePort (check $ kubectl get svc my-service for this port)
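
For example, each of those paths can be probed from inside the pod along these lines (a sketch; the plain-HTTP scheme and the /healthz path are assumptions based on the manifests above):

curl http://localhost:50001/healthz
curl http://my-service.default.svc.cluster.local:60001/healthz
kubectl get svc my-service    # run against the cluster, not in the pod; shows the assigned NodePort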

I tried to run your Ingress resource and it failed to create; please check how an Ingress definition should look. A corrected sketch follows.
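
A minimal variant that does create (a sketch: the metadata section, which the manifest above omits, is required, and the name my-ingress is an assumption):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  backend:
    serviceName: my-service
    servicePort: 60000
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: my-service
          servicePort: 60000
      - path: /python/*
        backend:
          serviceName: my-service
          servicePort: 60001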

Please take a look at the official documentation, where the whole deployment process is explained with examples.

Additionally, please check other related StackOverflow answers.

Please let me know if you have any questions.

-- Dawid Kruk
Source: StackOverflow