Pods running on the same node can't access each other through a service

9/26/2020

I have installed a Kubernetes cluster on Azure with Kubespray 2.13.2. After installing some pods of my data platform components, I have noticed that pods running on the same node cannot access each other through a service.

For example, my presto coordinator has to access the hive metastore. Let's see the services in my namespace:

kubectl get svc -n ai-developer
NAME                                              TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
metastore                                         ClusterIP      10.233.12.66    <none>           9083/TCP                     4h53m

The hive metastore service is called metastore, through which my presto coordinator has to access the hive metastore pod. Let's see the pods in my namespace:

kubectl get po -n ai-developer -o wide
NAME                                          READY   STATUS      RESTARTS   AGE     IP             NODE       NOMINATED NODE   READINESS GATES
metastore-5544f95b6b-cqmkx                    1/1     Running     0          9h      10.233.69.20   minion-3   <none>           <none>
presto-coordinator-796c4c7bcd-7lngs           1/1     Running     0          5h32m   10.233.69.29   minion-3   <none>           <none>
presto-worker-0                               1/1     Running     0          5h32m   10.233.67.52   minion-1   <none>           <none>
presto-worker-1                               1/1     Running     0          5h32m   10.233.70.24   minion-4   <none>           <none>
presto-worker-2                               1/1     Running     0          5h31m   10.233.68.24   minion-2   <none>           <none>
presto-worker-3                               1/1     Running     0          5h31m   10.233.71.27   minion-0   <none>           <none>

Note that the hive metastore pod metastore-5544f95b6b-cqmkx is running on the node minion-3, where the presto coordinator pod presto-coordinator-796c4c7bcd-7lngs is also running.

I have configured the hive metastore URL thrift://metastore:9083 in the hive catalog properties of the presto coordinator. Presto pods running on the same node as the hive metastore pod cannot reach the metastore, but pods running on other nodes can reach it through the service without any problem.
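
For reference, the catalog configuration is roughly like the sketch below (the file path and connector name here are assumptions; they depend on the Presto version and how the chart mounts the config):

# etc/catalog/hive.properties (hypothetical path)
connector.name=hive-hadoop2
hive.metastore.uri=thrift://metastore:9083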

This is just one example; I have run into several other cases like it so far.
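
One way to see the symptom (a sketch; it assumes nc is available in the images, which may not be the case) is to test the service port from a pod on the same node and from a pod on another node:

# from the coordinator pod on minion-3 (same node as the metastore): this times out
kubectl exec -n ai-developer presto-coordinator-796c4c7bcd-7lngs -- nc -vz -w 3 metastore 9083

# from a worker pod on another node: this connects fine
kubectl exec -n ai-developer presto-worker-0 -- nc -vz -w 3 metastore 9083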

kubenet is used as the network plugin in my kubernetes cluster installed with kubespray on Azure:

/usr/local/bin/kubelet --logtostderr=true --v=2 --node-ip=10.240.0.4 --hostname-override=minion-3 --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/etc/kubernetes/kubelet-config.yaml --kubeconfig=/etc/kubernetes/kubelet.conf --pod-infra-container-image=k8s.gcr.io/pause:3.1 --runtime-cgroups=/systemd/system.slice --hairpin-mode=promiscuous-bridge --network-plugin=kubenet --cloud-provider=azure --cloud-config=/etc/kubernetes/cloud_config
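
With --hairpin-mode=promiscuous-bridge, kubelet is supposed to put the kubenet bridge (normally cbr0) into promiscuous mode. One sanity check I can run on the node (assuming the bridge is indeed named cbr0):

# on minion-3: PROMISC should appear in the flags if promiscuous-bridge mode is in effect
ip link show cbr0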

Any idea?

-- mykidong
kubernetes

3 Answers

9/26/2020

You might be able to overcome this issue by using the fully qualified domain name k8s provides for resolving service IPs, as described in the k8s docs.

In your case it will probably mean changing your thrift://metastore:9083 property to thrift://metastore.ai-developer.svc.cluster.local:9083 (assuming, of course, your cluster domain is configured to be cluster.local).
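
Before changing the property, you could confirm the name resolves from inside one of the presto pods, for example (assuming nslookup is available in the image):

kubectl exec -n ai-developer presto-worker-0 -- nslookup metastore.ai-developer.svc.cluster.local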

-- Yaron Idan
Source: StackOverflow

9/29/2020

After I changed the kube-proxy mode from ipvs to iptables, it works fine!
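
For anyone hitting the same problem: the mode can be changed by setting the kube-proxy mode in the kubespray group vars and re-running the playbook, or directly on a running cluster roughly as sketched below (this assumes the usual kubeadm-style kube-proxy ConfigMap and label; verify the names on your cluster):

# change mode: "ipvs" to mode: "iptables" in the kube-proxy configuration
kubectl -n kube-system edit configmap kube-proxy

# restart the kube-proxy pods so they pick up the new mode
kubectl -n kube-system delete pod -l k8s-app=kube-proxy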

-- mykidong
Source: StackOverflow

9/26/2020

Please check whether the iptables FORWARD chain default policy is ACCEPT. In my case, after setting the FORWARD chain default policy from DROP to ACCEPT, communication between nodes works well.
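
A quick way to check and change this on each node (a sketch; run it as root, and note that a plain iptables -P change does not survive a reboot):

# show the current default policy of the FORWARD chain
iptables -L FORWARD -n | head -1

# set the default policy to ACCEPT if it is DROP
iptables -P FORWARD ACCEPT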

-- Paul
Source: StackOverflow