I have installed a Kubernetes cluster on Azure with Kubespray 2.13.2.
But after installing some pods of my data platform components, I have noticed that pods running on the same node cannot reach each other through a Service.
For example, my Presto coordinator has to access the Hive metastore. Let's look at the services in my namespace:
kubectl get svc -n ai-developer
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metastore ClusterIP 10.233.12.66 <none> 9083/TCP 4h53m
The Hive Metastore service is called metastore, and my Presto coordinator has to reach the Hive metastore pod through it.
Let's look at the pods in my namespace:
kubectl get po -n ai-developer -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metastore-5544f95b6b-cqmkx 1/1 Running 0 9h 10.233.69.20 minion-3 <none> <none>
presto-coordinator-796c4c7bcd-7lngs 1/1 Running 0 5h32m 10.233.69.29 minion-3 <none> <none>
presto-worker-0 1/1 Running 0 5h32m 10.233.67.52 minion-1 <none> <none>
presto-worker-1 1/1 Running 0 5h32m 10.233.70.24 minion-4 <none> <none>
presto-worker-2 1/1 Running 0 5h31m 10.233.68.24 minion-2 <none> <none>
presto-worker-3 1/1 Running 0 5h31m 10.233.71.27 minion-0 <none> <none>
Note that the Hive metastore pod metastore-5544f95b6b-cqmkx is running on the node minion-3, where the Presto coordinator pod presto-coordinator-796c4c7bcd-7lngs is also running.
I have configured the Hive metastore URL thrift://metastore:9083 in the Hive catalog properties of the Presto coordinator.
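For reference, the relevant catalog properties look roughly like this (assuming the standard hive-hadoop2 connector and the usual etc/catalog/hive.properties location; adjust to your deployment):

connector.name=hive-hadoop2
hive.metastore.uri=thrift://metastore:9083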
When a Presto pod runs on the same node as the Hive metastore pod, it cannot reach the metastore, but the pods running on other nodes, where the metastore is not running, can reach it through the service just fine.
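A quick way to see this (assuming a tool like nc is available in the coordinator image; the exact probe may differ) is to exec into the pods and probe the service:

kubectl -n ai-developer exec -it presto-coordinator-796c4c7bcd-7lngs -- nc -vz metastore 9083

From the coordinator pod on minion-3 this never connects, while the same probe from a worker pod on another node succeeds.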
This is just one example; I have run into several other cases like it.
kubenet is the network plugin in my Kubernetes cluster installed with Kubespray on Azure:
/usr/local/bin/kubelet --logtostderr=true --v=2 --node-ip=10.240.0.4 --hostname-override=minion-3 --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/etc/kubernetes/kubelet-config.yaml --kubeconfig=/etc/kubernetes/kubelet.conf --pod-infra-container-image=k8s.gcr.io/pause:3.1 --runtime-cgroups=/systemd/system.slice --hairpin-mode=promiscuous-bridge --network-plugin=kubenet --cloud-provider=azure --cloud-config=/etc/kubernetes/cloud_config
Any idea?
You might be able to overcome this issue by using the fully qualified domain name Kubernetes provides for resolving service IPs, as described in the Kubernetes DNS documentation.
In your case it will probably mean changing your thrift://metastore:9083 property to thrift://metastore.ai-developer.svc.cluster.local:9083 (assuming, of course, your cluster domain is configured to be cluster.local).
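To double-check that the name resolves as expected, you could run a lookup from inside the coordinator pod (assuming nslookup is available in the image):

kubectl -n ai-developer exec -it presto-coordinator-796c4c7bcd-7lngs -- nslookup metastore.ai-developer.svc.cluster.local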
After I changed the kube-proxy mode from ipvs to iptables, it works fine!
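If you deploy with Kubespray, the setting should be the kube_proxy_mode variable in the cluster group vars; on an already-running cluster you can also change it directly, roughly like this (the kube-proxy ConfigMap name and the k8s-app=kube-proxy label are the kubeadm defaults):

# set mode: "iptables" in the kube-proxy configuration
kubectl -n kube-system edit configmap kube-proxy
# restart kube-proxy so it picks up the new mode
kubectl -n kube-system delete pod -l k8s-app=kube-proxy

You may also need to flush leftover IPVS rules on each node (e.g. with ipvsadm -C) before the iptables mode behaves as expected.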
Please check whether the iptables FORWARD chain default policy is ACCEPT. In my case, after setting the FORWARD chain default policy from DROP to ACCEPT, communication between nodes worked well.
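To check and change it on a node (note that iptables -P is not persistent across reboots unless the rules are saved):

# show the FORWARD chain and its default policy
sudo iptables -L FORWARD
# switch the default policy to ACCEPT
sudo iptables -P FORWARD ACCEPT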