I am trying out local Kubernetes (Docker on Mac) and trying to submit a Spark job. The Spark job connects to a PostgreSQL database and does some calculations.
PostgreSQL is running in my Kubernetes cluster, and since I have published it, I can access it from the host via localhost:5432. However, when the Spark application tries to connect to PostgreSQL, it throws
Exception in thread "main" org.postgresql.util.PSQLException: Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
kubectl cluster-info
Kubernetes master is running at https://kubernetes.docker.internal:6443
KubeDNS is running at https://kubernetes.docker.internal:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubectl get service postgresql-published
kubectl describe service spark-store-1588217023181-driver-svc
Name: spark-store-1588217023181-driver-svc
Namespace: default
Labels: <none>
Annotations: <none>
Selector: spark-app-selector=spark-533ecb8556b6439eb938d487cc77c330,spark-role=driver
Type: ClusterIP
IP: None
Port: driver-rpc-port 7078/TCP
TargetPort: 7078/TCP
Endpoints: <none>
Port: blockmanager 7079/TCP
TargetPort: 7079/TCP
Endpoints: <none>
Session Affinity: None
How can I make my Spark job have access to the PostgreSQL service?
NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)
postgresql-published   LoadBalancer   10.106.15.112   localhost     5432:31277
This means that the service shall be accessible within the cluster at 10.106.15.112:5432 or postgresql-published:5432, and externally at localhost:31277.
Please note that for a Pod, localhost is the Pod itself. In this very case localhost looks ambiguous; however, that is how the expose works.
localhost is there in EXTERNAL-IP, but the Kubernetes cluster DNS system (CoreDNS) does not know how to resolve it to an IP address. EXTERNAL-IP is supposed to be resolved by an external DNS server, and it is generally meant for connecting to Postgres from outside the Kubernetes cluster (i.e. from another system, or from the Kubernetes nodes), not from inside the cluster (i.e. from another pod).
Postgres should be accessible from the Spark pod via 10.106.15.112:5432 or postgresql-published:5432, because the Kubernetes cluster DNS system knows how to resolve those.
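For illustration, a minimal sketch of what the JDBC read inside the Spark job could look like once it targets the in-cluster service name instead of localhost; the database mydb, the table sales, the user and the password handling are hypothetical placeholders, and the org.postgresql JDBC driver jar needs to be on the job's classpath.

import org.apache.spark.sql.SparkSession

object ReadFromPostgres {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("postgres-read").getOrCreate()

    // Point the JDBC URL at the Kubernetes service name, not localhost.
    // "mydb", "sales", the user and the password source are placeholders.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://postgresql-published:5432/mydb")
      .option("driver", "org.postgresql.Driver")
      .option("dbtable", "sales")
      .option("user", "postgres")
      .option("password", sys.env("PGPASSWORD"))
      .load()

    df.show()
    spark.stop()
  }
}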
Test the Postgres connectivity:
kubectl run postgresql-postgresql-client --rm --tty -i --restart='Never' --namespace default --image bitnami/postgresql --env="PGPASSWORD=<HERE_YOUR_PASSWORD>" --command -- psql --host <HERE_HOSTNAME=SVC_OR_IP> -U <HERE_USERNAME>
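Replace <HERE_YOUR_PASSWORD>, <HERE_HOSTNAME=SVC_OR_IP> and <HERE_USERNAME> with your own values; for the host, use postgresql-published or 10.106.15.112. If psql gets a prompt from inside the cluster, the same host should work in the Spark job's JDBC URL; if the connection is still refused there as well, the problem is with the PostgreSQL service itself rather than with the Spark setup.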