I'm unable to read from HBase (1.3) using Spark (2.4.3) on Kubernetes. The driver and executor pods are launched successfully; however, when the driver attempts to connect to HBase, the connection fails with this error:
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
I think the root cause is this log line:
INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
The Spark driver and executor pods are looking for ZooKeeper on localhost rather than on the remote host. They also seem to ignore the hbase-site.xml that I provide.
I've placed my hbase-site.xml, with the correct remote host information, in the $SPARK_HOME/conf/ directory of my custom Spark image. I've verified that a connection can be made from Kubernetes to HBase by running a Kubernetes deployment with the custom Spark image, using this command in the YAML to keep the pod up and running:
command: ["sleep"]
args: ["infinity"]
Then I exec into this pod through kubectl exec -it <pod> bash. There, I run the exact same script through spark-submit that I use to read from HBase in Spark local mode, and I'm able to read from HBase successfully. I connect to HBase from Spark through the shc connector.
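The read in test-read.py is roughly along these lines (a minimal sketch of the shc usage; the table name, column family, and column names below are placeholders, not my actual schema):

import json
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("test-read").getOrCreate()

# shc catalog describing the HBase table layout (placeholder table/columns)
catalog = json.dumps({
    "table": {"namespace": "default", "name": "my_table"},
    "rowkey": "key",
    "columns": {
        "key":   {"cf": "rowkey", "col": "key",   "type": "string"},
        "value": {"cf": "cf1",    "col": "value", "type": "string"}
    }
})

# Read through the shc connector; it builds an HBaseConfiguration under the
# hood, which is where hbase-site.xml (the ZooKeeper quorum) must be visible.
df = spark.read \
    .options(catalog=catalog) \
    .format("org.apache.spark.sql.execution.datasources.hbase") \
    .load()

df.show()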
When I run the exact same script, this time with spark-submit pointing to the Kubernetes cluster, it fails.
For some reason, the hbase-site.xml is ignored and the driver and executor pods look for ZooKeeper at localhost:2181.
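As a sanity check, I can add a few lines like the following to the script to see whether the file shipped with --files even reaches the driver and what quorum it declares (a minimal sketch: it only inspects the copied file, not what the HBase client actually loads, and it assumes the --files copy lands in the SparkFiles root):

import xml.etree.ElementTree as ET
from pyspark import SparkFiles

# Files passed with --files are exposed through SparkFiles; parse the shipped
# hbase-site.xml and print the ZooKeeper quorum it declares.
site = ET.parse(SparkFiles.get("hbase-site.xml")).getroot()
for prop in site.findall("property"):
    if prop.findtext("name") == "hbase.zookeeper.quorum":
        print("quorum in shipped hbase-site.xml:", prop.findtext("value"))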
Inside the deployment pod, I use the following two commands to test:
Spark local mode, connecting to the HBase cluster outside Kubernetes:
spark-submit \
--jars=/hbase-jars/* \
--files=gs://<project>/dependencies/hbase-site.xml \
gs://<project>/dependencies/test-read.py
/hbase-jars/ is where the shc and relevant HBase jars are stored. I've placed my hbase-site.xml on GCS for ease of testing. This test successfully returns a read from HBase.
Spark Kubernetes submission:
spark-submit \
--master k8s://https://<ip> \
--deploy-mode cluster \
--conf spark.driver.memory=2G \
--conf spark.executor.memory=2G \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.executor.request.cores=1 \
--conf spark.kubernetes.namespace=default \
--conf spark.authenticate.driver.serviceAccountName=default \
--conf spark.kubernetes.container.image=gcr.io/<project>/spark-hbase:latest \
--jars=/hbase-jars/* \
--files=gs://<project>/dependencies/hbase-site.xml \
gs://<project>/dependencies/test-read.py
This returns the errors from above. I've also connected to the running driver and executor pods to run a local spark-submit there, and I get the same errors as when I run the spark-submit for Kubernetes.
Is there some sort of environment variable that I need to set in order for the driver and executor pods to successfully read the hbase-site.xml? Am I supplying the hbase-site.xml incorrectly in the spark-submit? Help is greatly appreciated!