How to ingest the Boston Housing dataset into Cassandra in Kubernetes?

6/5/2019

I am new to Kubernetes and have tried to set up my first cluster using minikube. I installed Cassandra using the Helm chart with the following command:

helm install bitnami/cassandra

I have Cassandra running right now on one pod. I would like to explore and understand how I can interact with Cassandra inside my Kubernetes cluster.

My goal right now is therefore to ingest the Boston Housing dataset into Cassandra, and I have tried to read up on how this is done in Kubernetes. Has anyone done anything similar? What is the correct way to ingest data into Cassandra in Kubernetes? I have a hard time finding the right information on how to do this. Is it done through Jobs?

Would love any tips or insights into this.

-- kkss
cassandra
kubernetes

1 Answer

6/10/2019

Before installing Cassandra via Helm, you can fetch the chart into your current local folder via:

$ helm fetch bitnami/cassandra --untar
$ cd cassandra

Then, in the chart's templates folder, create a Job template and add hook annotations to it. Helm will then recognize it as a hook rather than as part of the release.

  ...
  annotations:
    # This is what defines this resource as a hook. Without this line, the
    # job is considered part of the release.
    "helm.sh/hook": post-install # It will run after deploying all resources
    # Job will be deleted after successfully completed
    "helm.sh/hook-delete-policy": hook-succeeded 
    ...

You can see a full example of a Helm hook template in the official documentation.
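To make the hook annotations above more concrete, here is a minimal sketch of what a complete Job template could look like. It is not taken from the chart or from the Helm docs. The file name, the Service host, the user name, the Secret name and key, and the housing-data ConfigMap (assumed to hold a load.cql script and the CSV file) are all placeholders that you would need to adapt to your release. The bitnami/cassandra image is used here only because it ships cqlsh.

```yaml
# templates/ingest-job.yaml  (hypothetical file name inside the fetched chart)
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-ingest"
  annotations:
    # Marks the Job as a hook that runs after the release's resources are created
    "helm.sh/hook": post-install
    # Removes the Job once it has completed successfully
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  backoffLimit: 10                 # allow retries while Cassandra is still starting
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: ingest
          image: bitnami/cassandra   # assumption: any image that provides cqlsh works
          env:
            - name: CASSANDRA_HOST
              value: "{{ .Release.Name }}"     # assumption: replace with your Cassandra Service name
            - name: CASSANDRA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: "{{ .Release.Name }}"  # assumption: Secret created by your chart
                  key: cassandra-password      # assumption: check the Secret's actual key
          command: ["/bin/sh", "-c"]
          args:
            - |
              # load.cql is expected to create the keyspace and table and then
              # run a cqlsh COPY ... FROM '/data/housing.csv' WITH HEADER = TRUE
              cqlsh "$CASSANDRA_HOST" -u cassandra -p "$CASSANDRA_PASSWORD" -f /data/load.cql
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          configMap:
            name: housing-data       # hypothetical ConfigMap with load.cql and housing.csv
```

A post-install hook fires once the release's resources have been created, which is not necessarily when Cassandra is ready to accept connections. That is why the sketch lets the Job retry rather than fail on the first attempt.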

After adding your hook job template, you can install your chart via:

$ # Make sure you are in cassandra folder
$ pwd
~/cassandra

$ # And install
$ helm install cassandra .

For more about Kubernetes Jobs, see the official documentation.

Hope it helps!

-- coolinuxoid
Source: StackOverflow