Maintaining a replicated database in Kubernetes

5/18/2016

I have a replicated Cassandra database and would like to know the best way to maintain its data.

Currently I'm using a Kubernetes emptyDir as the volume for the Cassandra container.

  1. Can I use Google's persistent disks for replicated Cassandra DB pods?
  2. If I have 3 Cassandra nodes and one of them fails or is destroyed, what happens to the data on Google's persistent disks?
  3. If all 3 nodes fail, will I still be able to populate the DB data from Google's persistent disks into the new pods that spin up?
  4. How do I back up the DB data that is stored on Google's persistent disks?
-- sravis
cassandra
cassandra-2.0
google-cloud-storage
kubernetes
persistent-storage

1 Answer

5/31/2016

I will answer your questions in the same order:

1: You can use Google's persistent disks for the master Cassandra node, and all the other Cassandra replicas will just use their local emptyDir.

2: When deploying to the cloud, the expectation is that instances are ephemeral and might die at any time. Cassandra is built to replicate data across the cluster for redundancy, so that if an instance dies, the data stored on it is not lost, and the cluster can react by re-replicating the data to other running nodes. You can use a DaemonSet to place a single pod on each node in the Kubernetes cluster, which will give you data redundancy.
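As a rough illustration of the DaemonSet idea, here is a minimal sketch that builds such a manifest in Python and prints it as JSON. The `apps/v1` API version, the `cassandra:3.11` image tag, the `app` label, and the `/var/lib/cassandra` mount path are assumptions for the example, not details from the question; adjust them to your cluster.

```python
import json


def cassandra_daemonset(name="cassandra", image="cassandra:3.11"):
    """Build a DaemonSet manifest (as a dict) that runs one Cassandra pod per node.

    The name, image tag, and mount path are illustrative assumptions.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "volumeMounts": [
                                {"name": "data", "mountPath": "/var/lib/cassandra"}
                            ],
                        }
                    ],
                    # emptyDir lives and dies with the pod on that node.
                    "volumes": [{"name": "data", "emptyDir": {}}],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(cassandra_daemonset(), indent=2))
```

If this were saved as `daemonset.py` (a hypothetical file name), the output could be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.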

3: Is it possible to provide more information here? How will the new pods spin up?

4: Take a snapshot of the disk, or use emptyDir with a sidecar container to periodically snapshot the directory and upload it to Google Cloud Storage.
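The sidecar approach could look roughly like the sketch below, which uses only the Python standard library. The function names, the archive naming scheme, and the `upload` callable are hypothetical; `upload` stands in for whatever pushes the file to Google Cloud Storage (for example a wrapper around `gsutil cp`). Archiving a live data directory may not give a consistent copy, so a real setup would probably run `nodetool snapshot` first and archive that instead.

```python
import tarfile
import time
from pathlib import Path


def snapshot_directory(data_dir, out_dir, now=None):
    """Archive data_dir into a timestamped .tar.gz under out_dir; return its path.

    out_dir must not be inside data_dir, or the archive would include itself.
    """
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime(now))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"cassandra-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store everything under a fixed top-level name inside the archive.
        tar.add(str(data_dir), arcname="data")
    return archive


def run_sidecar(data_dir, out_dir, upload, interval_seconds=3600):
    """Loop forever: snapshot, pass the archive to upload(), then sleep.

    upload is a caller-supplied (hypothetical) function taking the archive path.
    """
    while True:
        upload(snapshot_directory(data_dir, out_dir))
        time.sleep(interval_seconds)
```

In the pod, the sidecar container would mount the same emptyDir volume as the Cassandra container and call `run_sidecar` on that mount path.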

-- George
Source: StackOverflow