Managing Cassandra data storage size and backups under Kubernetes

11/29/2016

I would like to run a Cassandra cluster under Kubernetes on Google Container Engine using the examples given here: https://github.com/kubernetes/kubernetes/tree/master/examples/storage/cassandra

The file describes three ways to set up the cluster: PetSet (StatefulSet), Replication Controller, and DaemonSet. Each has its pros and cons.

While trying to choose the best setup for my case, I could not figure out how to handle storage and backups.

  1. How can I set or scale the storage size (increase or decrease node/cluster data storage without data loss)?
  2. How do I manage backups and restores?
-- Idan
cassandra
database-backups
google-cloud-platform
kubernetes

2 Answers

2/3/2017

The short answer is that there is no way to do this in Kubernetes itself. Kubernetes does very little in terms of storage management.
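To illustrate how little Kubernetes manages here: with a StatefulSet, about all you declare is the storage size each replica requests, in an entry under spec.volumeClaimTemplates. Below is a minimal Python sketch that builds such an entry. The claim name "cassandra-data" and the size "100Gi" are assumptions for illustration, and the sketch says nothing about resizing an existing volume.

```python
import json


def cassandra_volume_claim_template(size, name="cassandra-data"):
    """Build one entry for a StatefulSet's spec.volumeClaimTemplates list.

    `name` and `size` are illustrative values, not taken from the question.
    """
    return {
        "metadata": {"name": name},
        "spec": {
            # Each Cassandra pod gets its own volume, mounted by one node at a time.
            "accessModes": ["ReadWriteOnce"],
            # The per-replica storage request, e.g. "100Gi".
            "resources": {"requests": {"storage": size}},
        },
    }


template = cassandra_volume_claim_template("100Gi")
# Print the fragment as JSON so it can be inspected or pasted into a manifest.
print(json.dumps(template, indent=2))
```

The size is fixed per replica when the claim is created; total cluster capacity then follows from the number of replicas.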

If you have the flexibility of choosing other solutions, check this out.

They provide a container-based solution that combines compute, network, and storage, so you have full control over all the resources Cassandra requires and can perform snapshot/restore, scale out, scale up/down, etc.
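Independently of that product, the snapshot step can also be done with Cassandra's own nodetool, run inside a pod through kubectl exec. The following Python sketch only builds the command line; the pod name "cassandra-0", the tag, and the keyspace "my_keyspace" are hypothetical, and copying the snapshot files off the node is left out.

```python
import shlex


def snapshot_command(pod, tag, keyspace):
    """Return the argv for running `nodetool snapshot` inside a Cassandra pod.

    All three arguments are placeholders chosen by the caller.
    """
    return [
        "kubectl", "exec", pod, "--",
        "nodetool", "snapshot", "-t", tag, keyspace,
    ]


cmd = snapshot_command("cassandra-0", "backup-2017-02-03", "my_keyspace")
# Show the command as a shell-safe string; pass `cmd` to subprocess.run to execute it.
print(" ".join(shlex.quote(part) for part in cmd))
```

A snapshot is taken per node, so the command would need to be repeated for every pod in the cluster.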

-- afulay
Source: StackOverflow

12/4/2016

You should definitely check out Flocker and FlockerHub from ClusterHQ. I've been experimenting with their products to show, with a proof of concept, that containerized sharded databases can be run in an easy and manageable way. Make sure to check them out: https://clusterhq.com/

They handle data the same way Docker images are handled, so you can push data volumes to, and pull them from, a hub/repository.

-- jonas kint
Source: StackOverflow