Redis - Avoid Data Loss using Cluster (Using Gossip Protocol)

8/29/2018

We would like to deploy Airflow application on Kubernetes on 2 Data centers.

The Airflow Scheduler container generates DAG runs every 1 min, 5 mins and 10 mins. These DAG runs contain the tasks that will be assigned to the Airflow Worker container.

In the process of assigning tasks to the Airflow Worker, the Airflow Scheduler sends the data about the tasks to both MariaDB (which can be considered the source of truth) and Redis.

In MariaDB, a task can have one of the statuses 'queued', 'running', 'success' or 'failed'. While a task is in Redis, it stays in the 'queued' state.

MariaDB records the same status when it receives tasks from the Airflow Scheduler. When Redis hands over a particular queued task to the Worker container, MariaDB changes that task's status to 'running', and when the task finishes executing, its status in MariaDB is changed to 'success'.

The actual problem:

When Redis fails, we still have the queued tasks in MariaDB, but we lose the data in Redis. When Kubernetes brings up a new Redis server, the previously queued tasks are gone - here comes the DATA LOSS.

What can be the solution for this?

Can we use Redis Cluster (gossip protocol) to avoid data loss?

If yes, could you provide any documentation on resolving this problem with this protocol? If not, please suggest an approach that suits my environment and scenario.

-- sireesha rk
airflow-scheduler
hazelcast
kubernetes
mariadb
redis-cluster

1 Answer

8/29/2018

Redis clustering would help with this, but it is a bit of a pain to set up and it is not a complete replacement for backups.

In your case a much simpler solution, in my opinion, would be to incorporate a recovery procedure into your Redis pod startup. You do not have permanent data loss, as you have your MariaDB source of truth, so you can add an init container that runs a script to recover the Redis data from MariaDB.
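A minimal sketch of what such a recovery script could do: read the tasks still marked 'queued' from MariaDB and push them back onto the Redis broker queue. The table name, column names, queue key and message format below are all assumptions (a real Celery broker message carries more fields); the I/O is injected as callables so the logic can be exercised without live servers - in the init container you would wire in a real MySQL client and `redis_client.lpush`.

```python
# Sketch of an init-container recovery script: re-enqueue tasks that are
# still marked 'queued' in the MariaDB source of truth into Redis.
# NOTE: table/column/queue names and the message shape are hypothetical.
import json


def build_messages(queued_rows):
    """Turn rows of queued tasks into broker message strings.

    queued_rows: iterable of (dag_id, task_id, execution_date) tuples,
    e.g. the result of:
      SELECT dag_id, task_id, execution_date
      FROM task_instance WHERE state = 'queued'
    """
    messages = []
    for dag_id, task_id, execution_date in queued_rows:
        messages.append(json.dumps({
            "dag_id": dag_id,
            "task_id": task_id,
            "execution_date": execution_date,
        }))
    return messages


def recover(fetch_queued, push_to_redis):
    """Recovery driver: read queued tasks, push them back to Redis.

    fetch_queued: callable returning the queued rows (wraps the MariaDB query).
    push_to_redis: callable taking one message string (would wrap e.g.
        redis_client.lpush("default", message) in the real init container).
    Returns the number of tasks re-enqueued.
    """
    count = 0
    for message in build_messages(fetch_queued()):
        push_to_redis(message)
        count += 1
    return count


if __name__ == "__main__":
    # Dry run with in-memory stand-ins for MariaDB and Redis.
    rows = [("dag_a", "task_1", "2018-08-29T00:00:00"),
            ("dag_a", "task_2", "2018-08-29T00:05:00")]
    queue = []
    print(recover(lambda: rows, queue.append))
```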

Another approach that would limit your problem significantly would be to use a persistent volume to store the Redis data, as Redis can snapshot its in-memory state at regular intervals. By using a StatefulSet instead of a Deployment to manage your Redis node(s), the pods would get the storage reattached on restart/rescheduling and you'd experience no data loss (or at most the tiny window since the last snapshot).
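As a rough illustration of that second approach, here is a minimal StatefulSet sketch. The resource names, image tag, storage size and snapshot interval are placeholders to adapt; the point is that Redis snapshots to `/data`, which is backed by a per-pod PersistentVolumeClaim, so a restarted or rescheduled pod reattaches the same volume and reloads the last snapshot.

```yaml
# Minimal sketch only - names, image, storage size and the snapshot
# interval ("save 60 1" = snapshot if at least 1 write in 60s) are
# placeholders to adapt to your cluster.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:4.0
          # RDB snapshots are written to /data, the mounted volume
          command: ["redis-server", "--save", "60", "1", "--dir", "/data"]
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: redis-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```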

-- Radek 'Goblin' Pieczonka
Source: StackOverflow