I am using redis:5.0.1-alpine in my StatefulSet. The StatefulSet has 6 pods, and the Redis cluster is formed with the following command:
redis-cli --cluster create {IPlist is placed here} --cluster-replicas 1
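(For reference, one way to build that IP list is to read the pod IPs straight from Kubernetes. This is only a sketch; the pod name and the app=redis-cluster label are assumptions based on the rest of this post:)

# Run the create command from inside one of the pods so it can reach the pod IPs
kubectl exec -it redis-cluster-redis-cluster-statefulset-0 -- \
  redis-cli --cluster create \
  $(kubectl get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}') \
  --cluster-replicas 1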
Now, if the pods are accidentally deleted or AKS goes out of service, the pods that are recreated after AKS resumes will have different IPs.
I tested this by deliberately deleting the pods; when they are recreated, the cluster state changes to "fail" (it was "ok" when the cluster was initially created).
Also, when I try to access the old data set in the cluster, I get a "cluster is down" error.
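(The failure can be inspected from inside any of the pods; the commands below are a sketch, adjust the pod name to your StatefulSet naming:)

kubectl exec -it redis-cluster-redis-cluster-statefulset-0 -- redis-cli cluster info    # check the cluster_state field
kubectl exec -it redis-cluster-redis-cluster-statefulset-0 -- redis-cli cluster nodes   # the listed IPs no longer match the current pod IPs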
Below is the ConfigMap containing the redis.conf used for cluster creation:
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
  namespace: redis
data:
  update-node.sh: |
    #!/bin/sh
    REDIS_NODES="/data/nodes.conf"
    sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${REDIS_NODES}
    exec "$@"
  redis.conf: |+
    cluster-enabled yes
    cluster-require-full-coverage no
    cluster-node-timeout 15000
    cluster-config-file /data/nodes.conf
    cluster-migration-barrier 1
    appendonly yes
    protected-mode no
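(For completeness, a sketch of how such a ConfigMap is typically wired into the StatefulSet pod template, which is not shown above: the ConfigMap is assumed to be mounted at /conf, and POD_IP is injected via the Downward API so update-node.sh can rewrite the "myself" line on every start.)

containers:
  - name: redis
    image: redis:5.0.1-alpine
    # run the fix-up script first, then start redis-server with the mounted config
    command: ["/conf/update-node.sh", "redis-server", "/conf/redis.conf"]
    env:
      - name: POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
    volumeMounts:
      - name: conf
        mountPath: /conf
      - name: data
        mountPath: /data
volumes:
  - name: conf
    configMap:
      name: redis-cluster
      defaultMode: 0755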
The Redis cluster nodes and slots data is shown in the attached screenshots: redis cluster nodes and slots.
When you restart a single pod, it comes up with a new IP, publishes it to the other pods, and they all update their configuration to reflect the IP change.
If all pods go down and come back up at the same time (for example, when all nodes in the cluster are rebooted), the pods cannot talk to each other because the IPs in their nodes.conf files are wrong.
A possible solution is to update the IPs in nodes.conf on all running pods and then restart them one by one.
I did this by placing the following script in each pod:
recover-pod.sh
#!/bin/sh
set -e
REDIS_NODES_FILE="/data/nodes.conf"
# For every pod IP passed as an argument, ask that node for its own node ID
# and rewrite the matching line in the local nodes.conf with the new IP.
for redis_node_ip in "$@"
do
    redis_node_id=$(redis-cli -h "$redis_node_ip" -p 6379 cluster nodes | grep myself | awk '{print $1}')
    sed -i.bak -e "/^$redis_node_id/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${redis_node_ip}/" ${REDIS_NODES_FILE}
done
And running it from one of the Kubernetes nodes:
recover-cluster.sh
#!/bin/bash
# Run recover-pod.sh inside each of the 6 pods, passing the current pod IPs,
# then patch the StatefulSet to trigger a rolling restart.
for i in {0..5}
do
    echo "Updating the correct IPs in nodes.conf on redis-cluster-redis-cluster-statefulset-$i"
    kubectl exec -it redis-cluster-redis-cluster-statefulset-$i -- /readonly-config/recover-pod.sh $(kubectl get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP} {end}')
done
kubectl patch statefulset redis-cluster-redis-cluster-statefulset --patch '{"spec": {"template": {"metadata": {"labels": {"date": "'`date +%s`'" }}}}}'
This brings the Redis cluster back to a healthy state.
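Note that the final patch only exists to force a rolling restart of the pods; with kubectl 1.15 or later the same effect can be achieved with:

kubectl rollout restart statefulset redis-cluster-redis-cluster-statefulset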