I am trying to build a Redis cluster with Kubernetes on Azure, and I run into the exact same problem with two different samples: sanderp.nl/running-redis-cluster-on-kubernetes and github.com/zuxqoj/kubernetes-redis-cluster
Everything goes well until I try to have the different nodes join the cluster with the redis-trib command. At that point I get the infamous endless "Waiting for the cluster to join ...." message.
To see what was happening, I set the loglevel of the Redis pods to debug. I then noticed that the pods do not seem to announce their correct IP when communicating with each other. In fact, it seems that the last byte of the IP is replaced by a zero. For example, if pod1 has IP address 10.1.34.9, I see in pod2's logs:
Accepted clusternode 10.1.34.0:someport
So the pods do not seem to be able to connect back to each other, and the cluster join process never ends.
Now, if I explicitly set cluster-announce-ip on each pod before running redis-trib:
redis-cli -h mypod-ip config set cluster-announce-ip mypod-ip
the redis-trib command then completes successfully and the cluster is up and running.
But this is not a viable solution: if a pod goes down and comes back, its IP may have changed, and I will face the same problem when it tries to rejoin the cluster.
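For reference, the manual workaround can be scripted roughly as follows. This is only a sketch of the stopgap described above, not a fix: the function names and the REDIS_CLI override are made up for illustration, the pod IPs have to be supplied by hand, and it assumes each pod is reachable on the default Redis port.

```shell
#!/bin/sh
# Sketch of the manual workaround: tell each pod to announce its own IP.
# REDIS_CLI is a hypothetical override (handy for dry runs); it defaults to redis-cli.

set_announce_ip() {
    # $1 = IP of the pod; the pod is told to announce that same IP.
    ${REDIS_CLI:-redis-cli} -h "$1" config set cluster-announce-ip "$1"
}

announce_all() {
    # Apply the setting to every IP passed as an argument, stopping on the first failure.
    for ip in "$@"; do
        set_announce_ip "$ip" || return 1
    done
}
```

It would be called as `announce_all 10.1.34.9 10.1.34.10 10.1.34.11` (example IPs) before running redis-trib, and would have to be run again whenever a pod comes back with a new IP.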
Note that I do not encounter any problem when running the samples with minikube.
I am using flannel for Kubernetes networking. Could the problem come from an incorrect flannel configuration? Has anyone encountered the same issue?
You can use StatefulSets to deploy your replicas, so that each pod always has a unique, stable name.
Moreover, you will be able to use service DNS names as hosts. See the official doc "DNS for Services and Pods".
The second example you shared has another part that sets up a Redis cluster using StatefulSets. Try that out.
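
To illustrate the stable names: a pod in a StatefulSet that is governed by a headless service gets a DNS name of the form <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local. Below is a minimal sketch that prints those names. The function name and the example values (redis-cluster, default) are made up, and cluster.local is only the default cluster domain, so yours may differ.

```shell
#!/bin/sh
# Print the stable DNS name of each pod in a StatefulSet.
# Arguments (all example values are hypothetical):
#   $1 = StatefulSet name, $2 = headless service name,
#   $3 = namespace,        $4 = number of replicas
statefulset_hosts() {
    name=$1; service=$2; namespace=$3; replicas=$4
    i=0
    while [ "$i" -lt "$replicas" ]; do
        # Ordinals start at 0 and stay the same across pod restarts.
        echo "${name}-${i}.${service}.${namespace}.svc.cluster.local"
        i=$((i + 1))
    done
}
```

For example, `statefulset_hosts redis-cluster redis-cluster default 6` lists six names, from redis-cluster-0 to redis-cluster-5. Whether redis-trib accepts hostnames directly depends on the Redis version, so you may need to resolve these names to IPs first.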