I use a Helm chart to deploy two RabbitMQ pods on Kubernetes. Initially the chart deployed fine and the two nodes clustered successfully. I then added a TLS cipher suite restriction to the Kubernetes API server:
--tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
After that change, the RabbitMQ peer discovery plugin could no longer form the cluster, and both pods went into CrashLoopBackOff:
[root@control-01]$ # kubectl get pod -o wide
oe-crmq-0 0/1 CrashLoopBackOff 7 33m 192.168.1.186 worker-01
oe-crmq-1 0/1 CrashLoopBackOff 7 32m 192.168.1.105 worker-02
[root@control-01]$ # kubectl logs oe-crmq-0
  ##  ##
  ##  ##      RabbitMQ 3.7.5. Copyright (C) 2007-2018 Pivotal Software, Inc.
  ##########  Licensed under the MPL.  See http://www.rabbitmq.com/
  ######  ##
  ##########  Logs: /var/log/rabbitmq/rabbit@oe-crmq-0.log
                    /var/log/rabbitmq/rabbit@oe-crmq-0_upgrade.log
Starting broker...
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.default.svc.cluster.local\\",8443}},\n {inet,[inet],etimedout}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,164}]},{rabbit_mnesia,init_with_lock,3,[{file,\"src/rabbit_mnesia.erl\"},{line,144}]},{rabbit_mnesia,init,0,[{file,\"src/rabbit_mnesia.erl\"},{line,111}]},{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,run_step,2,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit_boot_steps,run_boot_steps,1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit,start,2,[{file,\"src/rabbit.erl\"},{line,801}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,"{failed_connect,[{to_address,{\"kubernetes.defau
Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done
So I tried to set the matching cipher for RabbitMQ in advanced.config:
bash-4.2$ cat advanced.config
%% List allowed ciphers
[
  {ssl, [{versions, ['tlsv1.2']},
         {ssl_options, [{ciphers, [
                          {ecdhe_rsa,aes_128_gcm,null,sha256}
                        ]},
                        {fail_if_no_peer_cert,false}]}]}
].
However, this config did not bring much improvement. The connection to kubernetes.default.svc.cluster.local still timed out.
After upgrading Erlang to the latest version, the issue was resolved.
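
For anyone debugging a similar failure, it can help to check separately whether the API server endpoint is reachable over TCP and whether it completes a TLS 1.2 handshake with only the restricted suite. The following is a minimal sketch in Python, not part of the original setup. It assumes Python 3.7+ is available somewhere on the pod network, and it uses the OpenSSL name ECDHE-RSA-AES128-GCM-SHA256 for TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256. The function names make_context and probe are made up for this illustration.

```python
import socket
import ssl

# OpenSSL name for TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
CIPHER = "ECDHE-RSA-AES128-GCM-SHA256"


def make_context(cipher=CIPHER):
    """Build a TLS 1.2-only client context limited to one cipher suite."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    ctx.set_ciphers(cipher)
    # Diagnostic probe only: certificate checks are skipped on purpose.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe(host, port, timeout=5.0):
    """Return (stage, detail) describing how far the connection got."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # DNS failure, refused connection, or timeout before TLS starts.
        return ("tcp_failed", repr(exc))
    try:
        with make_context().wrap_socket(sock, server_hostname=host) as tls:
            # cipher() returns (name, protocol, secret_bits).
            return ("tls_ok", tls.cipher()[0])
    except (ssl.SSLError, OSError) as exc:
        sock.close()
        return ("tls_failed", repr(exc))


if __name__ == "__main__":
    # Host and port are taken from the error message in the log above.
    print(probe("kubernetes.default.svc.cluster.local", 8443))
```

A tcp_failed result means the connection never got as far as TLS, while tls_failed means the endpoint was reachable but the handshake with this single suite did not complete. This only shows what a Python/OpenSSL client can negotiate. It says nothing about which suites the Erlang version inside the RabbitMQ image supports.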