I'm setting up an HA cluster with a single JobManager, ZooKeeper, and 3 TaskManagers. I'm doing this in Kubernetes and need the JM to address the TMs by their full hostname (hostname -f, not hostname), since they are in a StatefulSet.
Currently, it seems the TMs report their address to the JM as just the short hostname:
2019-04-12 08:58:32,426 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-1:50011] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-1:50011]] Caused by: [flink-taskmanager-1: Name or service not known]
2019-04-12 08:58:32,427 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-0:50011] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-0:50011]] Caused by: [flink-taskmanager-0: Name or service not known]
2019-04-12 08:58:32,431 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-2:50011] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-2:50011]] Caused by: [flink-taskmanager-2: Name or service not known]
I've looked through the options here, specifically taskmanager.host, but it had no effect.
How can I get the TMs to report their address to the JM as either the full hostname, or a custom value?
root@flink-taskmanager-0:/opt/flink# hostname
flink-taskmanager-0
root@flink-taskmanager-0:/opt/flink# hostname -f
flink-taskmanager-0.fl-taskmanager-headless-service.default.svc.cluster.local
To answer my own question: taskmanager.host was the correct configuration. I don't know for sure which change made it start working, but my guess is that it's related to clearing Flink's data in ZooKeeper.
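
Since every pod in a StatefulSet has a different full hostname, taskmanager.host has to be set per pod rather than once in a shared config. Here is a minimal sketch of how that could be done at container startup. The function name set_taskmanager_host and the conf/flink-conf.yaml location under /opt/flink are my assumptions, not something taken from the setup above, so adjust them to your image.

```shell
#!/bin/sh
# Sketch only: set_taskmanager_host is a hypothetical helper, not part of Flink.
# It rewrites the taskmanager.host entry in a Flink config file.
#   $1 = path to the config file (assumed: /opt/flink/conf/flink-conf.yaml)
#   $2 = address the TaskManager should advertise
set_taskmanager_host() {
    conf_file="$1"
    host_value="$2"

    # Drop any existing taskmanager.host line so the key appears only once.
    # grep -v exits non-zero when nothing is left to print, hence "|| true".
    if [ -f "$conf_file" ]; then
        grep -v '^taskmanager\.host:' "$conf_file" > "$conf_file.tmp" || true
        mv "$conf_file.tmp" "$conf_file"
    fi

    # Append the new value.
    echo "taskmanager.host: $host_value" >> "$conf_file"
}
```

In the TaskManager container's entrypoint this would be called before Flink starts, for example as set_taskmanager_host /opt/flink/conf/flink-conf.yaml "$(hostname -f)", so that each pod advertises its own name under the headless service.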