Use full hostname for Apache Flink akka monitoring

4/12/2019

I'm setting up a single-JobManager HA cluster with ZooKeeper and 3 TaskManagers. I'm doing this in Kubernetes and need the JM to address the TMs by their full hostname (hostname -f, not hostname), since they are in a StatefulSet.

Currently, it seems the TMs report their address to the JM as just the hostname...

2019-04-12 08:58:32,426 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-1:50011] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-1:50011]] Caused by: [flink-taskmanager-1: Name or service not known]
2019-04-12 08:58:32,427 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-0:50011] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-0:50011]] Caused by: [flink-taskmanager-0: Name or service not known]
2019-04-12 08:58:32,431 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-2:50011] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-2:50011]] Caused by: [flink-taskmanager-2: Name or service not known]

I've looked through the configuration options here, specifically taskmanager.host, but setting it had no effect.

How can I get the TMs to report their address to the JM as either the full hostname, or a custom value?

root@flink-taskmanager-0:/opt/flink# hostname
flink-taskmanager-0
root@flink-taskmanager-0:/opt/flink# hostname -f
flink-taskmanager-0.fl-taskmanager-headless-service.default.svc.cluster.local
-- CodeCorrupt
apache-flink
kubernetes

1 Answer

4/12/2019

To answer my own question: taskmanager.host was the correct configuration. I don't know for sure which change made it start working, but my guess is that it's related to clearing ZooKeeper's Flink data.
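
For anyone landing here, this is a minimal sketch of one way to set taskmanager.host per pod at startup. It is not the exact setup from the question: the helper name set_taskmanager_host is made up for illustration, and the config path in the usage note is an assumption based on the /opt/flink prompt shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flink): write taskmanager.host into a
# Flink config file, replacing any earlier value so the key appears once.
set_taskmanager_host() {
    conf_file="$1"   # path to flink-conf.yaml (assumed location, see note)
    host="$2"        # address the TaskManager should advertise

    # Drop any existing taskmanager.host line. grep exits non-zero when it
    # outputs nothing, so ignore its status.
    if [ -f "$conf_file" ]; then
        grep -v '^taskmanager\.host:' "$conf_file" > "$conf_file.tmp" || true
        mv "$conf_file.tmp" "$conf_file"
    fi

    # Append the new value.
    printf 'taskmanager.host: %s\n' "$host" >> "$conf_file"
}
```

In a TaskManager container's entrypoint this could be called as set_taskmanager_host /opt/flink/conf/flink-conf.yaml "$(hostname -f)" before the TaskManager process starts, so each pod in the StatefulSet advertises its own fully qualified name. If the JM still tries the short names afterwards, stale HA data in ZooKeeper may be worth checking, as guessed above.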

-- CodeCorrupt
Source: StackOverflow