I am using the Flink 1.7 dashboard and select a streaming job. This should show me some metrics, but the metrics view just keeps loading.
I deployed the same job in a Flink 1.5 cluster, and there I can see the metrics. Flink is running in Docker Swarm, but if I run Flink 1.7 with docker-compose (not in the swarm), it works.
I can make it work by deleting the hostname in the docker-compose.yaml file:
version: "3"
services:
  jobmanager17:
    image: flink:1.7.0-hadoop27-scala_2.11
    hostname: "{{.Node.Hostname}}"
    ports:
      - "8081:8081"
      - "9254:9249"
    command: jobmanager
....
After deleting the hostname:
version: "3"
services:
  jobmanager17:
    image: flink:1.7.0-hadoop27-scala_2.11
    ports:
      - "8081:8081"
      - "9254:9249"
    command: jobmanager
....
and now the metrics work, but without the hostname...
Is it possible to have both?
PS: I read something about 'detached mode', but I don't use it.
I guess you are running your cluster on Kubernetes or Docker Swarm. With Flink 1.7 on Kubernetes, you need to make sure the task managers register with the job manager using their IP addresses and not their hostnames. If you look at the job manager's log, you'll find a lot of warnings that the task manager can't be reached.
You can do that by defining the taskmanager.host parameter. An example deployment might look like this:
apiVersion: extensions/v1beta1
kind: Deployment
....
spec:
  template:
    spec:
      containers:
        - name: "<%= name %>"
          args: ["taskmanager", "-Dtaskmanager.host=$(K8S_POD_IP)"]
          env:
            - name: K8S_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
If you are not running on Kubernetes, it might be worth trying to pass this parameter manually (by providing an IP address that is reachable from the job manager as taskmanager.host).
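For a docker-compose or Swarm setup, passing the parameter manually might look roughly like the sketch below. This is only an illustration under assumptions: the service name taskmanager17 and the address 10.0.0.5 are hypothetical placeholders (the original compose file is truncated, so the real task manager service is not shown), and the address has to be replaced by one that the job manager can actually reach.

```yaml
version: "3"
services:
  # Hypothetical task manager service; the name is not from the original file.
  taskmanager17:
    image: flink:1.7.0-hadoop27-scala_2.11
    # Same idea as the Kubernetes "args" above: register with an explicit IP.
    # 10.0.0.5 is a placeholder, not a real address from this setup.
    command: taskmanager -Dtaskmanager.host=10.0.0.5
```

Since addresses in a swarm are usually assigned dynamically, a hard-coded IP like this is mainly useful for checking whether hostname-based registration is really the cause.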
Hope that helps.
Update: Flink 1.8 solves the problem. The property taskmanager.network.bind-policy is set to "ip" by default, which does more or less the same as the workaround described above (https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#taskmanager).
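For illustration, the setting from the update could also be written out explicitly in flink-conf.yaml. This is a minimal sketch that assumes Flink 1.8 or later; because "ip" is already the default there, the entry should only be needed if the value has been overridden somewhere.

```yaml
# flink-conf.yaml (Flink 1.8+)
# Let task managers bind and register by IP address when taskmanager.host is not set.
taskmanager.network.bind-policy: ip
```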