Datanode denied communication with namenode because hostname cannot be resolved

6/8/2017

I run a Hadoop cluster in Kubernetes, with 4 journalnodes and 2 namenodes. Sometimes my datanodes cannot register with the namenodes.

17/06/08 07:45:32 INFO datanode.DataNode: Block pool BP-541956668-10.100.81.42-1496827795971 (Datanode Uuid null) service to hadoop-namenode-0.myhadoopcluster/10.100.81.42:8020 beginning handshake with NN
17/06/08 07:45:32 ERROR datanode.DataNode: Initialization failed for Block pool BP-541956668-10.100.81.42-1496827795971 (Datanode Uuid null) service to hadoop-namenode-0.myhadoopcluster/10.100.81.42:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=10.100.9.45, hostname=10.100.9.45): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=b1babba6-9a6f-40dc-933b-08885cbd358e, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-bceaa23f-ba3d-4749-a542-74cda1e82e07;nsid=177502984;c=0)
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1279)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28539)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

It says:

hadoop-namenode-0.myhadoopcluster/10.100.81.42:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=10.100.9.45, hostname=10.100.9.45)

However, I can ping hadoop-namenode-0.myhadoopcluster, 10.100.81.42, and 10.100.9.45 from both the datanode and the namenode.

/etc/hosts in datanode:

127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.100.9.45 hadoop-datanode-0.myhadoopcluster.default.svc.cluster.local hadoop-datanode-0

/etc/hosts in namenode:

# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.100.81.42 hadoop-namenode-0.myhadoopcluster.default.svc.cluster.local hadoop-namenode-0

I have also already set dfs.namenode.datanode.registration.ip-hostname-check to false in hdfs-site.xml.

I suspect the problem is related to DNS. In the other similar questions I found, Hadoop is not deployed in Kubernetes or Docker containers, so I am posting this one. Please do not mark it as a duplicate.

-- Haoyuan Ge
docker
hadoop
kubernetes

2 Answers

9/4/2017

I hope you have found a resolution to the issue by now. I ran into a similar problem last week; my cluster is set up in a different environment, but the problem context is the same.

Essentially, reverse DNS lookup needs to be set up to solve this issue. If the cluster uses a DNS resolver, this has to be configured at the DNS server level; if the Name Nodes look in the /etc/hosts file to find the Data Nodes, there needs to be an entry for each Data Node there.
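One way to check this from inside the namenode container is a small reverse-lookup test. The sketch below is only an illustration, not Hadoop's own code: the class name ReverseLookupCheck and its helper methods are hypothetical, and it assumes the namenode rejects a datanode when the name it gets back for the IP is just the IP again, which is what the "(ip=10.100.9.45, hostname=10.100.9.45)" part of the error suggests. It uses only java.net.InetAddress from the standard library.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop: asks the local resolver
// (DNS or /etc/hosts) for the name that belongs to an IP address.
class ReverseLookupCheck {

    // Reverse lookup of an IP literal. InetAddress.getHostName() returns
    // the textual IP itself when no reverse mapping can be found.
    static String reverseName(String ip) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(ip); // IP literal: only validated here
        return addr.getHostName();                    // this call does the reverse lookup
    }

    // Assumption: "hostname equals ip" means the reverse lookup failed,
    // matching the shape of the error message in the question.
    static boolean looksUnresolved(String hostname, String ip) {
        return hostname.equals(ip);
    }

    public static void main(String[] args) throws UnknownHostException {
        // Default to the datanode IP from the question; pass another IP as an argument.
        String ip = args.length > 0 ? args[0] : "10.100.9.45";
        String name = reverseName(ip);
        System.out.println("ip=" + ip + ", hostname=" + name);
        if (looksUnresolved(name, ip)) {
            System.out.println("No reverse mapping found for " + ip);
        } else {
            System.out.println("Reverse lookup works for " + ip);
        }
    }
}
```

Run it on the namenode with the datanode's IP as the argument. If it prints the IP back as the hostname, the namenode's resolver has no reverse entry for that datanode, even though ping by IP or by name may still succeed.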

I have also added an answer to an old question on the Hortonworks Community forum. The link is below: https://community.hortonworks.com/questions/24320/datanode-denied-communication-with-namenode.html?childToView=135321#answer-135321

-- Kfactor21
Source: StackOverflow

9/4/2017

In my situation, I added three configuration properties to both the namenode and the datanode:

  • dfs.namenode.datanode.registration.ip-hostname-check: false
  • dfs.client.use.datanode.hostname: false
  • dfs.datanode.use.datanode.hostname: false
-- hxquangnhat
Source: StackOverflow