Trying to create single node cluster using hadoop
【Title】: Trying to create single node cluster using hadoop 【Posted】: 2018-09-25 13:11:36 【Question】: I am having trouble creating a single node cluster. I have a master node (172.16.x.xx) and a slave node (172.16.x.xxx). Currently, when I run start-dfs.sh and start-yarn.sh, jps shows the following:
8085 NodeManager
7943 ResourceManager
7772 SecondaryNameNode
8350 Jps
As you can see, no DataNode is listed in the jps output.
Below are the configurations I used:
core-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master-hostname:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hdata</value>
  </property>
</configuration>
hdfs-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/vishnu/hadoopdata/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/vishnu/hadoopdata/data/datanode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml configuration
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>vishnu-Latitude-3480:9001</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>256</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>256</value>
  </property>
</configuration>
yarn-site.xml configuration
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.acl.enable</name>
    <value>0</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
I copied the same package to my client machine. Below is the log generated on the master:
2018-09-25 17:39:49,830 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:50,831 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:51,833 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:52,835 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:53,837 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:54,837 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:55,838 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:56,838 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:56,985 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1842429760-172.16.5.93-1537874717995 (Datanode Uuid 0eb1e76d-61a9-4280-aec1-ca83b9da5e33) service to hostname/IP:9000 is shutting down
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.UnregisteredNodeException): Unregistered server: DatanodeRegistration(IP:50010, datanodeUuid=0eb1e76d-61a9-4280-aec1-ca83b9da5e33, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-57;cid=CID-846ea28f-1eac-4959-994c-c7c721b37f39;nsid=1040776628;c=1537874717995)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.verifyRequest(NameNodeRpcServer.java:1580)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java:1431)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.sendHeartbeat(DatanodeProtocolServerSideTranslatorPB.java:119)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:30585)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
at org.apache.hadoop.ipc.Client.call(Client.java:1349)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy15.sendHeartbeat(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:166)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:514)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:645)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:841)
at java.lang.Thread.run(Thread.java:748)
2018-09-25 17:39:56,985 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1842429760-172.16.5.93-1537874717995 (Datanode Uuid 0eb1e76d-61a9-4280-aec1-ca83b9da5e33) service to vishnu-Latitude-3480/172.16.5.93:9000
2018-09-25 17:39:57,086 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-1842429760-172.16.5.93-1537874717995 (Datanode Uuid 0eb1e76d-61a9-4280-aec1-ca83b9da5e33)
2018-09-25 17:39:57,086 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-1842429760-172.16.5.93-1537874717995
2018-09-25 17:39:57,086 WARN org.apache.hadoop.fs.CachingGetSpaceUsed: Thread Interrupted waiting to refresh disk information: sleep interrupted
2018-09-25 17:39:59,087 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2018-09-25 17:39:59,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at Hostname/IP
************************************************************/
【Comments】:
mapred.job.tracker does not exist in Hadoop 2+... which tutorial are you following? Also, you should use the hostname or external IP rather than localhost.
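To make the two points in this comment easy to check, here is a minimal Python sketch that scans the text of a Hadoop `*-site.xml` file and reports the `mapred.job.tracker` property and any value containing `localhost`. The function names `load_properties` and `find_issues` are hypothetical helpers written for this illustration, not part of Hadoop, and the wording of the messages is my own reading of the comment.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def load_properties(xml_text: str) -> Dict[str, str]:
    """Parse the text of a Hadoop *-site.xml file into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name.strip()] = (value or "").strip()
    return props


def find_issues(props: Dict[str, str]) -> List[str]:
    """Return human-readable notes for the two points raised in the comment."""
    issues: List[str] = []
    # Point 1: the comment says this property does not exist in Hadoop 2+.
    if "mapred.job.tracker" in props:
        issues.append("mapred.job.tracker is set; the comment says it is not a Hadoop 2+ setting")
    # Point 2: the comment suggests a hostname or external IP rather than localhost.
    for name, value in props.items():
        if "localhost" in value:
            issues.append(f"{name} is set to {value!r}; consider the hostname or external IP")
    return issues


if __name__ == "__main__":
    # Hypothetical sample built from properties shown in the question.
    sample = """<configuration>
      <property><name>mapred.job.tracker</name><value>vishnu-Latitude-3480:9001</value></property>
      <property><name>yarn.resourcemanager.hostname</name><value>localhost</value></property>
    </configuration>"""
    for issue in find_issues(load_properties(sample)):
        print(issue)
```

To check real files, read each of mapred-site.xml and yarn-site.xml into a string and pass it to `load_properties`. The sketch only reports what it finds; it does not modify any configuration.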
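【Answer 1】:
According to the log, the clusterID of the namenode and the clusterID of the datanode do not match. Did you format the namenode more than once? If so, the namenode gets a new clusterID every time it is formatted, whereas the datanode's clusterID is fixed only once, after the first format. Open the datanode and namenode directories configured in hdfs-site.xml: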
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/vishnu/hadoopdata/data/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/vishnu/hadoopdata/data/datanode</value>
</property>
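Open the VERSION file in the current folder of each directory to see its clusterID. If the two values differ, either change the datanode's clusterID to match the namenode's, or delete the datanode directory and restart.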
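The comparison can also be scripted. Below is a minimal Python sketch, assuming each storage directory holds a `current/VERSION` file made of `key=value` lines that include a `clusterID` entry. The helper names `parse_cluster_id` and `read_cluster_id` are hypothetical, and the two paths are simply the ones from the hdfs-site.xml above; if the namenode and datanode directories live on different machines, run `read_cluster_id` on each host and compare the printed values by hand.

```python
from pathlib import Path
from typing import Optional


def parse_cluster_id(version_text: str) -> Optional[str]:
    """Return the clusterID value from the text of a VERSION file, or None."""
    for line in version_text.splitlines():
        line = line.strip()
        # Skip comment lines (such as the timestamp header) and malformed lines.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "clusterID":
            return value.strip()
    return None


def read_cluster_id(storage_dir: str) -> Optional[str]:
    """Read clusterID from <storage_dir>/current/VERSION, or None if absent."""
    version_file = Path(storage_dir) / "current" / "VERSION"
    if not version_file.is_file():
        return None
    return parse_cluster_id(version_file.read_text())


if __name__ == "__main__":
    # Paths taken from the hdfs-site.xml in the question; adjust for your setup.
    nn_id = read_cluster_id("/home/vishnu/hadoopdata/data/namenode")
    dn_id = read_cluster_id("/home/vishnu/hadoopdata/data/datanode")
    print("namenode clusterID:", nn_id)
    print("datanode clusterID:", dn_id)
    if nn_id is None or dn_id is None:
        print("Could not read one of the VERSION files.")
    elif nn_id == dn_id:
        print("clusterIDs match.")
    else:
        print("clusterIDs differ.")
```

The script is read-only on purpose: it only reports the two IDs, and the fix itself (editing the datanode's VERSION file or deleting the datanode directory) is left as a manual step, since deleting the directory discards any blocks stored there.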