Trying to create a single node cluster using Hadoop


【Title】Trying to create single node cluster using hadoop 【Posted】2018-09-25 13:11:36 【Description】:

I'm running into a problem setting up the cluster. I have one master node (172.16.x.xx) and one slave node (172.16.x.xxx). Currently, after I run start-dfs.sh and start-yarn.sh, jps shows the following:

8085 NodeManager
7943 ResourceManager
7772 SecondaryNameNode
8350 Jps

As you can see, no DataNode process is listed.

Here are the configurations I used:

core-site.xml configuration

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master-hostname:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hdata</value>
    </property>
</configuration>

hdfs-site.xml configuration

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/vishnu/hadoopdata/data/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/vishnu/hadoopdata/data/datanode</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

mapred-site.xml configuration

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>vishnu-Latitude-3480:9001</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.resource.mb</name>
        <value>512</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>256</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>256</value>
    </property>
</configuration>

yarn-site.xml configuration

<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.acl.enable</name>
        <value>0</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1536</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>1536</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>128</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

I have copied the same packages to my client machine. Below are the logs generated on the master:

2018-09-25 17:39:49,830 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:50,831 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:51,833 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:52,835 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:53,837 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:54,837 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:55,838 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:56,838 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hostname/IP:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-09-25 17:39:56,985 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1842429760-172.16.5.93-1537874717995 (Datanode Uuid 0eb1e76d-61a9-4280-aec1-ca83b9da5e33) service to hostname/IP:9000 is shutting down
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.UnregisteredNodeException): Unregistered server: DatanodeRegistration(IP:50010, datanodeUuid=0eb1e76d-61a9-4280-aec1-ca83b9da5e33, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-57;cid=CID-846ea28f-1eac-4959-994c-c7c721b37f39;nsid=1040776628;c=1537874717995)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.verifyRequest(NameNodeRpcServer.java:1580)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java:1431)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.sendHeartbeat(DatanodeProtocolServerSideTranslatorPB.java:119)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:30585)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
    at org.apache.hadoop.ipc.Client.call(Client.java:1439)
    at org.apache.hadoop.ipc.Client.call(Client.java:1349)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy15.sendHeartbeat(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:166)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:514)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:645)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:841)
    at java.lang.Thread.run(Thread.java:748)
2018-09-25 17:39:56,985 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1842429760-172.16.5.93-1537874717995 (Datanode Uuid 0eb1e76d-61a9-4280-aec1-ca83b9da5e33) service to vishnu-Latitude-3480/172.16.5.93:9000
2018-09-25 17:39:57,086 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-1842429760-172.16.5.93-1537874717995 (Datanode Uuid 0eb1e76d-61a9-4280-aec1-ca83b9da5e33)
2018-09-25 17:39:57,086 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-1842429760-172.16.5.93-1537874717995
2018-09-25 17:39:57,086 WARN org.apache.hadoop.fs.CachingGetSpaceUsed: Thread Interrupted waiting to refresh disk information: sleep interrupted
2018-09-25 17:39:59,087 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2018-09-25 17:39:59,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at Hostname/IP
************************************************************/

【Comments】:

mapred.job.tracker does not exist in Hadoop 2+... What tutorial are you following? Also, you should use a hostname or external IP rather than localhost

【Answer 1】:

According to the logs, the clusterIDs of the namenode and the datanode do not match. Have you formatted the namenode more than once? If so, the namenode gets a new clusterID on every format, but the datanode's clusterID is fixed at the first format. Open the datanode and namenode directories configured in hdfs-site.xml:
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/vishnu/hadoopdata/data/namenode</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/vishnu/hadoopdata/data/datanode</value>
</property>

Open the VERSION file inside each directory's current folder and look at the clusterID. If the two differ, either change the datanode's clusterID to match the namenode's, or delete the data directories, reformat, and restart.
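As a sketch of that check: on a real cluster you would grep the two VERSION files under the dfs.namenode.name.dir and dfs.datanode.data.dir paths configured above. The demo below fabricates two VERSION files with mismatched IDs in a temp directory, so the comparison itself can be run anywhere:

```shell
# Demo only: the directory layout and clusterID values are fabricated
# to mimic /home/vishnu/hadoopdata/data/{namenode,datanode}/current/VERSION.
tmp=$(mktemp -d)
mkdir -p "$tmp/namenode/current" "$tmp/datanode/current"
echo "clusterID=CID-aaaa1111" > "$tmp/namenode/current/VERSION"
echo "clusterID=CID-bbbb2222" > "$tmp/datanode/current/VERSION"

# Extract the clusterID line from each VERSION file
nn=$(grep '^clusterID=' "$tmp/namenode/current/VERSION")
dn=$(grep '^clusterID=' "$tmp/datanode/current/VERSION")

# A mismatch like this is what makes the DataNode shut down on startup
if [ "$nn" != "$dn" ]; then
    echo "clusterID mismatch: namenode=$nn datanode=$dn"
fi
```

On the real machines, if the IDs differ you can either edit the datanode's VERSION file to use the namenode's clusterID, or wipe the datanode data directory and restart the daemons. Note that reformatting the namenode (`hdfs namenode -format`) erases HDFS metadata, so only do that on a throwaway cluster.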

