Hadoop 2.9 MultiNodes


I have three CentOS 7 servers (firewall and SELinux disabled): chadoop1 (master), chadoop2 (slave), and chadoop3 (slave).

When I start the services, the slave daemons do not come up: jps does not show DataNode or NodeManager.

All of the configuration files are rsync'd across the nodes (except the slaves file).

I tried reformatting the NameNode; it reports OK, but the problem is the same.

My installation directory is /opt/hadoop.

CONFIGS:

hdfs-site.xml

<configuration>
    <property>
            <name>dfs.data.dir</name>
            <value>/opt/hadoop/dfs/name/data</value>
            <final>true</final>
    </property>
    <property>
            <name>dfs.name.dir</name>
            <value>/opt/hadoop/dfs/name</value>
            <final>true</final>
    </property>
    <property>
            <name>dfs.replication</name>
            <value>2</value>
    </property>
</configuration>

core-site.xml

<configuration>

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020/</value>
    <description>NameNode URI</description>
</property>

<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
  <description>Buffer size</description>
</property>

</configuration>

mapred-site.xml

<configuration>

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>MapReduce framework name</description>
</property>

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>localhost:10020</value>
  <description>Default port is 10020.</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>localhost:19888</value>
  <description>Default port is 19888.</description>
</property>

<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>/mr-history/tmp</value>
  <description>Directory where history files are written by MapReduce jobs.</description>
</property>

<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/mr-history/done</value>
  <description>Directory where history files are managed by the MR JobHistory Server.</description>
</property>

</configuration>

yarn-site.xml

<configuration>

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>MapReduce framework name</description>
</property>

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>localhost:10020</value>
  <description>Default port is 10020.</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>localhost:19888</value>
  <description>Default port is 19888.</description>
</property>

<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>/mr-history/tmp</value>
  <description>Directory where history files are written by MapReduce jobs.</description>
</property>

<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/mr-history/done</value>
  <description>Directory where history files are managed by the MR JobHistory Server.</description>
</property>

</configuration>

slaves file (only on the master; the slaves have localhost in theirs)

chadoop3
chadoop4

Starting the services

[hadoop@chadoop1 hadoop]$ start-dfs.sh
 Starting namenodes on [localhost]
 localhost: starting namenode, logging to /opt/hadoop/logs/hadoop-hadoop-namenode-chadoop1.out
 chadoop4: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-chadoop4.out
 chadoop3: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-chadoop3.out
 Starting secondary namenodes [0.0.0.0]
 0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-hadoop-secondarynamenode-chadoop1.out

[hadoop@chadoop1 hadoop]$ jps
 5603 Jps
 5492 SecondaryNameNode
 5291 NameNode

[hadoop@chadoop1 hadoop]$ start-yarn.sh
 starting yarn daemons
 starting resourcemanager, logging to /opt/hadoop/logs/yarn-hadoop-resourcemanager-chadoop1.out
 chadoop3: starting nodemanager, logging to /opt/hadoop/logs/yarn-hadoop-nodemanager-chadoop3.out
 chadoop4: starting nodemanager, logging to /opt/hadoop/logs/yarn-hadoop-nodemanager-chadoop4.out

[hadoop@chadoop1 hadoop]$ jps
 5492 SecondaryNameNode
 5658 ResourceManager
 5914 Jps
 5291 NameNode
Answer

"All of the configuration files are rsync'd across the nodes (except the slaves file)"

All of the configuration files must be present on all nodes, as sketched below.
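For example, one way to push the configuration from the master to every slave (a sketch, assuming the stock 2.x tarball layout where the config directory is /opt/hadoop/etc/hadoop, consistent with the /opt/hadoop install path in the question):

# run on the master; hostnames taken from the slaves file above
for host in chadoop3 chadoop4; do
    rsync -av /opt/hadoop/etc/hadoop/ "$host:/opt/hadoop/etc/hadoop/"
done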

That said, the DataNodes need to know where the NameNode lives on the network, so if those servers really are meant to be slaves, the processes cannot point at localhost. You must put the master's actual hostname in fs.defaultFS.
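For example, a minimal core-site.xml sketch for this cluster (assuming chadoop1 is the master's hostname and resolves on every node, e.g. via /etc/hosts):

<configuration>

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://chadoop1:8020/</value>
    <description>NameNode URI: the master's real hostname, not localhost</description>
</property>

</configuration>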

The same goes for the YARN services.
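Note, incidentally, that the yarn-site.xml shown above just repeats the mapreduce.* properties from mapred-site.xml; YARN needs its own settings. A minimal yarn-site.xml sketch (again assuming chadoop1 is the master, using the standard yarn.resourcemanager.hostname and yarn.nodemanager.aux-services properties):

<configuration>

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>chadoop1</value>
    <description>ResourceManager host: the master's hostname, not localhost</description>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Shuffle service needed for MapReduce on YARN</description>
</property>

</configuration>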

"jps does not show DataNode and NodeManager"

From the output shown, you appear to have only looked at the master machine, not at the two slaves where those services actually run.

The start scripts launch the daemons over SSH, but jps only shows the Java processes of the local machine, not the cluster, so you must run jps on each slave to see its DataNode and NodeManager.
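For instance, a quick way to check each slave from the master (a sketch, assuming passwordless SSH for the hadoop user, which the start scripts already require, and that jps is on the remote PATH):

for host in chadoop3 chadoop4; do
    echo "== $host =="
    ssh "$host" jps    # should list DataNode and NodeManager on each slave
done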


As an aside, Apache Ambari makes installing and managing a Hadoop cluster much easier.
