Hadoop 2.9 MultiNodes
I have three CentOS 7 servers (firewall and SELinux disabled): chadoop1 (master), chadoop2 (slave), and chadoop3 (slave).
When I start the services, the slave nodes do not come up: jps does not show DataNode or NodeManager.
All configuration files are rsync'd to the nodes (except the slaves file).
I tried reformatting the NameNode; it reported OK, but the problem remains.
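(By "reformat" I mean re-running the NameNode format on the master; a sketch of the command I ran, assuming the stock Hadoop 2.x binaries are on the PATH:)

[hadoop@chadoop1 hadoop]$ hdfs namenode -format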
My Hadoop directory is /opt/hadoop.
CONFIGS:
hdfs-site.xml:
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop/dfs/name/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop/dfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
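(Side note: dfs.data.dir and dfs.name.dir still work in Hadoop 2.x but are deprecated aliases; the current key names look like this, with the same values:)

<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop/dfs/name/data</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop/dfs/name</value>
</property>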
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020/</value>
<description>NameNode URI</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
<description>Buffer size</description>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>MapReduce framework name</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:10020</value>
<description>Default port is 10020.</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
<description>Default port is 19888.</description>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/mr-history/tmp</value>
<description>Directory where history files are written by MapReduce jobs.</description>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/mr-history/done</value>
<description>Directory where history files are managed by the MR JobHistory Server.</description>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>MapReduce framework name</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:10020</value>
<description>Default port is 10020.</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
<description>Default port is 19888.</description>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/mr-history/tmp</value>
<description>Directory where history files are written by MapReduce jobs.</description>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/mr-history/done</value>
<description>Directory where history files are managed by the MR JobHistory Server.</description>
</property>
</configuration>
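(Note: the file above repeats mapreduce.* keys, which normally belong in mapred-site.xml; a typical minimal yarn-site.xml for Hadoop 2.x looks more like this sketch, using the stock YARN property names:)

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>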
slaves file (present only on the master; on the slaves it just contains localhost):
chadoop3
chadoop4
Starting the services:
[hadoop@chadoop1 hadoop]$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop/logs/hadoop-hadoop-namenode-chadoop1.out
chadoop4: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-chadoop4.out
chadoop3: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-chadoop3.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-hadoop-secondarynamenode-chadoop1.out
[hadoop@chadoop1 hadoop]$ jps
5603 Jps
5492 SecondaryNameNode
5291 NameNode
[hadoop@chadoop1 hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-hadoop-resourcemanager-chadoop1.out
chadoop3: starting nodemanager, logging to /opt/hadoop/logs/yarn-hadoop-nodemanager-chadoop3.out
chadoop4: starting nodemanager, logging to /opt/hadoop/logs/yarn-hadoop-nodemanager-chadoop4.out
[hadoop@chadoop1 hadoop]$ jps
5492 SecondaryNameNode
5658 ResourceManager
5914 Jps
5291 NameNode
"All configuration files are rsync'd to the nodes (except the slaves file)"
All configuration files must be present on all nodes.
That said, the DataNodes need to know where on the network the NameNode lives, so if those servers really are meant to be slaves, the processes cannot point at localhost; you have to enter the actual hostname.
The same applies to the YARN services.
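(A minimal sketch of that change, assuming the master's hostname is chadoop1 and the default ports are kept; adjust to your cluster:)

core-site.xml, on every node:
<property>
<name>fs.defaultFS</name>
<value>hdfs://chadoop1:8020/</value>
</property>

yarn-site.xml, on every node:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>chadoop1</value>
</property>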
"jps does not show DataNode or NodeManager"
From the output shown, you appear to have checked the services only on the master machine, not on the two slave servers where those services exist. The start scripts only report for one machine, not the cluster, and jps only shows the Java processes of the local machine.
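(To verify, run jps on each slave; a sketch, assuming passwordless SSH as the hadoop user, the same access the start scripts use:)

[hadoop@chadoop1 hadoop]$ ssh chadoop3 jps
[hadoop@chadoop1 hadoop]$ ssh chadoop4 jps

If the DataNode and NodeManager came up, they will appear in that output; if not, check the corresponding files under /opt/hadoop/logs on each slave.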
As an aside, Apache Ambari makes installing and managing a Hadoop cluster much easier.