Building a Hadoop HA Cluster Based on ZooKeeper
Cluster plan
ZooKeeper cluster:
192.168.142.12 (bigdata12)
192.168.142.13 (bigdata13)
192.168.142.14 (bigdata14)
Hadoop cluster:
192.168.142.12 (bigdata12) NameNode1 ResourceManager1 JournalNode
192.168.142.13 (bigdata13) NameNode2 ResourceManager2 JournalNode
192.168.142.14 (bigdata14) DataNode1 NodeManager1
192.168.142.15 (bigdata15) DataNode2 NodeManager2
1. Preparation:
(1) Turn off the firewall:
Check firewall status: systemctl status firewalld.service
Stop the firewall: systemctl stop firewalld.service
Disable the firewall permanently: systemctl disable firewalld.service
(2) Install the JDK and configure the environment variables
tar -zxvf jdk-8u144-linux-x64.tar.gz -C ~/training/
Set the environment variables: vi ~/.bash_profile
JAVA_HOME=/root/training/jdk1.8.0_144
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
Apply the environment variables: source ~/.bash_profile
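A quick sanity check (not in the original steps) that the new JDK is picked up:
java -version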
(3) Install Hadoop and set the environment variables:
Extract: tar -zxvf hadoop-2.7.3.tar.gz -C ~/training/
Set the environment variables:
vi ~/.bash_profile
HADOOP_HOME=/root/training/hadoop-2.7.3
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH
Apply the environment variables:
source ~/.bash_profile
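Likewise, you can verify that Hadoop is now on the PATH:
hadoop version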
(4) Set up passwordless SSH login:
a. Generate a key pair:
ssh-keygen -t rsa
(the keys are stored in the ~/.ssh directory)
b. Distribute the public key:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@bigdata12 (repeat for bigdata13, bigdata14, and bigdata15)
c. Verify:
ssh bigdata12
(5) Configure the hostnames in the /etc/hosts file:
vi /etc/hosts
192.168.142.12 bigdata12
192.168.142.13 bigdata13
192.168.142.14 bigdata14
192.168.142.15 bigdata15
2. Install and configure the ZooKeeper cluster: refer to the separate "ZooKeeper installation notes".
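The ZooKeeper setup itself is covered elsewhere, but as a reference sketch for this three-node quorum (the dataDir path and ZooKeeper version below are assumptions; adjust them to your own installation), zoo.cfg could look like:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/root/training/zookeeper-3.4.10/tmp
clientPort=2181
server.1=bigdata12:2888:3888
server.2=bigdata13:2888:3888
server.3=bigdata14:2888:3888
Each node also needs a myid file under dataDir containing its own id (1, 2, or 3).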
3. Configure the Hadoop cluster (install on bigdata12):
(1) Edit the hadoop-env.sh file (on bigdata12):
export JAVA_HOME=/root/training/jdk1.8.0_144
(2) Edit the core-site.xml file (on bigdata12):
<configuration>
<!-- Set the HDFS nameservice to ns1 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<!-- the tmp directory must be created in advance -->
<value>/root/training/hadoop-2.7.3/tmp</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>bigdata12:2181,bigdata13:2181,bigdata14:2181</value>
</property>
</configuration>
(3) Edit hdfs-site.xml (on bigdata12):
<configuration>
<!-- The HDFS nameservice is ns1; it must match the value in core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<!-- ns1 has two NameNodes: nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>bigdata12:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>bigdata12:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>bigdata13:9000</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>bigdata13:50070</value>
</property>
<!-- Where the NameNode edit log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bigdata12:8485;bigdata13:8485/ns1</value>
</property>
<!-- Where the JournalNode stores its data on the local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<!-- the journal directory must be created in advance -->
<value>/root/training/hadoop-2.7.3/journal</value>
</property>
<!-- Enable automatic failover for the NameNodes -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Client failover proxy provider implementation -->
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; multiple methods are separated by newlines, i.e. each method takes one line -->
<!-- Why does an HA setup need a fencing mechanism?
Without fencing, split-brain can occur: if for some reason the FailoverController cannot communicate properly with a NameNode and acts on wrong information, more than one NameNode may end up in the active state at the same time. The DataNodes then cannot tell which one is the real NameNode.
-->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- The sshfence mechanism requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- sshfence timeout in milliseconds; fencing fails if it takes longer than 30 seconds -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
(4) Edit the mapred-site.xml file (on bigdata12):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(5) Edit the yarn-site.xml file (on bigdata12):
<configuration>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- The cluster id of the ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<!-- Logical names of the ResourceManagers -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Hostname of each ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata12</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata13</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata12:2181,bigdata13:2181,bigdata14:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
(6) Edit the slaves file (/root/training/hadoop-2.7.3/etc/hadoop) to set the worker nodes:
bigdata14
bigdata15
(7) Create the directories (on bigdata12):
/root/training/hadoop-2.7.3/tmp
/root/training/hadoop-2.7.3/journal
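One way to create both (assuming they do not exist yet):
mkdir -p /root/training/hadoop-2.7.3/tmp /root/training/hadoop-2.7.3/journal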
(8) Copy the configured Hadoop directory to the other nodes:
scp -r /root/training/hadoop-2.7.3/ root@bigdata13:/root/training/
scp -r /root/training/hadoop-2.7.3/ root@bigdata14:/root/training/
scp -r /root/training/hadoop-2.7.3/ root@bigdata15:/root/training/
(9) Start the ZooKeeper cluster (on bigdata12, bigdata13, and bigdata14):
zkServer.sh start
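After starting it on all three ZooKeeper nodes, you can confirm the quorum has formed (one node should report Mode: leader, the others Mode: follower):
zkServer.sh status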
(10) Start the JournalNodes individually (on bigdata12 and bigdata13):
hadoop-daemon.sh start journalnode
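A quick check on bigdata12 and bigdata13 should now show a JournalNode process:
jps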
(11) Format HDFS on the NameNode (on bigdata12):
hdfs namenode -format
(12) Copy the dfs directory from bigdata12 to bigdata13:
Copy /root/training/hadoop-2.7.3/tmp/dfs into /root/training/hadoop-2.7.3/tmp on bigdata13:
scp -r dfs/ root@bigdata13:/root/training/hadoop-2.7.3/tmp
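A common alternative to copying the metadata directory (not the approach used in this walkthrough) is to start the freshly formatted NameNode on bigdata12 and then bootstrap the standby on bigdata13:
hadoop-daemon.sh start namenode        (run on bigdata12)
hdfs namenode -bootstrapStandby        (run on bigdata13)
Either way, bigdata13 ends up with a copy of the initial namespace metadata.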
(13) Format ZooKeeper (on bigdata12):
hdfs zkfc -formatZK
Log output: INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
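Optionally, the znode can be confirmed with the ZooKeeper CLI:
zkCli.sh -server bigdata12:2181
ls /hadoop-ha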
(14) Start the Hadoop cluster (on bigdata12 or bigdata13):
start-all.sh
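Once the cluster is up, you can check which NameNode is active and which is standby:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
The web UIs at http://bigdata12:50070 and http://bigdata13:50070 should show one NameNode as active and the other as standby.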
(15) Start the other ResourceManager separately (on bigdata12 or bigdata13, whichever one has not been started yet):
yarn-daemon.sh start resourcemanager
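The ResourceManager HA state can be checked in the same way:
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2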