HDFS HA: Manual Failover Configuration, Automatic Failover Configuration, and YARN-HA Cluster Configuration
Posted by 爱上口袋的天空
I. HA Overview
- HA (High Availability) means the service stays up without interruption, 24/7.
The key strategy for achieving high availability is eliminating single points of failure. Strictly speaking, HA is a per-component mechanism: HDFS has its own HA, and so does YARN.
Before Hadoop 2.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster.
The NameNode affects HDFS availability in two main ways:
- If the NameNode machine fails unexpectedly (for example, it crashes), the cluster is unusable until an administrator restarts it.
- If the NameNode machine needs a software or hardware upgrade, the cluster is likewise unusable for the duration.
HDFS HA solves this by running two NameNodes in an Active/Standby pair, giving the cluster a hot standby for the NameNode. When a failure occurs, such as a machine crash, or when a machine needs maintenance, the NameNode role can be switched to the other machine quickly.
II. HDFS-HA Working Mechanism
Eliminate the single point of failure by running two NameNodes.
1. Key points of HDFS-HA
1.1 Metadata management has to change
- Each NameNode keeps its own copy of the metadata in memory.
- Only the Active NameNode may write to the edits log.
- Both NameNodes can read the edits log.
- The shared edits log lives in shared storage (qjournal and NFS are the two mainstream implementations).
1.2 A state-management module is needed
A zkfailover process runs permanently on each NameNode host. Each zkfailover monitors its local NameNode and records its state in ZooKeeper. When a state transition is needed, zkfailover performs the switch, taking care to prevent split-brain.
1.3 Passwordless SSH login must work between the two NameNodes
1.4 Fencing: at any moment, exactly one NameNode serves clients
III. Cluster Planning
IV. HDFS HA Cluster Configuration
1. Copy hadoop-2.7.2 from /opt/module/ to /opt/module/HA/:
[kgf@hadoop20 module]$ mkdir HA
[kgf@hadoop20 module]$ ll
total 8
drwxrwxr-x.  2 kgf kgf    6 Jun 12 08:49 HA
drwxr-xr-x. 16 kgf kgf 4096 Jun  3 11:38 hadoop-2.7.2
drwxr-xr-x.  7 kgf kgf  245 Oct  6  2018 jdk1.8.0_191
drwxr-xr-x. 11 kgf kgf 4096 Jun 12 07:15 zookeeper-3.4.10
[kgf@hadoop20 module]$ cp -r hadoop-2.7.2 HA/
2. Configure /opt/module/HA/hadoop-2.7.2/etc/hadoop/core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<!-- Group the two NameNodes under one logical cluster name, mycluster -->
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://mycluster</value>
	</property>
	<!-- Directory for files Hadoop generates at runtime -->
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/opt/module/HA/hadoop-2.7.2/data/tmp</value>
	</property>
	<!-- Storage directory for the JournalNode daemons -->
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>/opt/module/HA/hadoop-2.7.2/data/tmp/jn</value>
	</property>
</configuration>
3. Configure hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<!-- Number of HDFS replicas -->
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>
	<!-- Logical name of the fully distributed cluster -->
	<property>
		<name>dfs.nameservices</name>
		<value>mycluster</value>
	</property>
	<!-- The NameNodes that belong to the cluster -->
	<property>
		<name>dfs.ha.namenodes.mycluster</name>
		<value>nn1,nn2</value>
	</property>
	<!-- RPC address of nn1 -->
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn1</name>
		<value>hadoop20:8020</value>
	</property>
	<!-- RPC address of nn2 -->
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn2</name>
		<value>hadoop21:8020</value>
	</property>
	<!-- HTTP address of nn1 -->
	<property>
		<name>dfs.namenode.http-address.mycluster.nn1</name>
		<value>hadoop20:50070</value>
	</property>
	<!-- HTTP address of nn2 -->
	<property>
		<name>dfs.namenode.http-address.mycluster.nn2</name>
		<value>hadoop21:50070</value>
	</property>
	<!-- Location of the shared NameNode edits on the JournalNodes -->
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://hadoop20:8485;hadoop21:8485;hadoop22:8485/mycluster</value>
	</property>
	<!-- Disable permission checking (note: the property is dfs.permissions.enabled) -->
	<property>
		<name>dfs.permissions.enabled</name>
		<value>false</value>
	</property>
	<!-- Failover proxy: how clients locate the Active NameNode of mycluster -->
	<property>
		<name>dfs.client.failover.proxy.provider.mycluster</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Fencing: ensure only one NameNode responds to clients at a time -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
	<!-- sshfence requires passwordless SSH -->
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/home/kgf/.ssh/id_rsa</value>
	</property>
</configuration>
4. Distribute the configured Hadoop directory to the other nodes; the xsync script used earlier works fine.
5. On the copied clusters, delete the /opt/module/HA/hadoop-2.7.2/data and /opt/module/HA/hadoop-2.7.2/logs directories.
V. Starting the HDFS-HA Cluster
1. On each JournalNode host, start the journalnode service:
[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop20.out
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
2. On [nn1], format the NameNode and start it:
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs namenode -format
[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop20.out
3. On [nn2], synchronize nn1's metadata:
[kgf@hadoop21 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop21 hadoop-2.7.2]$ bin/hdfs namenode -bootstrapStandby
4. Start [nn2]:
[kgf@hadoop21 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop21.out
[kgf@hadoop21 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
5. Check the web UI.
6. On [nn1], start all DataNodes:
[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemons.sh start datanode
hadoop20: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop20.out
hadoop21: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop21.out
hadoop22: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop22.out
[kgf@hadoop20 hadoop-2.7.2]$ jps
2458 Jps
2203 NameNode
1820 QuorumPeerMain
2076 JournalNode
2381 DataNode
7. Transition [nn1] to Active:
[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive nn1
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
8. Now test manual failover. First, kill nn1's NameNode process:
[kgf@hadoop20 hadoop-2.7.2]$ jps
2549 Jps
2203 NameNode
1820 QuorumPeerMain
2076 JournalNode
2381 DataNode
[kgf@hadoop20 hadoop-2.7.2]$ kill -9 2203
nn1's web UI is now unreachable. Next, try transitioning nn2 to Active.
The command fails: with nn1 down, nn2's transition cannot be completed safely. So first restart nn1's NameNode:
[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop20.out
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
nn1 is back up and is now in Standby. With both NameNodes alive, transition nn2 to Active:
[kgf@hadoop21 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive nn2
So manual HA failover has a real limitation: a transition only succeeds while both NameNodes are running, yet sometimes the failed NameNode simply cannot be brought back up.
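The refusal we just saw can be illustrated with a small simulation (a hypothetical sketch of the safety check, not Hadoop's actual code): before one node is granted Active, the other node must be provably in Standby; if that node is unreachable, split-brain cannot be ruled out and the transition is refused.

```python
# Hypothetical sketch of the manual-failover safety check (not Hadoop source).
# Granting Active requires confirming the peer NameNode is in Standby; an
# unreachable peer might still be Active, so the transition must be refused.

class NameNode:
    def __init__(self, name, state="standby", alive=True):
        self.name, self.state, self.alive = name, state, alive

    def report_state(self):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.state

def transition_to_active(target, peer):
    """Grant Active to `target` only if `peer` is provably Standby."""
    try:
        if peer.report_state() == "active":
            return False  # refuse: peer is already active
    except ConnectionError:
        return False      # refuse: peer state unknown -> possible split-brain
    target.state = "active"
    return True

nn1 = NameNode("nn1", state="active")
nn2 = NameNode("nn2")
nn1.alive = False                      # simulate kill -9 on nn1
print(transition_to_active(nn2, nn1))  # False: cannot confirm nn1's state
nn1.alive, nn1.state = True, "standby" # restart nn1; it comes back as Standby
print(transition_to_active(nn2, nn1))  # True: transition now succeeds
```

This is why the tutorial had to restart nn1 before nn2 could become Active.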
VI. Automatic Failover Configuration (recommended)
1. Overview
We saw above how to fail over manually with hdfs haadmin. In that mode, even after the active NameNode has failed, the system does not automatically transfer service from the active to the standby NameNode; an administrator must intervene. This section covers configuring and deploying automatic failover. Automatic failover adds two new components to an HDFS deployment: ZooKeeper and the ZKFailoverController (ZKFC) process, as shown in the figure below. ZooKeeper is a highly available service that maintains small amounts of coordination data, notifies clients when that data changes, and monitors clients for failure. Automatic HA failover relies on the following ZooKeeper features:
- Failure detection: each NameNode in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the session expires, and ZooKeeper notifies the other NameNode that a failover should be triggered.
- Active NameNode election: ZooKeeper provides a simple mechanism for electing exactly one node as active. If the current active NameNode crashes, another node can acquire a special exclusive lock in ZooKeeper indicating that it should become the next active NameNode.
2. The ZKFC is the other new component in automatic failover. It is a ZooKeeper client that also monitors and manages the state of a NameNode. Every host that runs a NameNode also runs a ZKFC process, which is responsible for:
- Health monitoring: the ZKFC periodically pings its co-located NameNode with a health-check command. As long as the NameNode replies promptly with a healthy status, the ZKFC considers it healthy. If the node crashes, freezes, or otherwise enters an unhealthy state, the health monitor marks it unhealthy.
- ZooKeeper session management: while the local NameNode is healthy, the ZKFC keeps a session open in ZooKeeper. If the local NameNode is Active, the ZKFC also holds a special lock znode. The lock uses ZooKeeper's ephemeral nodes, so if the session expires, the lock node is deleted automatically.
- ZooKeeper-based election: if the local NameNode is healthy and the ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has won the election and runs a failover to make its local NameNode Active. That failover resembles the manual failover described above: the previous active NameNode is fenced if necessary, and then the local NameNode transitions to Active.
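The lock-based election described above can be sketched with a tiny in-memory simulation (illustrative only; real ZKFC uses ZooKeeper ephemeral znodes, and `MiniZK` here is a made-up stand-in): each healthy controller tries to create the same ephemeral lock node, the one that succeeds becomes Active, and when its session dies the lock vanishes so the other controller wins the next election.

```python
# Hypothetical in-memory sketch of ZKFC-style leader election.

class MiniZK:
    """Toy coordinator: a single ephemeral lock znode keyed by session."""
    def __init__(self):
        self.lock_owner = None  # session currently holding the lock

    def try_acquire(self, session):
        if self.lock_owner is None:
            self.lock_owner = session
        return self.lock_owner == session

    def expire_session(self, session):
        # Session death deletes the ephemeral lock node automatically.
        if self.lock_owner == session:
            self.lock_owner = None

class ZKFC:
    def __init__(self, name, zk):
        self.name, self.zk, self.state = name, zk, "standby"

    def elect(self):
        # A healthy controller competes for the lock; winner goes Active.
        self.state = "active" if self.zk.try_acquire(self.name) else "standby"
        return self.state

zk = MiniZK()
zkfc1, zkfc2 = ZKFC("nn1", zk), ZKFC("nn2", zk)
print(zkfc1.elect(), zkfc2.elect())  # active standby
zk.expire_session("nn1")             # nn1 crashes; its session expires
print(zkfc2.elect())                 # active
```

The ephemeral lock is what makes failover automatic: no administrator needs to notice the crash, because the crash itself releases the lock.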
3. Building automatic failover on top of the manual setup
3.1 Add to hdfs-site.xml:
<!-- Enable automatic failover -->
<property>
	<name>dfs.ha.automatic-failover.enabled</name>
	<value>true</value>
</property>
3.2 Add to core-site.xml:
<!-- ZooKeeper quorum used for HA coordination -->
<property>
	<name>ha.zookeeper.quorum</name>
	<value>hadoop20:2181,hadoop21:2181,hadoop22:2181</value>
</property>
3.3 Distribute the modified configuration to the other two machines.
4. Start Hadoop
4.1 Stop all previously started HDFS services:
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$ sbin/stop-dfs.sh
Stopping namenodes on [hadoop20 hadoop21]
hadoop20: stopping namenode
hadoop21: stopping namenode
hadoop20: stopping datanode
hadoop22: stopping datanode
hadoop21: stopping datanode
Stopping journal nodes [hadoop20 hadoop21 hadoop22]
hadoop20: stopping journalnode
hadoop22: stopping journalnode
hadoop21: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [hadoop20 hadoop21]
hadoop20: no zkfc to stop
hadoop21: no zkfc to stop
4.2 Start the ZooKeeper cluster configured earlier (status check below):
[kgf@hadoop20 hadoop-2.7.2]$ /opt/module/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
4.3 Initialize the HA state in ZooKeeper (run once, on hadoop20):
bin/hdfs zkfc -formatZK
Open a ZooKeeper shell to check the result:
[kgf@hadoop20 zookeeper-3.4.10]$ bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[mycluster]
4.4 Start the HDFS services:
[kgf@hadoop20 hadoop-2.7.2]$ sbin/start-dfs.sh
Starting namenodes on [hadoop20 hadoop21]
hadoop21: starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop21.out
hadoop20: starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop20.out
hadoop20: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop20.out
hadoop21: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop21.out
hadoop22: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop22.out
Starting journal nodes [hadoop20 hadoop21 hadoop22]
hadoop20: starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop20.out
hadoop22: starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop22.out
hadoop21: starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop21.out
Starting ZK Failover Controllers on NN hosts [hadoop20 hadoop21]
hadoop20: starting zkfc, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-zkfc-hadoop20.out
hadoop21: starting zkfc, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-zkfc-hadoop21.out
[kgf@hadoop20 hadoop-2.7.2]$ jps
4320 NameNode
4432 DataNode
4825 DFSZKFailoverController
1820 QuorumPeerMain
4636 JournalNode
4877 Jps
4.5 Check both NameNodes.
Now kill the NameNode process on hadoop21 and see whether automatic failover happens:
[kgf@hadoop21 hadoop-2.7.2]$ jps
4914 Jps
1739 QuorumPeerMain
[kgf@hadoop21 hadoop-2.7.2]$ jps
5289 DFSZKFailoverController
1739 QuorumPeerMain
4988 NameNode
5069 DataNode
5358 Jps
5167 JournalNode
[kgf@hadoop21 hadoop-2.7.2]$ kill -9 4988
[kgf@hadoop21 hadoop-2.7.2]$ jps
5398 Jps
5289 DFSZKFailoverController
1739 QuorumPeerMain
5069 DataNode
5167 JournalNode
A problem appears:
After the active NameNode is killed, the standby NameNode fails to win the election and stays stuck in Standby.
Fix: install the psmisc package. The sshfence method runs the fuser command, which psmisc provides; without it, fencing fails, so the standby is never promoted.
# yum -y install psmisc
Note that psmisc must be installed on every NameNode host.
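Why psmisc matters can be sketched as follows (a hypothetical simulation of the fencing step, not Hadoop's implementation): sshfence logs into the old active host and uses fuser to kill whatever still listens on the NameNode port; if fuser is absent, the fence command fails, and the ZKFC will not promote the standby without a successful fence.

```python
# Hypothetical sketch of the sshfence dependency on fuser (from psmisc).
# Conceptually, sshfence runs something like: ssh <old-active> "fuser ... <nn-port>/tcp"

def sshfence(installed_commands, nn_port=8020):
    """Return True only if the old active host could be fenced."""
    if "fuser" not in installed_commands:
        return False  # fence command not found on the remote host -> fencing fails
    # (here it would kill any process still bound to nn_port)
    return True

def promote_standby(fence_ok):
    # The controller only promotes the standby after a successful fence.
    return "active" if fence_ok else "standby"

print(promote_standby(sshfence({"ssh", "kill"})))           # standby: no fuser
print(promote_standby(sshfence({"ssh", "kill", "fuser"})))  # active
```

This is exactly the symptom above: fencing silently fails on a host without psmisc, and the standby stays Standby.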
With psmisc installed, the automatic failover finally succeeds.
VII. Configuring a YARN-HA Cluster
1. Cluster planning
2. Building on the setup above, configure yarn-site.xml:
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
	<!-- How reducers fetch data -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<!-- Enable ResourceManager HA -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<!-- Declare the two ResourceManagers -->
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>cluster-yarn1</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>hadoop20</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>hadoop21</value>
	</property>
	<!-- ZooKeeper cluster address -->
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>hadoop20:2181,hadoop21:2181,hadoop22:2181</value>
	</property>
	<!-- Enable automatic recovery -->
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>
	<!-- Store ResourceManager state in the ZooKeeper cluster -->
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
</configuration>
3. Sync yarn-site.xml to the other nodes:
[kgf@hadoop20 logs]$ cd /opt/module/HA/hadoop-2.7.2/
[kgf@hadoop20 hadoop-2.7.2]$ xsync etc/hadoop/yarn-site.xml
fname=yarn-site.xml
pdir=/opt/module/HA/hadoop-2.7.2/etc/hadoop
----------hadoop21--------
sending incremental file list
yarn-site.xml
sent 1430 bytes  received 43 bytes  2946.00 bytes/sec
total size is 2047  speedup is 1.39
------hadoop22--------
sending incremental file list
yarn-site.xml
sent 1430 bytes  received 43 bytes  2946.00 bytes/sec
total size is 2047  speedup is 1.39
4. Start YARN
4.1 On hadoop20, run:
[kgf@hadoop20 hadoop-2.7.2]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-resourcemanager-hadoop20.out
hadoop22: starting nodemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-nodemanager-hadoop22.out
hadoop21: starting nodemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-nodemanager-hadoop21.out
hadoop20: starting nodemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-nodemanager-hadoop20.out
4.2 On hadoop21, run:
[kgf@hadoop21 hadoop-2.7.2]$ sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-resourcemanager-hadoop21.out
[kgf@hadoop21 hadoop-2.7.2]$ jps
8482 ResourceManager
8342 NodeManager
7864 NameNode
8536 Jps
5289 DFSZKFailoverController
1739 QuorumPeerMain
5069 DataNode
5167 JournalNode
4.3 Check the service states.
hadoop20's YARN ResourceManager is currently active and hadoop21's is standby (you can confirm with bin/yarn rmadmin -getServiceState rm1). Now kill the ResourceManager on hadoop20:
[kgf@hadoop20 hadoop-2.7.2]$ jps
4320 NameNode
4432 DataNode
6210 NodeManager
6100 ResourceManager
4825 DFSZKFailoverController
6523 Jps
1820 QuorumPeerMain
4636 JournalNode
[kgf@hadoop20 hadoop-2.7.2]$ kill -9 6100
hadoop21's web UI now responds: its ResourceManager has taken over and become active.