HDFS HA: Manual Failover Configuration, Automatic Failover Configuration, and YARN-HA Cluster Setup

I. HA Overview

  • HA (High Availability) means service that stays up 24/7.
  • The key strategy for achieving high availability is eliminating single points of failure. Strictly speaking, HA breaks down into per-component mechanisms: HDFS HA and YARN HA.

  • Before Hadoop 2.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster.

  • The NameNode affects HDFS cluster availability in two main ways:

    • If the NameNode machine fails unexpectedly (e.g. crashes), the cluster is unusable until an administrator restarts it.

    • If the NameNode machine needs a software or hardware upgrade, the cluster is likewise unusable for the duration.

  • HDFS HA addresses both problems by running two NameNodes in an Active/Standby configuration, giving the cluster a hot standby for the NameNode. On a failure such as a machine crash, or during planned maintenance, the NameNode role can be switched to the other machine quickly.

II. HDFS-HA Working Mechanism

Eliminate the single point of failure by running two NameNodes.

1. Key points of HDFS-HA

1.1. Metadata management must change

  • Each NameNode keeps its own copy of the metadata in memory.
  • Only the Active NameNode may write to the edits log.
  • Both NameNodes can read the edits log.
  • The shared edits log lives in shared storage (qjournal and NFS are the two mainstream implementations).

1.2. A state-management module is needed

        A ZKFailoverController (ZKFC) daemon runs on every NameNode host. Each ZKFC monitors its local NameNode, records its state in ZooKeeper, and performs the switch when a state transition is required, taking care to prevent split-brain.

1.3. Passwordless SSH must work between the two NameNodes
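
A minimal sketch of setting that up, assuming the kgf user from this walkthrough (run on hadoop20, then repeat from hadoop21 so each NameNode host can reach the other):

[kgf@hadoop20 ~]$ ssh-keygen -t rsa          # accept the defaults; creates ~/.ssh/id_rsa
[kgf@hadoop20 ~]$ ssh-copy-id kgf@hadoop20   # the local host itself needs the key too
[kgf@hadoop20 ~]$ ssh-copy-id kgf@hadoop21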

1.4. Fencing: only one NameNode may serve clients at any given moment

III. Cluster Plan

hadoop20        hadoop21        hadoop22
NameNode        NameNode
JournalNode     JournalNode     JournalNode
DataNode        DataNode        DataNode
ZK              ZK              ZK
ZKFC            ZKFC

IV. HDFS HA Cluster Configuration

1. Copy hadoop-2.7.2 from /opt/module/ to the /opt/module/HA/ directory

[kgf@hadoop20 module]$ mkdir HA
[kgf@hadoop20 module]$
[kgf@hadoop20 module]$ ll
total 8
drwxrwxr-x.  2 kgf kgf    6 Jun 12 08:49 HA
drwxr-xr-x. 16 kgf kgf 4096 Jun  3 11:38 hadoop-2.7.2
drwxr-xr-x.  7 kgf kgf  245 Oct  6  2018 jdk1.8.0_191
drwxr-xr-x. 11 kgf kgf 4096 Jun 12 07:15 zookeeper-3.4.10
[kgf@hadoop20 module]$ cp -r hadoop-2.7.2 HA/

2. Configure /opt/module/HA/hadoop-2.7.2/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<!-- Combine the addresses of the two NameNodes into the nameservice mycluster -->
	<property>
	  <name>fs.defaultFS</name>
	  <value>hdfs://mycluster</value>
	</property>

	<!-- Directory for files Hadoop generates at runtime -->
	<property>
	   <name>hadoop.tmp.dir</name>
	   <value>/opt/module/HA/hadoop-2.7.2/data/tmp</value>
	</property>
       
	<!-- Directory where the JournalNodes store their data -->
	<property>
	  <name>dfs.journalnode.edits.dir</name>
	  <value>/opt/module/HA/hadoop-2.7.2/data/tmp/jn</value>
	</property>   
</configuration>
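
With fs.defaultFS set to the logical nameservice (together with the failover proxy provider configured in hdfs-site.xml below), clients address the cluster by name instead of by a specific NameNode host; a quick illustration:

[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs dfs -ls hdfs://mycluster/
[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs dfs -ls /     # equivalent, since mycluster is the default FS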

3. Configure hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<!-- Number of HDFS replicas -->
	<property>
	   <name>dfs.replication</name>
	   <value>1</value>
	</property>
    
	<!-- Logical name of the fully distributed cluster (the nameservice) -->
	<property>
		<name>dfs.nameservices</name>
		<value>mycluster</value>
	</property>
	<!-- The NameNodes in the nameservice -->
	<property>
		<name>dfs.ha.namenodes.mycluster</name>
		<value>nn1,nn2</value>
	</property>
	
	<!-- RPC address of nn1 -->
	<property>
	  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
	  <value>hadoop20:8020</value>
	</property>
	<!-- RPC address of nn2 -->
	<property>
	  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
	  <value>hadoop21:8020</value>
	</property>
	<!-- HTTP address of nn1 -->
	<property>
	  <name>dfs.namenode.http-address.mycluster.nn1</name>
	  <value>hadoop20:50070</value>
	</property>
	<!-- HTTP address of nn2 -->
	<property>
	  <name>dfs.namenode.http-address.mycluster.nn2</name>
	  <value>hadoop21:50070</value>
	</property>
	<!-- Location of the NameNodes' shared edits on the JournalNodes -->
	<property>
	  <name>dfs.namenode.shared.edits.dir</name>
	  <value>qjournal://hadoop20:8485;hadoop21:8485;hadoop22:8485/mycluster</value>
	</property>

	<!-- Disable permission checking -->
	<property>
		<name>dfs.permissions.enabled</name>
		<value>false</value>
	</property>


	<!-- Failover proxy provider: the class HDFS clients use to locate the Active NameNode -->
	<property>
	  <name>dfs.client.failover.proxy.provider.mycluster</name>
	  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	
	<!-- Fencing method, so that only one NameNode serves clients at any given moment -->
	<property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>

	<!-- sshfence requires passwordless SSH -->
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/home/kgf/.ssh/id_rsa</value>
    </property>
</configuration>
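
One optional hardening step: dfs.ha.fencing.methods takes a newline-separated list of methods, and the Hadoop HA docs describe a shell method that can serve as a fallback. Listing shell(/bin/true) after sshfence lets fencing succeed even when the old Active host is completely down and SSH cannot connect (weigh this convenience against the split-brain risk):

	<property>
	  <name>dfs.ha.fencing.methods</name>
	  <value>sshfence
shell(/bin/true)</value>
	</property>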

4. Distribute the configured Hadoop directory to the other nodes; the xsync script from earlier works for this.

5. On every node that received the copy, delete the /opt/module/HA/hadoop-2.7.2/data and /opt/module/HA/hadoop-2.7.2/logs directories.
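
For example, run on each of the three nodes:

[kgf@hadoop20 hadoop-2.7.2]$ rm -rf /opt/module/HA/hadoop-2.7.2/data /opt/module/HA/hadoop-2.7.2/logs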

V. Starting the HDFS-HA Cluster

1. On each JournalNode node, run the following command to start the journalnode service:

[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop20.out
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$

2. On [nn1], format the NameNode and start it:

[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs namenode -format
[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop20.out
[kgf@hadoop20 hadoop-2.7.2]$

3. On [nn2], sync nn1's metadata:

[kgf@hadoop21 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop21 hadoop-2.7.2]$ bin/hdfs namenode -bootstrapStandby

4. Start [nn2]:

[kgf@hadoop21 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop21.out
[kgf@hadoop21 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop21 hadoop-2.7.2]$

5. Check the web UIs: hadoop20:50070 and hadoop21:50070 should both be up, and both NameNodes report standby at this point.

6. On [nn1], start all DataNodes:

[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemons.sh start datanode
hadoop20: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop20.out
hadoop21: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop21.out
hadoop22: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop22.out
[kgf@hadoop20 hadoop-2.7.2]$ jps
2458 Jps
2203 NameNode
1820 QuorumPeerMain
2076 JournalNode
2381 DataNode
[kgf@hadoop20 hadoop-2.7.2]$

7. Transition [nn1] to Active:

[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive nn1
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$
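
The resulting states can be confirmed from the command line as well (output illustrative):

[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn1
active
[kgf@hadoop20 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn2
standby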

8. Now test manual failover. First, kill nn1's NameNode process:

[kgf@hadoop20 hadoop-2.7.2]$ jps
2549 Jps
2203 NameNode
1820 QuorumPeerMain
2076 JournalNode
2381 DataNode
[kgf@hadoop20 hadoop-2.7.2]$ kill -9 2203
[kgf@hadoop20 hadoop-2.7.2]$

nn1's web UI is now unreachable. Trying to transition nn2 to Active at this point fails with an error: transitionToActive first verifies the state of the other NameNode, and nn1 cannot be contacted. So bring nn1's NameNode back up first:

[kgf@hadoop20 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop20.out
[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$

nn1 comes back up in standby state; now transition nn2 to Active:

[kgf@hadoop21 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive nn2
[kgf@hadoop21 hadoop-2.7.2]$

So manual HA failover has a fundamental limitation: the transition only succeeds while both NameNodes are up, yet in practice the failed NameNode is sometimes exactly the one that will not start.
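
For completeness: haadmin does accept a --forcemanual flag that skips the check on the unreachable NameNode, but it bypasses the safety guarantees and risks split-brain, so it is no substitute for automatic failover:

[kgf@hadoop21 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive --forcemanual nn2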

VI. Automatic Failover Configuration (recommended)

1. Overview

        With the hdfs haadmin commands used above, failover is strictly manual: even when the Active NameNode has already failed, the system will not move the Active role to the Standby on its own. This section covers configuring and deploying automatic failover. Automatic failover adds two new components to an HDFS deployment: a ZooKeeper ensemble and the ZKFailoverController (ZKFC) process. ZooKeeper is a highly available service for maintaining small amounts of coordination data, notifying clients when that data changes, and monitoring clients for failure. Automatic HDFS failover relies on ZooKeeper for the following:

  • Failure detection: each NameNode in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the session expires, and ZooKeeper notifies the other NameNode that a failover should be triggered.
  • Active NameNode election: ZooKeeper provides a simple mechanism for exclusively electing one node as Active. If the current Active NameNode crashes, another node can acquire a special exclusive lock in ZooKeeper indicating that it should become the Active NameNode.

2. ZKFC, the other new component in automatic failover, is a ZooKeeper client that also monitors and manages the NameNode's state. Every host that runs a NameNode also runs a ZKFC process, which is responsible for:

  • Health monitoring: the ZKFC periodically pings its local NameNode with a health-check command. As long as the NameNode responds promptly with a healthy status, the ZKFC considers the node healthy. If the node crashes, freezes, or otherwise becomes unhealthy, the health monitor marks it as such.
  • ZooKeeper session management: while the local NameNode is healthy, the ZKFC keeps a session open in ZooKeeper. If the local NameNode is Active, the ZKFC also holds a special lock znode, implemented as a ZooKeeper ephemeral node; if the session expires, the lock node is deleted automatically.
  • ZooKeeper-based election: if the local NameNode is healthy and the ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has won the election and is responsible for running a failover to make its local NameNode Active. The failover resembles the manual one described above: the previous Active NameNode is fenced if necessary, then the local NameNode transitions to Active.

3. On top of the manual setup above, configure automation

3.1. Add to hdfs-site.xml:

<!-- Enable automatic failover -->
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>

3.2. Add to core-site.xml:

<!-- ZooKeeper quorum for the failover controller -->
	<property>
		<name>ha.zookeeper.quorum</name>
		<value>hadoop20:2181,hadoop21:2181,hadoop22:2181</value>
	</property>

3.3. Distribute the modified configs to the other two machines.

4. Start Hadoop

4.1. Stop all previously started HDFS services:

[kgf@hadoop20 hadoop-2.7.2]$ pwd
/opt/module/HA/hadoop-2.7.2
[kgf@hadoop20 hadoop-2.7.2]$ sbin/stop-dfs.sh
Stopping namenodes on [hadoop20 hadoop21]
hadoop20: stopping namenode
hadoop21: stopping namenode
hadoop20: stopping datanode
hadoop22: stopping datanode
hadoop21: stopping datanode
Stopping journal nodes [hadoop20 hadoop21 hadoop22]
hadoop20: stopping journalnode
hadoop22: stopping journalnode
hadoop21: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [hadoop20 hadoop21]
hadoop20: no zkfc to stop
hadoop21: no zkfc to stop
[kgf@hadoop20 hadoop-2.7.2]$

4.2. Start the ZooKeeper cluster configured earlier (status check shown below):

[kgf@hadoop20 hadoop-2.7.2]$ /opt/module/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[kgf@hadoop20 hadoop-2.7.2]$

4.3. Initialize the HA state in ZooKeeper (run once, on hadoop20):

bin/hdfs zkfc -formatZK

Check the result from inside ZooKeeper:

[kgf@hadoop20 zookeeper-3.4.10]$ bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[mycluster]
[zk: localhost:2181(CONNECTED) 2]

4.4. Start HDFS:

[kgf@hadoop20 hadoop-2.7.2]$ sbin/start-dfs.sh
Starting namenodes on [hadoop20 hadoop21]
hadoop21: starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop21.out
hadoop20: starting namenode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-namenode-hadoop20.out
hadoop20: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop20.out
hadoop21: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop21.out
hadoop22: starting datanode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-datanode-hadoop22.out
Starting journal nodes [hadoop20 hadoop21 hadoop22]
hadoop20: starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop20.out
hadoop22: starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop22.out
hadoop21: starting journalnode, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-journalnode-hadoop21.out
Starting ZK Failover Controllers on NN hosts [hadoop20 hadoop21]
hadoop20: starting zkfc, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-zkfc-hadoop20.out
hadoop21: starting zkfc, logging to /opt/module/HA/hadoop-2.7.2/logs/hadoop-kgf-zkfc-hadoop21.out
[kgf@hadoop20 hadoop-2.7.2]$ jps
4320 NameNode
4432 DataNode
4825 DFSZKFailoverController
1820 QuorumPeerMain
4636 JournalNode
4877 Jps
[kgf@hadoop20 hadoop-2.7.2]$

4.5. Check the two NameNodes

Now kill the NameNode process on hadoop21 and see whether automatic failover completes:

[kgf@hadoop21 hadoop-2.7.2]$ jps
4914 Jps
1739 QuorumPeerMain
[kgf@hadoop21 hadoop-2.7.2]$ jps
5289 DFSZKFailoverController
1739 QuorumPeerMain
4988 NameNode
5069 DataNode
5358 Jps
5167 JournalNode
[kgf@hadoop21 hadoop-2.7.2]$ kill -9 4988
[kgf@hadoop21 hadoop-2.7.2]$ jps
5398 Jps
5289 DFSZKFailoverController
1739 QuorumPeerMain
5069 DataNode
5167 JournalNode
[kgf@hadoop21 hadoop-2.7.2]$

A problem appears: after the Active NameNode is killed, the Standby cannot win the election and stays stuck in Standby.

The cause is fencing: sshfence works by SSHing to the old Active host and killing the stale process with the fuser command, which comes from the psmisc package. Without fuser, fencing fails and the ZKFC refuses to promote the Standby.

Fix: install psmisc.

#yum -y install psmisc

Note that every NameNode host needs psmisc installed.

After installing psmisc, the failover succeeds.
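
The election can also be observed directly in ZooKeeper: the winning ZKFC holds ActiveStandbyElectorLock, an ephemeral znode that disappears when its session dies (znode names as created by the Hadoop 2.x ZKFC; output illustrative):

[kgf@hadoop20 zookeeper-3.4.10]$ bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha/mycluster
[ActiveBreadCrumb, ActiveStandbyElectorLock]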

VII. Configuring a YARN-HA Cluster

1. Cluster plan: ResourceManagers on hadoop20 and hadoop21, NodeManagers on all three nodes.

2. On top of the setup above, configure yarn-site.xml:

<?xml version="1.0"?>
<configuration>
   <!-- How Reducers fetch data -->
   <property>
	<name>yarn.nodemanager.aux-services</name>
	<value>mapreduce_shuffle</value>
   </property>

	<!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
 
    <!-- Declare the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>

    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop20</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop21</value>
    </property>
 
    <!-- ZooKeeper ensemble address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop20:2181,hadoop21:2181,hadoop22:2181</value>
    </property>

    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
 
    <!-- Store ResourceManager state in the ZooKeeper ensemble -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>

</configuration>

3. Sync yarn-site.xml to the other nodes:

[kgf@hadoop20 logs]$ cd /opt/module/HA/hadoop-2.7.2/
[kgf@hadoop20 hadoop-2.7.2]$ xsync etc/hadoop/yarn-site.xml
fname=yarn-site.xml
pdir=/opt/module/HA/hadoop-2.7.2/etc/hadoop
----------hadoop21--------
sending incremental file list
yarn-site.xml

sent 1430 bytes  received 43 bytes  2946.00 bytes/sec
total size is 2047  speedup is 1.39
------hadoop22--------
sending incremental file list
yarn-site.xml

sent 1430 bytes  received 43 bytes  2946.00 bytes/sec
total size is 2047  speedup is 1.39
[kgf@hadoop20 hadoop-2.7.2]$
[kgf@hadoop20 hadoop-2.7.2]$

4. Start YARN

4.1. On hadoop20:

[kgf@hadoop20 hadoop-2.7.2]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-resourcemanager-hadoop20.out
hadoop22: starting nodemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-nodemanager-hadoop22.out
hadoop21: starting nodemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-nodemanager-hadoop21.out
hadoop20: starting nodemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-nodemanager-hadoop20.out
[kgf@hadoop20 hadoop-2.7.2]$

4.2. On hadoop21 (start-yarn.sh starts a ResourceManager only on the node where it is run, so rm2 must be started by hand):

[kgf@hadoop21 hadoop-2.7.2]$ sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/module/HA/hadoop-2.7.2/logs/yarn-kgf-resourcemanager-hadoop21.out
[kgf@hadoop21 hadoop-2.7.2]$ jps
8482 ResourceManager
8342 NodeManager
7864 NameNode
8536 Jps
5289 DFSZKFailoverController
1739 QuorumPeerMain
5069 DataNode
5167 JournalNode
[kgf@hadoop21 hadoop-2.7.2]$

4.3. Check service state
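
Besides the web UI, each ResourceManager's state can be queried from the command line (output illustrative):

[kgf@hadoop20 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm1
active
[kgf@hadoop20 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm2
standby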

This shows hadoop20's ResourceManager is active and hadoop21's is standby. Now kill the ResourceManager on hadoop20 and see what happens:

[kgf@hadoop20 hadoop-2.7.2]$ jps
4320 NameNode
4432 DataNode
6210 NodeManager
6100 ResourceManager
4825 DFSZKFailoverController
6523 Jps
1820 QuorumPeerMain
4636 JournalNode
[kgf@hadoop20 hadoop-2.7.2]$ kill -9 6100
[kgf@hadoop20 hadoop-2.7.2]$

hadoop21's ResourceManager is now reachable and has become active: automatic YARN failover works.
