Hadoop High Availability (HA) Cluster Deployment

Posted by VV大数据





1
Pre-deployment Preparation

VMware

Three virtual machines running CentOS 6.5 x64 (install one yourself, then clone the other two)

Hadoop 2.7.3 64-bit installation package

JDK 1.8

Zookeeper 3.4.9 installation package



2
Cluster Plan

Host: louisvv01 (master), IP: 192.168.1.210
Installed software: JDK, Hadoop, Zookeeper
Running processes: NameNode (Active), ResourceManager (Active), QuorumPeerMain, DataNode, DFSZKFailoverController, NodeManager, JournalNode

Host: louisvv02, IP: 192.168.1.211
Installed software: JDK, Hadoop, Zookeeper
Running processes: NameNode (Standby), ResourceManager (Standby), QuorumPeerMain, DataNode, DFSZKFailoverController, NodeManager, JournalNode

Host: louisvv03, IP: 192.168.1.212
Installed software: JDK, Hadoop, Zookeeper
Running processes: QuorumPeerMain, DataNode, NodeManager, JournalNode


3
Base Environment

1. Configure a static IP address on each machine according to the cluster plan above (shown as a screenshot in the original post).

2. Change the hostname to match the cluster plan (do this on all three machines).

Change the hostname temporarily:

[root@louisvv ~]# hostname louisvv01

[root@louisvv ~]# hostname

louisvv01

Change the hostname permanently (takes effect from the next boot):

[root@louisvv ~]# vim /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=louisvv01

3. Map hostnames to IP addresses in /etc/hosts (do this on all three machines):

[root@louisvv ~]# vim /etc/hosts

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.1.210 louisvv01

192.168.1.211 louisvv02

192.168.1.212 louisvv03

4. Stop the firewall and disable it from starting at boot (do this on all three machines):

[root@louisvv ~]# service iptables stop

[root@louisvv ~]# chkconfig iptables off
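To confirm the change stuck, two quick checks with the stock CentOS 6 tools: service iptables status should report that the firewall is not running, and chkconfig --list iptables should show it off for every runlevel.

[root@louisvv ~]# service iptables status

[root@louisvv ~]# chkconfig --list iptables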

5. Set up passwordless SSH login

[root@louisvv01 ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

2e:75:72:ff:54:65:51:e7:e6:cb:42:e9:45:3e:ac:7f root@louisvv.com

The key's randomart image is:

+--[ RSA 2048]----+

| .+|

| .o|

| .=|

| =+.|

| S o o =o|

| o + .o +.o|

| . . .+.o |

| . oo E|

| ...|

+-----------------+

The command above generates two files, id_rsa (private key) and id_rsa.pub (public key). Copy the public key to the machines you want to reach without a password:

[root@louisvv ~]# ssh-copy-id louisvv02

[root@louisvv ~]# ssh-copy-id louisvv03

Test passwordless login:

[root@louisvv ~]# ssh louisvv02

Last login: Wed Dec 6 10:27:15 2017 from 192.168.1.74

[root@louisvv02 ~]#

[root@louisvv ~]# ssh louisvv03

Last login: Wed Dec 6 10:27:20 2017 from 192.168.1.74
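Both targets can also be verified in one loop; each hostname should print without any password prompt (a small sketch using the hosts configured above):

[root@louisvv01 ~]# for h in louisvv02 louisvv03; do ssh $h hostname; done

louisvv02

louisvv03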

6. Install the JDK

Use WinSCP to upload the JDK package to the /opt directory on louisvv01:

[root@louisvv ~]# cd /opt/

[root@louisvv opt]# ls

jdk-8u91-linux-x64.tar.gz

[root@louisvv opt]# tar -zxf jdk-8u91-linux-x64.tar.gz

[root@louisvv opt]# ls

jdk1.8.0_91 jdk-8u91-linux-x64.tar.gz

Configure the JDK environment variables:

[root@louisvv opt]# vim /etc/profile

#java env

export JAVA_HOME=/opt/jdk1.8.0_91

export JRE_HOME=/opt/jdk1.8.0_91/jre

export CLASSPATH=$JAVA_HOME/lib

export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

Reload the environment and verify that the Java variables are set correctly:

[root@louisvv opt]# source /etc/profile

[root@louisvv opt]# java -version

java version "1.8.0_91"

Java(TM) SE Runtime Environment (build 1.8.0_91-b14)

Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)

With the JDK configured, copy the JDK directory and /etc/profile to the other two machines:

[root@louisvv opt]# scp -r jdk1.8.0_91/ louisvv02:/opt/

[root@louisvv opt]# scp -r jdk1.8.0_91/ louisvv03:/opt/

[root@louisvv opt]# scp /etc/profile louisvv02:/etc/

[root@louisvv opt]# scp /etc/profile louisvv03:/etc/

On louisvv02 and louisvv03, reload the environment and verify the Java configuration:

[root@louisvv ~]# source /etc/profile

[root@louisvv ~]# java -version

java version "1.8.0_91"

Java(TM) SE Runtime Environment (build 1.8.0_91-b14)

Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
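The same check can be driven from louisvv01 over SSH instead of logging in to each node (a small sketch; java -version writes to stderr, hence the 2>&1):

[root@louisvv01 ~]# for h in louisvv02 louisvv03; do echo == $h ==; ssh $h 'source /etc/profile; java -version' 2>&1 | head -1; done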


4
Zookeeper Cluster Installation

With the base environment in place, install the Zookeeper cluster.

1. Use WinSCP to upload the zookeeper-3.4.9.tar.gz package to the /opt directory

2. Extract it:

[root@louisvv opt]# tar -zxf zookeeper-3.4.9.tar.gz

3. Configure Zookeeper

[root@louisvv opt]# cd zookeeper-3.4.9

[root@louisvv zookeeper-3.4.9]# cd conf/

[root@louisvv conf]# ls

configuration.xsl log4j.properties zoo_sample.cfg

[root@louisvv conf]# cp zoo_sample.cfg zoo.cfg

Edit the configuration file:

[root@louisvv conf]# vim zoo.cfg

dataDir=/opt/zookeeper-3.4.9/data/

and append the following at the end of the file:

server.1=louisvv01:2888:3888

server.2=louisvv02:2888:3888

server.3=louisvv03:2888:3888

Create the corresponding data directory and the myid file. The myid file is mandatory; without it Zookeeper fails to start. Each myid value must match the number after server. for that host in zoo.cfg (1, 2, 3).

[root@louisvv conf]# mkdir /opt/zookeeper-3.4.9/data

[root@louisvv data]# vim /opt/zookeeper-3.4.9/data/myid

1

Copy the configured Zookeeper to the other two machines with scp:

[root@louisvv opt]# scp -r /opt/zookeeper-3.4.9 louisvv02:/opt/

[root@louisvv opt]# scp -r /opt/zookeeper-3.4.9 louisvv03:/opt/

On louisvv02 and louisvv03, edit /opt/zookeeper-3.4.9/data/myid and change it to the matching id: 2 on louisvv02 and 3 on louisvv03.
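Because the copied files still contain myid=1, the fix can also be pushed from louisvv01 in one line (a minimal sketch; $i expands locally, so each host receives its own number):

[root@louisvv01 ~]# for i in 2 3; do ssh louisvv0$i "echo $i > /opt/zookeeper-3.4.9/data/myid"; done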

4. Add environment variables (configure these on louisvv02 and louisvv03 as well)

[root@louisvv zookeeper-3.4.9]# vim /etc/profile

#zookeeper env

export ZOOKEEPER_HOME=/opt/zookeeper-3.4.9

#hadoop env

export HADOOP_HOME=/opt/hadoop-2.7.3

export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$ZOOKEEPER_HOME/bin

Save and exit, then reload the environment. Typing zk and pressing Tab should now complete the Zookeeper scripts:

[root@louisvv zookeeper-3.4.9]# source /etc/profile

[root@louisvv zookeeper-3.4.9]# zk

zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh

5. Start Zookeeper (run on all three machines)

[root@louisvv zookeeper-3.4.9]# zkServer.sh start

ZooKeeper JMX enabled by default

Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg

Starting zookeeper ... STARTED

Use jps to check for Zookeeper's Java process; QuorumPeerMain running means Zookeeper started successfully:

[root@louisvv zookeeper-3.4.9]# jps

2713 Jps

2671 QuorumPeerMain

Check whether the Zookeeper cluster formed correctly: across the three machines, two should report follower and one leader, which means Zookeeper is installed and running:

[root@louisvv zookeeper-3.4.9]# zkServer.sh status

ZooKeeper JMX enabled by default

Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg

Mode: follower

[root@louisvv03 ~]# zkServer.sh status

ZooKeeper JMX enabled by default

Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg

Mode: leader

[root@louisvv02 ~]# zkServer.sh status

ZooKeeper JMX enabled by default

Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg

Mode: follower
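Zookeeper's built-in four-letter-word interface gives another quick health probe (assuming nc is installed); a healthy server answers ruok with imok:

[root@louisvv01 ~]# echo ruok | nc louisvv01 2181

imok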


5
Install Hadoop

Use WinSCP to upload the hadoop-2.7.3.tar.gz package to the /opt directory on louisvv01.

1. Extract it:

[root@louisvv opt]# tar -zxf hadoop-2.7.3.tar.gz

2. Enter the extracted directory and edit the configuration files

[root@louisvv opt]# cd /opt/hadoop-2.7.3/etc/hadoop/

First set JAVA_HOME in hadoop-env.sh:

[root@louisvv hadoop]# vim hadoop-env.sh

export JAVA_HOME=/opt/jdk1.8.0_91

Edit core-site.xml and add the following:

<configuration>

<!-- Set the HDFS nameservice to ns1 -->

<property>

<name>fs.defaultFS</name>

<value>hdfs://ns1</value>

</property>

<!-- Hadoop temporary directory -->

<property>

<name>hadoop.tmp.dir</name>

<value>/opt/hadoop-2.7.3/tmp</value>

</property>
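<!-- Zookeeper quorum used for automatic NameNode failover -->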

<property>

<name>ha.zookeeper.quorum</name>

<value>louisvv01:2181,louisvv02:2181,louisvv03:2181</value>

</property>

</configuration>

Edit hdfs-site.xml and add the following:

<configuration>

<!-- HDFS nameservice ns1; must match core-site.xml -->

<property>

<name>dfs.nameservices</name>

<value>ns1</value>

</property>

<!-- ns1 has two NameNodes: nn1 and nn2 -->

<property>

<name>dfs.ha.namenodes.ns1</name>

<value>nn1,nn2</value>

</property>

<property>

<name>dfs.namenode.rpc-address.ns1.nn1</name>

<value>louisvv01:9000</value>

</property>

<property>

<name>dfs.namenode.http-address.ns1.nn1</name>

<value>louisvv01:50070</value>

</property>

<property>

<name>dfs.namenode.rpc-address.ns1.nn2</name>

<value>louisvv02:9000</value>

</property>

<property>

<name>dfs.namenode.http-address.ns1.nn2</name>

<value>louisvv02:50070</value>

</property>

<!-- Where the NameNodes' shared edit log is stored on the JournalNodes -->

<property>

<name>dfs.namenode.shared.edits.dir</name>

<value>qjournal://louisvv01:8485;louisvv02:8485;louisvv03:8485/ns1</value>

</property>

<!-- Where each JournalNode keeps its data on local disk -->

<property>

<name>dfs.journalnode.edits.dir</name>

<value>/opt/hadoop-2.7.3/journal</value>

</property>

<!-- Enable automatic NameNode failover -->

<property>

<name>dfs.ha.automatic-failover.enabled</name>

<value>true</value>

</property>

<!-- Proxy provider clients use to find the active NameNode -->

<property>

<name>dfs.client.failover.proxy.provider.ns1</name>

<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

<!-- Fencing methods, one per line; shell(/bin/true) acts as a fallback so failover can still proceed if sshfence fails -->

<property>

<name>dfs.ha.fencing.methods</name>

<value>

sshfence

shell(/bin/true)

</value>

</property>

<!-- The sshfence method requires passwordless SSH -->

<property>

<name>dfs.ha.fencing.ssh.private-key-files</name>

<value>/root/.ssh/id_rsa</value>

</property>

<!-- Timeout (ms) for sshfence SSH connections -->

<property>

<name>dfs.ha.fencing.ssh.connect-timeout</name>

<value>30000</value>

</property>

</configuration>

Copy the mapred-site.xml template file:

[root@louisvv hadoop]# cp mapred-site.xml.template mapred-site.xml

Edit mapred-site.xml and add the following:

<configuration>

<!-- Run MapReduce on the YARN framework -->

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>

Edit yarn-site.xml and add the following:

<configuration>

<!-- Enable ResourceManager HA -->

<property>

<name>yarn.resourcemanager.ha.enabled</name>

<value>true</value>

</property>

<!-- Cluster id for the RM pair -->

<property>

<name>yarn.resourcemanager.cluster-id</name>

<value>yrc</value>

</property>

<!-- Logical ids of the two RMs -->

<property>

<name>yarn.resourcemanager.ha.rm-ids</name>

<value>rm1,rm2</value>

</property>

<property>

<name>yarn.resourcemanager.hostname.rm1</name>

<value>louisvv01</value>

</property>

<property>

<name>yarn.resourcemanager.hostname.rm2</name>

<value>louisvv02</value>

</property>
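<!-- Zookeeper quorum used for ResourceManager leader election -->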

<property>

<name>yarn.resourcemanager.zk-address</name>

<value>louisvv01:2181,louisvv02:2181,louisvv03:2181</value>

</property>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

</configuration>

Edit the slaves file and add the following:

[root@louisvv hadoop]# vim slaves

louisvv01

louisvv02

louisvv03

Copy the configured Hadoop to the other two nodes:

[root@louisvv opt]# scp -r /opt/hadoop-2.7.3 louisvv02:/opt/

[root@louisvv opt]# scp -r /opt/hadoop-2.7.3 louisvv03:/opt/

Add Hadoop to the environment variables (on all three machines):

export HADOOP_HOME=/opt/hadoop-2.7.3

export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin
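After a source /etc/profile, it is worth confirming that Hadoop sees the HA settings before starting anything (hdfs getconf reads the effective configuration):

[root@louisvv01 ~]# hdfs getconf -confKey dfs.nameservices

ns1

[root@louisvv01 ~]# hdfs getconf -confKey dfs.ha.namenodes.ns1

nn1,nn2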

6
Start the Hadoop Cluster

1. Start the JournalNodes first (run on the master node, louisvv01):

[root@louisvv opt]# hadoop-daemons.sh start journalnode

louisvv01: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-louisvv01.out

louisvv02: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-louisvv02.out

louisvv03: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-louisvv03.out

Use jps to check that the JournalNode process is up on all three machines:

[root@louisvv opt]# jps

3162 JournalNode

3211 Jps

2671 QuorumPeerMain

2. Format HDFS (run on the master node, louisvv01):

[root@louisvv opt]# hdfs namenode -format

17/12/06 14:18:12 INFO common.Storage: Storage directory /opt/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.

Formatting generates the NameNode metadata under the hadoop.tmp.dir configured in core-site.xml, here /opt/hadoop-2.7.3/tmp/. Copy that directory to /opt/hadoop-2.7.3/ on the standby NameNode, louisvv02:

[root@louisvv hadoop-2.7.3]# scp -r /opt/hadoop-2.7.3/tmp/ louisvv02:/opt/hadoop-2.7.3/
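Copying tmp/ by hand works; as an alternative, Hadoop has a built-in command for initializing the standby from the active NameNode's metadata (run on louisvv02 while the JournalNodes are up):

[root@louisvv02 ~]# hdfs namenode -bootstrapStandby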

3. Format the HA state in Zookeeper (run on louisvv01):

[root@louisvv hadoop-2.7.3]# hdfs zkfc -formatZK

17/12/06 15:05:32 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.

4. Start HDFS

[root@louisvv hadoop-2.7.3]# start-dfs.sh

Starting namenodes on [louisvv01 louisvv02]

louisvv02: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-louisvv02.out

louisvv01: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-louisvv01.out

louisvv03: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-louisvv03.out

louisvv01: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-louisvv01.out

louisvv02: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-louisvv02.out

Starting journal nodes [louisvv01 louisvv02 louisvv03]

Since we already started the JournalNodes, the script reports that they are running:

louisvv02: journalnode running as process 3259. Stop it first.

louisvv01: journalnode running as process 3162. Stop it first.

louisvv03: journalnode running as process 3273. Stop it first.

Starting ZK Failover Controllers on NN hosts [louisvv01 louisvv02]

louisvv01: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-louisvv01.out

louisvv02: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-louisvv02.out

Use jps to check that the processes started.

louisvv01:

[root@louisvv01 ~]# jps

3954 Jps

3527 DataNode

3162 JournalNode

3434 NameNode

3838 DFSZKFailoverController

2671 QuorumPeerMain

louisvv02:

[root@louisvv02 ~]# jps

3466 DataNode

2939 QuorumPeerMain

3259 JournalNode

3612 DFSZKFailoverController

3727 Jps

3407 NameNode

louisvv03:

[root@louisvv03 ~]# jps

3520 Jps

3381 DataNode

2937 QuorumPeerMain

3273 JournalNode

5. Start YARN

[root@louisvv01 ~]# start-yarn.sh

starting yarn daemons

starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-louisvv01.out

louisvv02: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-louisvv02.out

louisvv03: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-louisvv03.out

louisvv01: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-louisvv01.out

Start the standby ResourceManager. On louisvv02, run:

[root@louisvv02 ~]# yarn-daemon.sh start resourcemanager

starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-louisvv02.out

6. Check jps again

louisvv01:

[root@louisvv01 ~]# jps

4100 NodeManager

4005 ResourceManager

3527 DataNode

3162 JournalNode

3434 NameNode

4396 Jps

3838 DFSZKFailoverController

2671 QuorumPeerMain

louisvv02:

[root@louisvv02 ~]# jps

3892 ResourceManager

3770 NodeManager

3466 DataNode

2939 QuorumPeerMain

3259 JournalNode

3612 DFSZKFailoverController

3967 Jps

3407 NameNode

louisvv03:

[root@louisvv03 ~]# jps

3554 NodeManager

3381 DataNode

3654 Jps

2937 QuorumPeerMain

3273 JournalNode

7. Check HDFS in the web UI

Open http://192.168.1.210:50070 and http://192.168.1.211:50070: one NameNode shows Active and the other Standby, so HDFS HA works (screenshots in the original post).
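The same check works from the command line (nn1 and nn2 are the logical ids from hdfs-site.xml above):

[root@louisvv01 ~]# hdfs haadmin -getServiceState nn1

active

[root@louisvv01 ~]# hdfs haadmin -getServiceState nn2

standby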

8. Check YARN in the web UI

Open http://192.168.1.210:8088/cluster/cluster: the ResourceManager state is Active.

Open http://192.168.1.211:8088/cluster/cluster: the ResourceManager state is Standby.

ResourceManager HA is working!
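Again there is a command-line equivalent (rm1 and rm2 are the ids from yarn-site.xml):

[root@louisvv01 ~]# yarn rmadmin -getServiceState rm1

active

[root@louisvv01 ~]# yarn rmadmin -getServiceState rm2

standby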

9. Test HDFS HA failover

First upload a file to HDFS:

[root@louisvv01 ~]# cat test.txt

hello world

hello you

hello me

hello louisvv

[root@louisvv01 ~]# hadoop fs -put test.txt /

[root@louisvv01 ~]# hadoop fs -ls /

Found 1 items

-rw-r--r-- 3 root supergroup 45 2017-12-06 15:25 /test.txt

Then kill the active NameNode on louisvv01:

[root@louisvv01 ~]# jps

4100 NodeManager

4005 ResourceManager

3527 DataNode

3162 JournalNode

3434 NameNode

4539 Jps

3838 DFSZKFailoverController

2671 QuorumPeerMain

[root@louisvv01 ~]# kill -9 3434

[root@louisvv01 ~]# jps

4100 NodeManager

4005 ResourceManager

4549 Jps

3527 DataNode

3162 JournalNode

3838 DFSZKFailoverController

2671 QuorumPeerMain

The NameNode process has been killed.

Check the HDFS web UI on louisvv02 to see whether its state switched: it now shows Active, so automatic failover worked.

The file uploaded earlier is still there:

[root@louisvv01 ~]# hadoop fs -ls /

Found 1 items

-rw-r--r-- 3 root supergroup 45 2017-12-06 15:25 /test.txt

Restart the NameNode we just killed:

[root@louisvv01 ~]# hadoop-daemon.sh start namenode

starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-louisvv01.out

Checking the web UI again, the restarted NameNode comes back as Standby.

10. The classic WordCount

[root@louisvv01 ~]# hadoop jar \
/opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar \
wordcount /test.txt /out1

17/12/06 15:38:06 INFO mapreduce.Job: map 0% reduce 0%

17/12/06 15:38:24 INFO mapreduce.Job: map 100% reduce 0%

17/12/06 15:38:56 INFO mapreduce.Job: map 100% reduce 100%

[root@louisvv01 ~]# hadoop fs -ls /out1/

Found 2 items

-rw-r--r-- 3 root supergroup 0 2017-12-06 15:38 /out1/_SUCCESS

-rw-r--r-- 3 root supergroup 37 2017-12-06 15:38 /out1/part-r-00000

[root@louisvv01 ~]# hadoop fs -cat /out1/part-r-00000

hello 4

louisvv 1

me 1

world 1

you 1

That wraps up this Hadoop HA cluster deployment walkthrough.

If it helped, remember to like and share!

