Hadoop High Availability (HA) Cluster Deployment

Posted by VV大数据
- VMware
- Three virtual machines running CentOS 6.5 x64 (install one, then clone the other two)
- Hadoop 2.7.3 64-bit package
- JDK 1.8
- ZooKeeper 3.4.9 package
VV has put together the software packages used in this article.
| Hostname | IP | Installed software | Running processes |
| --- | --- | --- | --- |
| louisvv01 (primary) | 192.168.1.210 | JDK, Hadoop | NameNode, ResourceManager, QuorumPeerMain, DataNode, DFSZKFailoverController, NodeManager, JournalNode |
| louisvv02 | 192.168.1.211 | JDK, Hadoop | NameNode (standby), ResourceManager (standby), QuorumPeerMain, DataNode, DFSZKFailoverController, NodeManager, JournalNode |
| louisvv03 | 192.168.1.212 | JDK, Hadoop | QuorumPeerMain, DataNode, NodeManager, JournalNode |
2. Change each hostname according to the cluster plan (do this on all three machines).

Temporary change:

```
[root@louisvv ~]# hostname louisvv01
```
Permanent change:

```
[root@louisvv ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=louisvv01
```
3. Map the hostnames to IPs in /etc/hosts (on all three machines):

```
[root@louisvv ~]# vim /etc/hosts
127.0.0.1     localhost localhost.localdomain localhost4 localhost4.localdomain4
::1           localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.210 louisvv01
192.168.1.211 louisvv02
192.168.1.212 louisvv03
```
4. Stop the firewall and disable it at boot (on all three machines):

```
[root@louisvv ~]# service iptables stop
[root@louisvv ~]# chkconfig iptables off
```
5. Set up passwordless SSH login.

```
[root@louisvv01 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
2e:75:72:ff:54:65:51:e7:e6:cb:42:e9:45:3e:ac:7f root@louisvv.com
The key's randomart image is:
+--[ RSA 2048]----+
|              .+ |
|              .o |
|              .= |
|             =+. |
|        S o o =o |
|       o + .o +.o|
|      . . .+.o   |
|         . oo   E|
|             ... |
+-----------------+
```
This generates two files: id_rsa (private key) and id_rsa.pub (public key). Copy the public key to the machines you want to log in to without a password:

```
[root@louisvv ~]# ssh-copy-id louisvv02
[root@louisvv ~]# ssh-copy-id louisvv03
```
Test the passwordless login:

```
[root@louisvv ~]# ssh louisvv02
Last login: Wed Dec  6 10:27:15 2017 from 192.168.1.74
[root@louisvv02 ~]#
[root@louisvv ~]# ssh louisvv03
Last login: Wed Dec  6 10:27:20 2017 from 192.168.1.74
```
6. Install the JDK.

Use WinSCP to upload the JDK package to /opt on louisvv01:

```
[root@louisvv ~]# cd /opt/
[root@louisvv opt]# ls
jdk-8u91-linux-x64.tar.gz
[root@louisvv opt]# tar -zxf jdk-8u91-linux-x64.tar.gz
[root@louisvv opt]# ls
jdk1.8.0_91  jdk-8u91-linux-x64.tar.gz
```
Configure the JDK environment variables:

```
[root@louisvv opt]# vim /etc/profile
#java env
export JAVA_HOME=/opt/jdk1.8.0_91
export JRE_HOME=/opt/jdk1.8.0_91/jre
export CLASSPATH=$JAVA_HOME/lib
export PATH=:$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:
```
Reload the environment and verify that Java is configured correctly:

```
[root@louisvv opt]# source /etc/profile
[root@louisvv opt]# java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
```
With the JDK configured, copy both the JDK directory and /etc/profile to the other two machines:

```
[root@louisvv opt]# scp -r jdk1.8.0_91/ louisvv02:/opt/
[root@louisvv opt]# scp -r jdk1.8.0_91/ louisvv03:/opt/
[root@louisvv opt]# scp /etc/profile louisvv02:/etc/
[root@louisvv opt]# scp /etc/profile louisvv03:/etc/
```
On louisvv02 and louisvv03, reload the environment and verify Java there as well:

```
[root@louisvv ~]# source /etc/profile
[root@louisvv ~]# java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
```
With the base environment ready, install the ZooKeeper cluster.

1. Use WinSCP to upload zookeeper-3.4.9.tar.gz to /opt.

2. Extract it:

```
[root@louisvv opt]# tar -zxf zookeeper-3.4.9.tar.gz
```
3. Configure ZooKeeper:

```
[root@louisvv opt]# cd zookeeper-3.4.9
[root@louisvv zookeeper-3.4.9]# cd conf/
[root@louisvv conf]# ls
configuration.xsl  log4j.properties  zoo_sample.cfg
[root@louisvv conf]# cp zoo_sample.cfg zoo.cfg
```
Edit the configuration file:

```
[root@louisvv conf]# vim zoo.cfg
dataDir=/opt/zookeeper-3.4.9/data/
```

Then append at the end of the file:

```
server.1=louisvv01:2888:3888
server.2=louisvv02:2888:3888
server.3=louisvv03:2888:3888
```
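Those server.N lines can also be generated rather than typed by hand, which avoids numbering mistakes when the host list changes. A minimal sketch (the function name is mine; hostnames and ports are the ones from this cluster):

```shell
#!/bin/sh
# Emit the server.N lines for zoo.cfg, numbering the hosts in the
# order they are passed in. Append the output to zoo.cfg.
gen_zk_servers() {
  i=1
  for h in "$@"; do
    echo "server.$i=$h:2888:3888"
    i=$((i+1))
  done
}

gen_zk_servers louisvv01 louisvv02 louisvv03
```

For example, `gen_zk_servers louisvv01 louisvv02 louisvv03 >> zoo.cfg` appends exactly the three lines shown above.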
Create the matching dataDir directory and the myid file. The myid file is mandatory; without it ZooKeeper fails to start. Its value must match the number after `server.` for this host in zoo.cfg (1, 2, 3):

```
[root@louisvv conf]# mkdir /opt/zookeeper-3.4.9/data
[root@louisvv data]# vim /opt/zookeeper-3.4.9/data/myid
1
```
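Since the hostnames here end in the same digits as the server.N entries, the myid value can be derived from the hostname instead of edited by hand on each box. A sketch under that assumption (the function is mine, not part of ZooKeeper):

```shell
#!/bin/sh
# Derive the myid value from a hostname of the form <name><NN>,
# e.g. louisvv02 -> 2. Assumes the numeric suffix matches the
# server.N entry for that host in zoo.cfg.
myid_for() {
  # strip everything up to the last non-digit, then leading zeros
  echo "$1" | sed 's/^.*[^0-9]//; s/^0*//'
}

myid_for louisvv02    # prints 2
```

On each node something like `myid_for "$(hostname)" > /opt/zookeeper-3.4.9/data/myid` would then write the right id.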
Copy the configured ZooKeeper to the other two machines with scp:

```
[root@louisvv opt]# scp -r /opt/zookeeper-3.4.9 louisvv02:/opt/
[root@louisvv opt]# scp -r /opt/zookeeper-3.4.9 louisvv03:/opt/
```
On louisvv02 and louisvv03, edit /opt/zookeeper-3.4.9/data/myid and set the matching id: 2 on louisvv02 and 3 on louisvv03.
4. Add the environment variables (also on louisvv02 and louisvv03):

```
[root@louisvv zookeeper-3.4.9]# vim /etc/profile
#zookeeper env
export ZOOKEEPER_HOME=/opt/zookeeper-3.4.9
#hadoop env
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=:$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$ZOOKEEPER_HOME/bin:
```

Save and exit, then reload the environment. Tab completion now finds the ZooKeeper scripts:

```
[root@louisvv zookeeper-3.4.9]# source /etc/profile
[root@louisvv zookeeper-3.4.9]# zk
zkCleanup.sh  zkCli.cmd  zkCli.sh  zkEnv.cmd  zkEnv.sh  zkServer.cmd  zkServer.sh
```
5. Start ZooKeeper (on all three machines):

```
[root@louisvv zookeeper-3.4.9]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
```
Use jps to check for ZooKeeper's Java process; QuorumPeerMain running means the start succeeded:

```
[root@louisvv zookeeper-3.4.9]# jps
2713 Jps
2671 QuorumPeerMain
```
Check the cluster state on each node: two machines report follower and one reports leader, so the ZooKeeper cluster is installed and running correctly:

```
[root@louisvv zookeeper-3.4.9]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
[root@louisvv03 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
[root@louisvv02 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
```
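Eyeballing three status transcripts gets tedious; the Mode line is easy to extract so a script can tally leaders and followers. A small sketch (the helper function is mine; the sample text is the status output shown above):

```shell
#!/bin/sh
# Pull the Mode value out of captured `zkServer.sh status` output,
# so a loop over hosts can count leaders and followers.
zk_mode() {
  echo "$1" | awk -F': ' '/^Mode:/ { print $2 }'
}

status_output="ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader"

zk_mode "$status_output"    # prints leader
```

A healthy three-node ensemble should yield exactly one `leader` and two `follower` results.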
Use WinSCP to upload hadoop-2.7.3.tar.gz to /opt on louisvv01.

1. Extract it:

```
[root@louisvv opt]# tar -zxf hadoop-2.7.3.tar.gz
```
2. Go into the extracted directory and edit the configuration files:

```
[root@louisvv opt]# cd /opt/hadoop-2.7.3/etc/hadoop/
```
First edit hadoop-env.sh and set JAVA_HOME:

```
[root@louisvv hadoop]# vim hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_91
```
Edit core-site.xml and add:

```xml
<configuration>
  <!-- Set the HDFS nameservice to ns1 -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <!-- Hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.7.3/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>louisvv01:2181,louisvv02:2181,louisvv03:2181</value>
  </property>
</configuration>
```
Edit hdfs-site.xml and add:

```xml
<configuration>
  <!-- Nameservice ns1; must match fs.defaultFS in core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <!-- ns1 has two NameNodes: nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>louisvv01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>louisvv01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>louisvv02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>louisvv02:50070</value>
  </property>
  <!-- Where the NameNode metadata (edit log) lives on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://louisvv01:8485;louisvv02:8485;louisvv03:8485/ns1</value>
  </property>
  <!-- Where each JournalNode stores its data on local disk -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hadoop-2.7.3/journal</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Failover proxy implementation used by clients -->
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods, one per line -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <!-- sshfence needs the passwordless-SSH private key (root's key in this setup) -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- sshfence connect timeout -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
```
Copy mapred-site.xml from its template:

```
[root@louisvv hadoop]# cp mapred-site.xml.template mapred-site.xml
```
Edit mapred-site.xml and add:

```xml
<configuration>
  <!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```
Edit yarn-site.xml and add:

```xml
<configuration>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- ResourceManager cluster id -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <!-- Logical ids of the two ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>louisvv01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>louisvv02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>louisvv01:2181,louisvv02:2181,louisvv03:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```
Edit the slaves file and list all three hosts:

```
[root@louisvv hadoop]# vim slaves
louisvv01
louisvv02
louisvv03
```
Copy the configured Hadoop to the other two nodes:

```
[root@louisvv opt]# scp -r /opt/hadoop-2.7.3 louisvv02:/opt/
[root@louisvv opt]# scp -r /opt/hadoop-2.7.3 louisvv03:/opt/
```
Add Hadoop to the environment variables (on all three machines):

```
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=:$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:
```
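Note that unconditionally appending to PATH like this duplicates entries every time /etc/profile is sourced again. A guarded append is a common alternative; a minimal sketch (the function name is mine):

```shell
#!/bin/sh
# Append a directory to PATH only if it is not already present,
# avoiding the duplicates that repeated `source /etc/profile` causes.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, do nothing
    *) PATH="$PATH:$1" ;;
  esac
}

path_append /opt/hadoop-2.7.3/bin    # safe to call from every login shell
```

Calling it twice with the same directory leaves PATH unchanged the second time.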
Part 6: Start the Hadoop cluster
1. Start the JournalNodes first (run on the primary node, louisvv01):

```
[root@louisvv opt]# hadoop-daemons.sh start journalnode
louisvv01: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-louisvv01.out
louisvv02: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-louisvv02.out
louisvv03: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-louisvv03.out
```
Check with jps that the JournalNode process came up on all three machines:

```
[root@louisvv opt]# jps
3162 JournalNode
3211 Jps
2671 QuorumPeerMain
```
2. Format HDFS (on the primary node, louisvv01):

```
[root@louisvv opt]# hdfs namenode -format
17/12/06 14:18:12 INFO common.Storage: Storage directory /opt/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.
```
Formatting creates files under the hadoop.tmp.dir configured in core-site.xml, here /opt/hadoop-2.7.3/tmp/. Copy that directory to the same path on louisvv02 so the standby NameNode starts from the same metadata:

```
[root@louisvv hadoop-2.7.3]# scp -r /opt/hadoop-2.7.3/tmp/ louisvv02:/opt/hadoop-2.7.3/
```
3. Format the ZKFC state in ZooKeeper (on louisvv01):

```
[root@louisvv hadoop-2.7.3]# hdfs zkfc -formatZK
17/12/06 15:05:32 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
```
4. Start HDFS:

```
[root@louisvv hadoop-2.7.3]# start-dfs.sh
Starting namenodes on [louisvv01 louisvv02]
louisvv02: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-louisvv02.out
louisvv01: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-louisvv01.out
louisvv03: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-louisvv03.out
louisvv01: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-louisvv01.out
louisvv02: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-louisvv02.out
Starting journal nodes [louisvv01 louisvv02 louisvv03]
louisvv02: journalnode running as process 3259. Stop it first.
louisvv01: journalnode running as process 3162. Stop it first.
louisvv03: journalnode running as process 3273. Stop it first.
Starting ZK Failover Controllers on NN hosts [louisvv01 louisvv02]
louisvv01: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-louisvv01.out
louisvv02: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-louisvv02.out
```

Because the JournalNodes were started earlier, the script reports them as already running.
Check with jps that the processes started.

louisvv01:

```
[root@louisvv01 ~]# jps
3954 Jps
3527 DataNode
3162 JournalNode
3434 NameNode
3838 DFSZKFailoverController
2671 QuorumPeerMain
```
louisvv02:
```
[root@louisvv02 ~]# jps
3466 DataNode
2939 QuorumPeerMain
3259 JournalNode
3612 DFSZKFailoverController
3727 Jps
3407 NameNode
```
louisvv03:
```
[root@louisvv03 ~]# jps
3520 Jps
3381 DataNode
2937 QuorumPeerMain
3273 JournalNode
```
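Comparing each jps listing against the cluster plan by eye is error-prone. The check can be scripted against captured jps output; a sketch (the function is mine, the sample listing is louisvv03's from above):

```shell
#!/bin/sh
# Verify that every expected daemon name appears in a captured
# `jps` listing; report the first missing one otherwise.
check_daemons() {
  _out="$1"; shift
  for d in "$@"; do
    echo "$_out" | grep -qw "$d" || { echo "missing: $d"; return 1; }
  done
  echo "all expected daemons running"
}

sample="3520 Jps
3381 DataNode
2937 QuorumPeerMain
3273 JournalNode"

check_daemons "$sample" DataNode QuorumPeerMain JournalNode
# prints: all expected daemons running
```

On a live node one would call it as `check_daemons "$(jps)" DataNode QuorumPeerMain JournalNode`, adding NameNode and DFSZKFailoverController on louisvv01 and louisvv02.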
5. Start YARN:

```
[root@louisvv01 ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-louisvv01.out
louisvv02: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-louisvv02.out
louisvv03: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-louisvv03.out
louisvv01: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-louisvv01.out
```
Start the standby ResourceManager on louisvv02:

```
[root@louisvv02 ~]# yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-louisvv02.out
```
6. Check jps again.
louisvv01:
```
[root@louisvv01 ~]# jps
4100 NodeManager
4005 ResourceManager
3527 DataNode
3162 JournalNode
3434 NameNode
4396 Jps
3838 DFSZKFailoverController
2671 QuorumPeerMain
```
louisvv02:
```
[root@louisvv02 ~]# jps
3892 ResourceManager
3770 NodeManager
3466 DataNode
2939 QuorumPeerMain
3259 JournalNode
3612 DFSZKFailoverController
3967 Jps
3407 NameNode
```
louisvv03:
```
[root@louisvv03 ~]# jps
3554 NodeManager
3381 DataNode
3654 Jps
2937 QuorumPeerMain
3273 JournalNode
```
7. View HDFS in the web UI.

One NameNode is Active and the other Standby, so HDFS HA is working.

8. View YARN in the web UI.

Open http://192.168.1.210:8088/cluster/cluster: the ResourceManager state is Active.

Open http://192.168.1.211:8088/cluster/cluster: the ResourceManager state is Standby.

ResourceManager HA is working.
9. Test HDFS HA.

First upload a file to HDFS:

```
[root@louisvv01 ~]# cat test.txt
hello world hello you hello me hello louisvv
[root@louisvv01 ~]# hadoop fs -put test.txt /
[root@louisvv01 ~]# hadoop fs -ls /
Found 1 items
-rw-r--r--   3 root supergroup         45 2017-12-06 15:25 /test.txt
```
Then kill the active NameNode on louisvv01:

```
[root@louisvv01 ~]# jps
4100 NodeManager
4005 ResourceManager
3527 DataNode
3162 JournalNode
3434 NameNode
4539 Jps
3838 DFSZKFailoverController
2671 QuorumPeerMain
[root@louisvv01 ~]# kill -9 3434
[root@louisvv01 ~]# jps
4100 NodeManager
4005 ResourceManager
4549 Jps
3527 DataNode
3162 JournalNode
3838 DFSZKFailoverController
2671 QuorumPeerMain
```
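Reading the pid off the jps listing by eye works, but the lookup is easy to script when repeating this failover test. A sketch (the helper is mine; the sample listing reuses pids from the transcript above):

```shell
#!/bin/sh
# Find the pid of a daemon by name in captured `jps` output,
# so killing a specific daemon can be scripted.
pid_of() {
  echo "$2" | awk -v d="$1" '$2 == d { print $1 }'
}

jps_sample="4100 NodeManager
3434 NameNode
3527 DataNode"

pid_of NameNode "$jps_sample"    # prints 3434
```

On a live node, `kill -9 "$(pid_of NameNode "$(jps)")"` would reproduce the manual kill shown above.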
The NameNode process is gone. Check the HDFS web UI on louisvv02: its state has switched to Active. The file uploaded earlier is still there:

```
[root@louisvv01 ~]# hadoop fs -ls /
Found 1 items
-rw-r--r--   3 root supergroup         45 2017-12-06 15:25 /test.txt
```
Restart the NameNode that was killed:

```
[root@louisvv01 ~]# hadoop-daemon.sh start namenode
starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-louisvv01.out
```

The web UI now shows it as Standby.
10. Run the classic wordcount:

```
[root@louisvv01 ~]# hadoop jar /opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /test.txt /out1
17/12/06 15:38:06 INFO mapreduce.Job:  map 0% reduce 0%
17/12/06 15:38:24 INFO mapreduce.Job:  map 100% reduce 0%
17/12/06 15:38:56 INFO mapreduce.Job:  map 100% reduce 100%
[root@louisvv01 ~]# hadoop fs -ls /out1/
Found 2 items
-rw-r--r--   3 root supergroup          0 2017-12-06 15:38 /out1/_SUCCESS
-rw-r--r--   3 root supergroup         37 2017-12-06 15:38 /out1/part-r-00000
[root@louisvv01 ~]# hadoop fs -cat /out1/part-r-00000
hello   4
louisvv 1
me      1
world   1
you     1
```
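As a sanity check, the same counts can be reproduced locally with coreutils, since test.txt is tiny. A sketch (the function is mine; the input is the test.txt content from above):

```shell
#!/bin/sh
# Reproduce the wordcount result locally: split on whitespace,
# sort, count duplicates, and print word<TAB>count like part-r-00000.
local_wordcount() {
  tr -s ' \t' '\n' | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

printf 'hello world hello you hello me hello louisvv\n' | local_wordcount
# prints hello 4, louisvv 1, me 1, world 1, you 1 (tab-separated)
```

The output matches the MapReduce result in /out1/part-r-00000, which is a quick way to confirm the job ran correctly.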
That wraps up this Hadoop HA cluster deployment walkthrough.