Setting up a big data learning environment (CentOS6.9+Hadoop2.7.3+Hive1.2.1+Hbase1.3.1+Spark2.1.1)
node1 192.168.1.11 | node2 192.168.1.12 | node3 192.168.1.13 | Notes
NameNode | Hadoop | Y | Y | High availability
DataNode | Y | Y
ResourceManager | Y | High availability
NodeManager | Y | Y | Y
JournalNodes | Y | Y | Y | Odd number, at least 3 nodes
ZKFC(DFSZKFailoverController) | Y | Y | Runs wherever a NameNode runs
QuorumPeerMain | Zookeeper | Y | Y | Y
HIVE | Hive metastore database
Metastore(RunJar)
HIVE(RunJar) | Y
HMaster | HBase | Y | Y | High availability
HRegionServer | Y | Y | Y
Spark(Master) | Spark | Y | High availability
Spark(Worker) | Y | Y
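The table asks for an odd number of JournalNodes, at least three. The usual reasoning is majority quorum: an edit must be acknowledged by more than half of the JournalNodes, so adding a fourth node does not let the cluster survive any more failures than three do. A minimal sketch of that arithmetic follows; the function names `quorum_size` and `tolerated_failures` are made up for illustration.

```shell
#!/bin/sh
# Majority needed among n quorum members (integer division).
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority still survives.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Compare 3, 4 and 5 members: 4 tolerates no more failures than 3.
for n in 3 4 5; do
    echo "nodes=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

The same rule applies to the ZooKeeper ensemble (QuorumPeerMain), which is why it also runs on all three nodes.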
apache-ant-1.9.9-bin.tar.gz
apache-hive-1.2.1-bin.tar.gz
apache-maven-3.3.9-bin.tar.gz
apache-tomcat-6.0.44.tar.gz
CentOS-6.9-x86_64-minimal.iso
findbugs-3.0.1.tar.gz
hadoop-2.7.3-src.tar.gz
hadoop-2.7.3.tar.gz
hadoop-2.7.3(自已编译的centOS6.9版本).tar.gz (self-compiled build for CentOS 6.9)
hbase-1.3.1-bin(自己编译).tar.gz (self-compiled build)
hbase-1.3.1-src.tar.gz
jdk-8u121-linux-x64.tar.gz
mysql-connector-java-5.6-bin.jar
protobuf-2.5.0.tar.gz
scala-2.11.11.tgz
snappy-1.1.3.tar.gz
spark-2.1.1-bin-hadoop2.7.tgz
Disable the firewall
[root@node1 ~]# service iptables stop
[root@node1 ~]# chkconfig iptables off
ZooKeeper
[root@node1 ~]# wget -O /root/zookeeper-3.4.9.tar.gz https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.9/zookeeper-3.4.9.tar.gz
[root@node1 ~]# tar -zxvf /root/zookeeper-3.4.9.tar.gz -C /root
[root@node1 ~]# cp /root/zookeeper-3.4.9/conf/zoo_sample.cfg /root/zookeeper-3.4.9/conf/zoo.cfg
[root@node1 ~]# vi /root/zookeeper-3.4.9/conf/zoo.cfg
[root@node1 ~]# vi /root/zookeeper-3.4.9/bin/zkEnv.sh
[root@node1 ~]# mkdir /root/zookeeper-3.4.9/logs
[root@node1 ~]# vi /root/zookeeper-3.4.9/conf/log4j.properties
[root@node1 ~]# mkdir /root/zookeeper-3.4.9/zkData
[root@node1 ~]# scp -r /root/zookeeper-3.4.9 node2:/root
[root@node1 ~]# scp -r /root/zookeeper-3.4.9 node3:/root
[root@node1 ~]# touch /root/zookeeper-3.4.9/zkData/myid
[root@node1 ~]# echo 1 > /root/zookeeper-3.4.9/zkData/myid
[root@node2 ~]# touch /root/zookeeper-3.4.9/zkData/myid
[root@node2 ~]# echo 2 > /root/zookeeper-3.4.9/zkData/myid
[root@node3 ~]# touch /root/zookeeper-3.4.9/zkData/myid
[root@node3 ~]# echo 3 > /root/zookeeper-3.4.9/zkData/myid
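The steps above edit zoo.cfg without showing what goes into it. For a three-node ensemble the file normally needs a `dataDir` (presumably /root/zookeeper-3.4.9/zkData here, since ZooKeeper reads myid from the data directory) and one `server.N` line per node, where N matches that node's myid. The sketch below only prints such lines; the ports 2888 and 3888 are the conventional ones from the ZooKeeper documentation and are an assumption, not something the original article states, and `zk_server_lines` is a hypothetical helper name.

```shell
#!/bin/sh
# Print one "server.N=host:2888:3888" line per host, numbering from 1.
# N must equal the number written to that host's myid file.
# Ports 2888 (peer) and 3888 (leader election) are assumed defaults.
zk_server_lines() {
    id=1
    for host in "$@"; do
        printf 'server.%s=%s:2888:3888\n' "$id" "$host"
        id=$((id + 1))
    done
}

zk_server_lines node1 node2 node3
```

The printed lines could be appended to zoo.cfg on node1 before the directory is copied to node2 and node3, so all three nodes share the same server list.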
Environment variables
[root@node1 ~]# vi /etc/profile
export JAVA_HOME=/root/jdk1.8.0_121
export SCALA_HOME=/root/scala-2.11.11
export HADOOP_HOME=/root/hadoop-2.7.3
export HIVE_HOME=/root/apache-hive-1.2.1-bin
export HBASE_HOME=/root/hbase-1.3.1
export SPARK_HOME=/root/spark-2.1.1-bin-hadoop2.7
export PATH=.:$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:/root:$HIVE_HOME/bin:$HBASE_HOME/bin:$SPARK_HOME
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
[root@node1 ~]# source /etc/profile
[root@node1 ~]# scp /etc/profile node2:/etc
[root@node2 ~]# source /etc/profile
[root@node1 ~]# scp /etc/profile node3:/etc
[root@node3 ~]# source /etc/profile
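After sourcing /etc/profile on each node, it can help to confirm that every `*_HOME` variable points at a directory that actually exists, since several of these packages are only unpacked in later steps. The following is a minimal sketch of such a check; `check_home` is a hypothetical helper, not part of any of the installed tools.

```shell
#!/bin/sh
# Report whether the directory behind an environment variable exists.
# Arguments: variable name (for display only), directory path.
check_home() {
    if [ -d "$2" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 MISSING: $2"
    fi
}

# An unset variable is reported as MISSING with the placeholder "unset".
check_home JAVA_HOME   "${JAVA_HOME:-unset}"
check_home SCALA_HOME  "${SCALA_HOME:-unset}"
check_home HADOOP_HOME "${HADOOP_HOME:-unset}"
check_home HIVE_HOME   "${HIVE_HOME:-unset}"
check_home HBASE_HOME  "${HBASE_HOME:-unset}"
check_home SPARK_HOME  "${SPARK_HOME:-unset}"
```

A MISSING line at this point is expected for anything not yet installed; the check is more useful once all packages are in place.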
Hadoop
[root@node1 ~]# wget -O /root/hadoop-2.7.3.tar.gz http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
[root@node1 ~]# tar -zxvf /root/hadoop-2.7.3.tar.gz -C /root
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.blocksize</name>
  <value>64m</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>node1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>node2:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>node1:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>node2:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/root/hadoop-2.7.3/tmp/journal</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled.mycluster</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
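The value of `dfs.namenode.shared.edits.dir` is easy to mistype because it joins the JournalNode hosts with semicolons and ends with the nameservice ID. As a rough sketch, the string can be assembled from a host list and compared with what was typed into hdfs-site.xml; `qjournal_uri` is a hypothetical helper, and port 8485 is the one used in the configuration above.

```shell
#!/bin/sh
# Build a qjournal:// URI from a journal ID followed by JournalNode hosts.
# Each host gets port 8485, matching the hdfs-site.xml in this article.
qjournal_uri() {
    journal_id=$1
    shift
    hosts=""
    for host in "$@"; do
        # Prefix a ';' separator only when hosts is already non-empty.
        hosts="${hosts:+$hosts;}$host:8485"
    done
    echo "qjournal://$hosts/$journal_id"
}

qjournal_uri mycluster node1 node2 node3
```

The journal ID passed first should be the same name as `dfs.nameservices`, which is mycluster in this setup.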
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/root/hadoop-2.7.3/tmp</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>node1:2181,node2:2181,node3:2181</value>
</property>
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/slaves
node1
node2
node3
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/yarn-env.sh
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.max-age-ms</name>
    <value>6048000000</value>
  </property>
</configuration>
[root@node1 ~]# vi /root/hadoop-2.7.3/etc/hadoop/yarn-site.xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>node1</value>
</property>
<property>