Building a Spark Cluster: VMware 15 + CentOS 7 + Hadoop + Scala + Spark + Zookeeper + HBase + Hive
Posted by 轮回路上打碟的小年轻
1 Purpose
A record of building a Spark cluster framework, with notes from hands-on self-study.
2 Prerequisites
- VMware 15 Pro
- CentOS 7
- JDK 1.8
- Hadoop 2.7.2
- SecureCRT version 8.5
- Scala 2.12.7
- Spark 2.3.1
- Zookeeper 3.4.10
- HBase 2.0.2
- Hive 2.3.4
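3 Installation
3.1 Installing CentOS 7 in a virtual machine
3.1.1 Virtual machine settings
1) Open VMware 15 Pro and create a new virtual machine.
2) Choose the Typical installation.
3) Choose to install the operating system later, using the CentOS 7 image already downloaded locally.
3.1.2 Installing the Linux system
1) Load the CentOS 7 installation image.
2) Power on the virtual machine; the installer loads automatically.
3) Configure the CentOS 7 installation settings.
4) The default software selection is "Minimal Install", which leaves many packages to add by hand afterwards, so change it to "GNOME Desktop".
5) User settings: to avoid switching users and permissions repeatedly while building the Hadoop cluster later, you can create only the root account.
6) Finish the installation.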
3.2 Java environment
3.2.1 Removing the JDK bundled with Linux
1) Check the JDK that ships with the system
[root@master ~]# java -version
openjdk version "1.8.0_161"
OpenJDK Runtime Environment (build 1.8.0_161-b14)
OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
2) List the bundled Java packages. Depending on the system version, run rpm -qa | grep jdk or rpm -qa | grep java
[root@master ~]# rpm -qa | grep jdk
java-1.7.0-openjdk-headless-1.7.0.171-2.6.13.2.el7.x86_64
java-1.8.0-openjdk-headless-1.8.0.161-2.b14.el7.x86_64
java-1.7.0-openjdk-1.7.0.171-2.6.13.2.el7.x86_64
java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64
copy-jdk-configs-3.3-2.el7.noarch
3) Remove every package except the noarch one by running rpm -e --nodeps followed by the package name
[root@master ~]# rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.171-2.6.13.2.el7.x86_64
[root@master ~]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.161-2.b14.el7.x86_64
[root@master ~]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.171-2.6.13.2.el7.x86_64
[root@master ~]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64
4) Confirm that everything has been removed
[root@master ~]# java -version
bash: /usr/bin/java: No such file or directory
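3.2.2 Downloading and installing a current JDK
There are two ways to get the JDK onto the virtual machine:
A. Download it inside the virtual machine with the bundled Firefox browser; by default it lands in the Linux Downloads folder.
B. Download it to the local Windows system, then upload it to the virtual machine with SecureCRT or a similar tool. This walkthrough uses method B.
[root@master ~]# rz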
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring jdk-8u181-linux-x64.tar.gz...
100% 181295 KB 36259 KB/sec 00:00:05 0 Errors
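Because this machine is logged in directly as root, the rz command places the jdk archive in root's home directory (/root).
Move the jdk archive into a system directory. You can do this with commands such as mkdir, or locate the file in the file manager and move it by hand. This walkthrough first creates a java directory under /opt and then places the archive in /opt/java.
Extract the archive with tar -zxvf jdk-8u181-linux-x64.tar.gz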
[root@master ~]# cd /opt/java
[root@master java]# tar -zxvf jdk-8u181-linux-x64.tar.gz
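3.2.3 Setting the environment variables
1) Open /etc/profile for editing with vi /etc/profile or vim /etc/profile (look up the vim editing commands as needed); you can also open /etc/profile from the desktop file manager. Then append the following to the end of the file.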
#java environment
export JAVA_HOME=/opt/java/jdk1.8.0_181
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
2) Run source /etc/profile to apply the change, then run java -version again to confirm that Java is installed
[root@master ~]# source /etc/profile
[root@master ~]# java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
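Later sections append more export lines to /etc/profile, and it is easy to add the same line twice by accident. The following is a minimal sketch, assuming a helper named append_once (a hypothetical name, not part of any tool used here), that appends a line only when it is not already present. It is demonstrated on a scratch file rather than the real /etc/profile.

```shell
# append_once FILE LINE: append LINE to FILE unless that exact line already exists.
# (append_once is a hypothetical helper introduced for this example.)
append_once() {
  local file="$1" line="$2"
  # -x matches whole lines, -F treats LINE as a fixed string, -q suppresses output
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file instead of the real /etc/profile.
profile=$(mktemp)
append_once "$profile" 'export JAVA_HOME=/opt/java/jdk1.8.0_181'
append_once "$profile" 'export PATH=$PATH:${JAVA_HOME}/bin'
append_once "$profile" 'export PATH=$PATH:${JAVA_HOME}/bin'   # second call changes nothing
cat "$profile"
```

Pointing the first argument at /etc/profile would apply the same idea to the real file; back it up first.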
3.3 Passwordless SSH login
3.3.1 Preparation
1) Check whether SSH is installed; most Linux systems ship with it
[root@master ~]# rpm -qa |grep ssh
openssh-clients-7.4p1-16.el7.x86_64
libssh2-1.4.3-10.el7_2.1.x86_64
openssh-7.4p1-16.el7.x86_64
openssh-server-7.4p1-16.el7.x86_64
2) Edit /etc/hosts with vi /etc/hosts to map each machine name to its IP (address first, then hostname)
192.168.31.237 master
192.168.31.238 slave1
192.168.31.239 slave2
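Each entry in /etc/hosts starts with the address and is followed by the hostname. The sketch below, which assumes a hypothetical helper named check_hosts_file, flags lines in a hosts-style file that do not begin with an IPv4 or IPv6 address. It runs against a scratch file here.

```shell
# check_hosts_file FILE: report entries that do not start with an IP address.
# (check_hosts_file is a hypothetical helper introduced for this example.)
check_hosts_file() {
  local bad
  # Drop comments and blank lines, then keep only lines that do NOT start
  # with an IPv4 or IPv6 address followed by whitespace and a name.
  bad=$(grep -vE '^[[:space:]]*(#|$)' "$1" |
    grep -vE '^[[:space:]]*(([0-9]{1,3}\.){3}[0-9]{1,3}|[0-9A-Fa-f:]*:[0-9A-Fa-f:]*)[[:space:]]+[^[:space:]]+')
  [ -z "$bad" ] && return 0
  printf '%s\n' "$bad" | sed 's/^/unexpected entry: /'
  return 1
}

# Demonstration with the three cluster entries in a scratch file.
hosts=$(mktemp)
printf '%s\n' '192.168.31.237 master' '192.168.31.238 slave1' '192.168.31.239 slave2' > "$hosts"
check_hosts_file "$hosts" && echo "all entries start with an address"
```

Running check_hosts_file /etc/hosts on each node is a quick way to catch an entry written hostname-first.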
3.3.2 Setting up passwordless login
1) Generate the public and private keys
[root@master ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in y.
Your public key has been saved in y.pub.
The key fingerprint is:
SHA256:+cCJUbTOrw0ON9gjKK7D5rsdNRcWlrNFXxpZpDY2jM4 root@slave2
The key's randomart image is:
+---[RSA 2048]----+
| +=. .++ |
| .+.o+.= |
| .o=. X |
| .B+oo o |
| o..SE |
| ..oo + |
|. ... + * o |
|.+... = * |
|+*+. o . |
+----[SHA256]-----+
[root@master ~]#
2) Merge the public keys into the authorized_keys file: on the master server, change to /root/.ssh and collect the keys over SSH
[root@master ~]# cd /root/.ssh
[root@master .ssh]# cat id_rsa.pub >> authorized_keys
[root@master .ssh]# ssh root@192.168.31.238 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@master .ssh]# ssh root@192.168.31.239 cat ~/.ssh/id_rsa.pub >> authorized_keys
3) Copy authorized_keys and known_hosts from the master server to the /root/.ssh directory on each slave server.
scp -r /root/.ssh/authorized_keys root@192.168.31.238:/root/.ssh/
scp -r /root/.ssh/known_hosts root@192.168.31.238:/root/.ssh/
scp -r /root/.ssh/authorized_keys root@192.168.31.239:/root/.ssh/
scp -r /root/.ssh/known_hosts root@192.168.31.239:/root/.ssh/
4) Verify that passwordless login to the other machines works
[root@master ~]# ssh slave1
Last login: Mon Oct 1 16:43:06 2018
[root@slave1 ~]# ssh master
Last login: Mon Oct 1 16:43:58 2018 from slave1
[root@master ~]# ssh slave2
Last login: Mon Oct 1 16:43:33 2018
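Logging in by hand confirms one hop at a time. As a rough sketch, the check below loops over hosts and uses ssh's BatchMode option, which makes ssh fail instead of prompting for a password. The function name check_passwordless and the SSH_CMD override variable are assumptions made for this example.

```shell
# check_passwordless HOST...: report whether root can log in to each host
# without a password prompt. (Hypothetical helper for this example.)
# SSH_CMD is an assumed override that lets the loop be exercised without a cluster.
check_passwordless() {
  local host failed=0
  for host in "$@"; do
    # BatchMode=yes disables password prompts, so a missing key is a failure
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
      echo "${host}: ok"
    else
      echo "${host}: FAILED"
      failed=1
    fi
  done
  return "$failed"
}

# Usage on a cluster node:
#   check_passwordless master slave1 slave2
```

If a host fails, ssh-copy-id root@HOST is another common way to install the public key on it.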
bug
How do you fix a virtual machine that cannot reach the external network?
[root@master ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
# no IPv4 address has been assigned
inet6 fe80::20c:29ff:fe72:641f prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:72:64:1f txqueuelen 1000 (Ethernet)
RX packets 12335 bytes 1908583 (1.8 MiB)
RX errors 0 dropped 868 overruns 0 frame 0
TX packets 11 bytes 828 (828.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
virbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
ether 52:54:00:cb:c7:a8 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@master ~]# service network start
Restarting network (via systemctl): Job for network.service failed because the control process exited with error code. See "systemctl status network.service" and "journalctl -xe" for details.
[FAILED]
[root@master ~]# systemctl status network.service
● network.service - LSB: Bring up/down networking
Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2018-12-05 16:59:04 CST; 1min 7s ago
Docs: man:systemd-sysv-generator(8)
Process: 4546 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master systemd[1]: network.service: control process exited, code...=1
Dec 05 16:59:04 master systemd[1]: Failed to start LSB: Bring up/down networking.
Dec 05 16:59:04 master systemd[1]: Unit network.service entered failed state.
Dec 05 16:59:04 master systemd[1]: network.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
[root@master ~]# tail -f /var/log/messages
Dec 5 16:59:04 master network: RTNETLINK answers: File exists
Dec 5 16:59:04 master network: RTNETLINK answers: File exists
Dec 5 16:59:04 master systemd: network.service: control process exited, code=exited status=1
Dec 5 16:59:04 master systemd: Failed to start LSB: Bring up/down networking.
Dec 5 16:59:04 master systemd: Unit network.service entered failed state.
Dec 5 16:59:04 master systemd: network.service failed.
Dec 5 17:00:01 master systemd: Started Session 10 of user root.
Dec 5 17:00:01 master systemd: Starting Session 10 of user root.
Dec 5 17:01:01 master systemd: Started Session 11 of user root.
Dec 5 17:01:01 master systemd: Starting Session 11 of user root.
[root@master ~]# cat /var/log/messages | grep network
Dec 5 14:09:20 master kernel: drop_monitor: Initializing network drop monitor service
Dec 5 14:09:43 master systemd: Starting Import network configuration from initramfs...
Dec 5 14:09:43 master systemd: Started Import network configuration from initramfs.
Dec 5 14:10:01 master systemd: Starting LSB: Bring up/down networking...
Dec 5 14:10:08 master network: Bringing up loopback interface: [ OK ]
Dec 5 14:10:09 master network: Bringing up interface ens33: ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Error, some other host (70:85:C2:03:8E:AF) already uses address 192.168.31.237.
Dec 5 14:10:09 master /etc/sysconfig/network-scripts/ifup-eth: Error, some other host (70:85:C2:03:8E:AF) already uses address 192.168.31.237.
Dec 5 14:10:09 master network: [FAILED]
Dec 5 14:10:09 master systemd: network.service: control process exited, code=exited status=1
Dec 5 14:10:09 master systemd: Failed to start LSB: Bring up/down networking.
Dec 5 14:10:09 master systemd: Unit network.service entered failed state.
Dec 5 14:10:09 master systemd: network.service failed.
Dec 5 14:11:46 master pulseaudio: GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
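The telling lines in the log above are the ifup-eth errors saying that another host already uses the address. A small sketch, assuming a hypothetical helper named find_ip_conflicts, that pulls the conflicting MAC and IP out of a log file so they do not have to be spotted by eye:

```shell
# find_ip_conflicts LOGFILE: print "MAC IP" for every address-conflict error
# reported by ifup-eth. (Hypothetical helper introduced for this example.)
find_ip_conflicts() {
  # The first \( \) group captures the MAC inside the literal parentheses,
  # the second captures the dotted address without the trailing full stop.
  sed -n 's/.*some other host (\([0-9A-Fa-f:]*\)) already uses address \([0-9.]*[0-9]\).*/\1 \2/p' "$1" |
    sort -u
}

# Usage on the affected machine:
#   find_ip_conflicts /var/log/messages
```

For the two error lines shown above it prints 70:85:C2:03:8E:AF 192.168.31.237, which points to an address clash on the LAN rather than a fault inside the virtual machine.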
solution
The fix depends on the situation; the main options are:
# 01 Edit the ifcfg-ens33 file (many online guides say ens33 must be renamed to eth0, but that is unnecessary)
[root@master ~]# cd /etc/sysconfig/network-scripts
[root@master network-scripts]# ls
ifcfg-ens33 ifdown-isdn ifup ifup-plip ifup-tunnel
ifcfg-lo ifdown-post ifup-aliases ifup-plusb ifup-wireless
ifdown ifdown-ppp ifup-bnep ifup-post init.ipv6-global
ifdown-bnep ifdown-routes ifup-eth ifup-ppp network-functions
ifdown-eth ifdown-sit ifup-ib ifup-routes network-functions-ipv6
ifdown-ib ifdown-Team ifup-ippp ifup-sit
ifdown-ippp ifdown-TeamPort ifup-ipv6 ifup-Team
ifdown-ipv6 ifdown-tunnel ifup-isdn ifup-TeamPort
[root@master network-scripts]# vi ifcfg-ens33
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
# use a static IP
BOOTPROTO="static"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="cecb46d8-4d6e-4678-b2f4-445b9f09c73d"
DEVICE="ens33"
# bring the interface up at boot
ONBOOT="yes"
IPADDR=192.168.31.237
NETMASK=255.255.255.0
GATEWAY=192.168.31.1
DNS1=192.168.31.1
# 02 If the current IP is already in use, choose a new static IP and update both /etc/hosts and /etc/sysconfig/network-scripts/ifcfg-ens33
[root@master ~]# vi /etc/hosts
[root@master ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens33
[root@master ~]# service network restart
Restarting network (via systemctl): [ OK ]
# 03 Stop and disable the NetworkManager service
[root@master ~]# systemctl stop NetworkManager
[root@master ~]# systemctl disable NetworkManager
Removed symlink /etc/systemd/system/multi-user.target.wants/NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service.
[root@master ~]# systemctl restart network
# the steps above finally resolved the problem
[root@master ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.31.237 netmask 255.255.255.0 broadcast 192.168.31.255
inet6 fe80::20c:29ff:fe72:641f prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:72:64:1f txqueuelen 1000 (Ethernet)
RX packets 341 bytes 32414 (31.6 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 61 bytes 7540 (7.3 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 2 bytes 108 (108.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2 bytes 108 (108.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
virbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
ether 52:54:00:cb:c7:a8 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
3.4 Hadoop 2.7.2 installation and cluster configuration
3.4.1 Installing Hadoop
1) As with the JDK, upload the archive and extract it under /opt/hadoop
2) Configure the Hadoop environment variables
[root@master ~]# vim /etc/profile
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
:x
[root@master ~]# source /etc/profile
3) Verify the installation.
[root@master ~]# hadoop version
Hadoop 2.7.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41
Compiled by jenkins on 2016-01-26T00:08Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/hadoop/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar
3.4.2 Cluster configuration
1) Under /opt/hadoop, create the directories that hold the data: tmp, dfs, dfs/data, and dfs/name
2) Change to the Hadoop configuration directory.
[root@master ~]# cd /opt/hadoop/hadoop-2.7.2/etc/hadoop
[root@master hadoop]# ls
capacity-scheduler.xml httpfs-env.sh mapred-env.sh
configuration.xsl httpfs-log4j.properties mapred-queues.xml.template
container-executor.cfg httpfs-signature.secret mapred-site.xml.template
core-site.xml httpfs-site.xml slaves
hadoop-env.cmd kms-acls.xml ssl-client.xml.example
hadoop-env.sh kms-env.sh ssl-server.xml.example
hadoop-metrics2.properties kms-log4j.properties yarn-env.cmd
hadoop-metrics.properties kms-site.xml yarn-env.sh
hadoop-policy.xml log4j.properties yarn-site.xml
hdfs-site.xml mapred-env.cmd
3) Configure the core-site.xml file
vi core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
4) Configure the hdfs-site.xml file
vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///opt/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///opt/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:50090</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
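5) Configure the mapred-site.xml file. Hadoop reads mapred-site.xml, not the template, so copy the template first
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml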
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>master:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>http://master:9001</value>
</property>
</configuration>
6) Configure the yarn-site.xml file
vi yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
</configuration>
7) Set JAVA_HOME in hadoop-env.sh and yarn-env.sh
[root@master hadoop]# vi hadoop-env.sh
[root@master hadoop]# vi yarn-env.sh
8) Edit slaves and add the two slave nodes.
# remove the default localhost entry
slave1
slave2
9) Use scp to copy the configured Hadoop from the master server to the same location on each node.
[root@master hadoop]# scp -r /opt/hadoop root@192.168.31.238:/opt/
[root@master hadoop]# scp -r /opt/hadoop root@192.168.31.239:/opt/
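After editing several XML files by hand and copying them to other nodes, it can help to read a value back from the file. The sketch below assumes a hypothetical helper named get_hadoop_prop and assumes the files keep each name and value tag on a single line, as in the listings above. It is a line-based text match, not a real XML parser.

```shell
# get_hadoop_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# (Hypothetical helper; assumes each <name> and <value> element fits on one line.)
get_hadoop_prop() {
  awk -v key="$2" '
    /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
    /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                if (n == key) { print v; exit } }
  ' "$1"
}

# Usage on a cluster node:
#   get_hadoop_prop /opt/hadoop/hadoop-2.7.2/etc/hadoop/core-site.xml fs.default.name
#   get_hadoop_prop /opt/hadoop/hadoop-2.7.2/etc/hadoop/hdfs-site.xml dfs.replication
```

On a node with Hadoop on the PATH, hdfs getconf -confKey NAME should report the value that Hadoop itself resolves, which is the more authoritative check.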
3.4.3 Starting Hadoop
1) On the master server, change to the Hadoop directory and format the NameNode
[root@master ~]# cd /opt/hadoop/hadoop-2.7.2
[root@master hadoop-2.7.2]# bin/hdfs namenode -format
2) Start and stop commands
sbin/start-dfs.sh
sbin/start-yarn.sh
sbin/stop-dfs.sh
sbin/stop-yarn.sh
3) Run jps to check the running daemons.
- master
[root@master hadoop-2.7.2]# jps
8976 Jps
8710 ResourceManager
8559 SecondaryNameNode
- slave
[root@slave1 ~]# jps
4945 Jps
3703 DataNode
4778 NodeManager
4) Check the ports
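Reading jps listings by eye makes it easy to miss a daemon that never started. A minimal sketch, assuming a hypothetical helper named missing_daemons, that reads jps output on standard input and prints each expected daemon that is absent:

```shell
# missing_daemons NAME...: read `jps` output on stdin and print every expected
# daemon name that does not appear. (Hypothetical helper for this example.)
missing_daemons() {
  local output name
  output=$(cat)
  for name in "$@"; do
    # jps prints "<pid> <ClassName>"; match the whole line so that
    # SecondaryNameNode is not mistaken for NameNode
    printf '%s\n' "$output" | grep -qE "^[0-9]+ ${name}\$" || echo "$name"
  done
}

# Usage on the master node:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# Usage on a slave node:
#   jps | missing_daemons DataNode NodeManager
```

Fed the master listing above, it prints NameNode, because that listing shows no NameNode process; the NameNode log under the Hadoop logs directory is the place to look in that case.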
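3.5 Spark installation and environment configuration
3.5.1 Installing Scala
3.5.2 Installing Spark
3.5.3 Starting Spark
Stop or start the firewall as needed.
# start the firewall
[root@master ~]# systemctl start firewalld.service
# stop the firewall
[root@master ~]# systemctl stop firewalld.service
# enable the firewall at boot
[root@master ~]# systemctl enable firewalld.service
# disable the firewall at boot
[root@master ~]# systemctl disable firewalld.service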
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Start the Hadoop nodes.
[root@master ~]# cd /opt/hadoop/hadoop-2.7.2/
[root@master hadoop-2.7.2]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-namenode-master.out
slave1: starting datanode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-datanode-slave1.out
slave2: starting datanode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-datanode-slave2.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/hadoop-2.7.2/logs/yarn-root-resourcemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.2/logs/yarn-root-nodemanager-slave2.out
slave1: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.2/logs/yarn-root-nodemanager-slave1.out
[root@master hadoop-2.7.2]# jps
3648 SecondaryNameNode
4099 Jps
3801 ResourceManager
Start Spark.
[root@master hadoop-2.7.2]# cd /opt/spark/spark-2.3.1-bin-hadoop2.7
[root@master spark-2.3.1-bin-hadoop2.7]# sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/spark-2.3.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
slave1: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark/spark-2.3.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
slave2: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark/spark-2.3.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
Test the Spark cluster (on the master node).
3.6 Zookeeper installation and environment configuration
3.6.1 Installing Zookeeper
Extract the archive and place it under /opt/zookeeper.
[root@master ~]# tar -xvf zookeeper-3.4.10.tar.gz
[root@master ~]# mkdir -p /opt/zookeeper
[root@master ~]# mv zookeeper-3.4.10 /opt/zookeeper/zookeeper3.4
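Create the data and log directories under the zookeeper path.
[root@master ~]# mkdir /opt/zookeeper/data
[root@master ~]# mkdir /opt/zookeeper/log
# create the myid file under /opt/zookeeper/data
[root@master ~]# cd /opt/zookeeper/data
[root@master data]# touch myid
[root@master data]# vi myid
# set the value to 1
1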
Edit the zoo.cfg file.
[root@master ~]# cd /opt/zookeeper/zookeeper3.4/conf
# create zoo.cfg from the sample
[root@master conf]# cp zoo_sample.cfg zoo.cfg
[root@master conf]# vi zoo.cfg
The changes are as follows:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
#dataDir=/tmp/zookeeper
# the port at which the clients will connect
# the default port is 2181, but it was already in use on this machine, so it is changed to 2381
clientPort=2381
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
quorumListenOnAllIPs=true
# added settings
dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/log
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
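Copy Zookeeper to slave1 and slave2. On each node, the myid file under /opt/zookeeper/data must hold the number from that node's server.N line in zoo.cfg (1 on master, 2 on slave1, 3 on slave2), so edit myid on the slaves after copying.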
[root@master conf]# scp -r /opt/zookeeper root@slave1:/opt
[root@master conf]# scp -r /opt/zookeeper root@slave2:/opt
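Because scp copies the master's myid to every node, the value has to be corrected on each slave. As a sketch under stated assumptions, the hypothetical helper myid_for_host below looks up a host's number from the server.N lines of zoo.cfg, so that myid can be derived rather than typed. It assumes the hostname appears in zoo.cfg exactly as given and contains no regular-expression metacharacters.

```shell
# myid_for_host ZOO_CFG HOST: print N from the line "server.N=HOST:port:port".
# (Hypothetical helper introduced for this example.)
myid_for_host() {
  sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1"
}

# Usage on each node, assuming `hostname` returns master, slave1 or slave2:
#   myid_for_host /opt/zookeeper/zookeeper3.4/conf/zoo.cfg "$(hostname)" > /opt/zookeeper/data/myid
```

With the zoo.cfg above, the lookup gives 1 for master, 2 for slave1, and 3 for slave2.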
3.6.2 Zookeeper environment configuration
Edit the environment variables.
[root@master ~]# vi /etc/profile
Add the following:
export ZK_HOME=/opt/zookeeper/zookeeper3.4
export PATH=$PATH:${ZK_HOME}/bin
Apply the change.
[root@master ~]# source /etc/profile
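3.6.3 Starting Zookeeper
Once Zookeeper is installed and configured, start it on every machine. Note: Zookeeper elects its leader, so which node reports Mode: leader depends on the outcome of the election.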
slave1
[root@slave1 ~]# cd /opt/zookeeper/zookeeper3.4/bin
[root@slave1 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper3.4/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@slave1 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper3.4/bin/../conf/zoo.cfg
Mode: leader
slave2
[root@slave2 ~]# cd /opt/zookeeper/zookeeper3.4/bin
[root@slave2 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper3.4/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@slave2 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper3.4/bin/../conf/zoo.cfg
Mode: follower
master
[root@master ~]# cd /opt/zookeeper/zookeeper3.4/bin
[root@master bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper3.4/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 3944.
[root@master bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper3.4/bin/../conf/zoo.cfg
Mode: follower
3.7 HBase installation and environment configuration
3.7.1 Installing HBase
Upload the HBase archive to /opt/HBase and extract it there.
[root@master ~]# mkdir /opt/HBase
[root@master ~]# cd /opt/HBase
[root@master HBase]# rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring hbase-2.0.2-bin.tar.gz...
100% 150220 KB 10730 KB/sec 00:00:14 0 Errors
[root@master HBase]# tar -xvf hbase-2.0.2-bin.tar.gz
3.7.2 Editing the configuration files
Change to /opt/HBase/hbase-2.0.2/conf.
Edit hbase-env.sh and add the following settings.
export JAVA_HOME=/opt/java/jdk1.8.0_181
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HBASE_HOME=/opt/HBase/hbase-2.0.2
export HBASE_CLASSPATH=/opt/hadoop/hadoop-2.7.2/etc/hadoop
export HBASE_PID_DIR=/root/hbase/pids
export HBASE_MANAGES_ZK=false
Edit hbase-site.xml and add the following settings.
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<!-- ZooKeeper client port; must match clientPort in zoo.cfg -->
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2381</value>
</property>
<!-- session timeout -->
<property>
<name>zookeeper.session.timeout</name>
<value>120000</value>
</property>
<!-- tolerate clock skew between servers -->
<property>
<name>hbase.master.maxclockskew</name>
<value>150000</value>
</property>
<!-- ZooKeeper quorum hosts -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave1,slave2</value>
</property>
<!-- temporary directory -->
<property>
<name>hbase.tmp.dir</name>
<value>/root/hbase/tmp</value>
</property>
<!-- true means distributed mode -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- HBase master address -->
<property>
<name>hbase.master</name>
<value>master:60000</value>
</property>
</configuration>
Edit regionservers to list the hosts that run a RegionServer.
master
slave1
slave2
Copy the configured installation to the other machines.
scp -r /opt/HBase root@slave1:/opt
scp -r /opt/HBase root@slave2:/opt
Configure the environment variables.
[root@master ~]# vi /etc/profile
Add the following:
export HBASE_HOME=/opt/HBase/hbase-2.0.2
export PATH=$PATH:${HBASE_HOME}/bin
Apply the change.
[root@master ~]# source /etc/profile
Check the installation.
[root@master ~]# hbase version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/HBase/hbase-2.0.2/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.0.2
Source code repository git://ve0524/home/stack/hbase.git revision=1cfab033e779df840d5612a85277f42a6a4e8172
Compiled by stack on Tue Aug 28 20:50:40 PDT 2018
From source with checksum 80b9ac6ea66f2b5cf6bc3ce886f3fc67
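3.7.3 Starting HBase
HBase runs on top of the distributed file system provided by Hadoop, so make sure Hadoop is running before you start HBase. HBase also depends on Zookeeper. It could use its bundled Zookeeper, but the configuration above points it at our own Zookeeper cluster, so Zookeeper must be running as well. HBase can be installed on just the NameNode host or on every Hadoop node, but it only needs to be started from one node. In this example HBase is installed on master, slave1, and slave2, and started only from master.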
[root@master ~]# cd /opt/HBase/hbase-2.0.2/bin
[root@master bin]# start-hbase.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/HBase/hbase-2.0.2/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /opt/HBase/hbase-2.0.2/logs/hbase-root-master-master.out
slave4: running regionserver, logging to /opt/HBase/hbase-2.0.2/logs/hbase-root-regionserver-slave4.out
slave3: running regionserver, logging to /opt/HBase/hbase-2.0.2/logs/hbase-root-regionserver-slave3.out
bug
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/HBase/hbase-2.0.2/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
solution
Two jars provide an SLF4J binding and conflict with each other; removing one of them clears the message.
[root@master bin]# cd /opt/HBase/hbase-2.0.2/lib
[root@master lib]# rm slf4j-log4j12-1.7.25.jar
rm: remove regular file "slf4j-log4j12-1.7.25.jar"? yes
[root@master lib]# cd /opt/HBase/hbase-2.0.2/bin
[root@master bin]# start-hbase.sh
running master, logging to /opt/HBase/hbase-2.0.2/logs/hbase-root-master-master.out
slave3: running regionserver, logging to /opt/HBase/hbase-2.0.2/logs/hbase-root-regionserver-slave3.out
slave4: running regionserver, logging to /opt/HBase/hbase-2.0.2/logs/hbase-root-regionserver-slave4.out
[root@master bin]# jps
9027 HMaster
9141 Jps
3544 ResourceManager
3944 QuorumPeerMain
3849 Master
3387 SecondaryNameNode
3196 NameNode
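To view status information, open port 16010 (master web UI) or 16030 (RegionServer web UI) on the node's IP in a browser.
You can also check the status from the command line.
# change directory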
[root@master bin]# cd /opt/HBase/hbase-2.0.2/bin
# open the HBase shell
[root@master bin]# ./hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.2, r1cfab033e779df840d5612a85277f42a6a4e8172, Tue Aug 28 20:50:40 PDT 2018
Took 0.0268 seconds
# check the status
hbase(main):001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
Took 1.2047 seconds
# exit
hbase(main):002:0> exit
bug
After opening the web UI, the message The RegionServer is initializing! is shown.
solution
Check whether HDFS is in safe mode
[root@master bin]# hadoop dfsadmin -safemode get
Leave HDFS safe mode
[root@master bin]# hadoop dfsadmin -safemode leave
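Safe mode normally ends on its own once enough DataNodes have reported their blocks, so waiting for it can be an alternative to forcing it off. The sketch below assumes two hypothetical helpers: wait_until, a generic polling loop, and safe_mode_off, which checks the output of hdfs dfsadmin -safemode get for the OFF state.

```shell
# wait_until ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds, at most
# ATTEMPTS times, sleeping DELAY seconds in between. (Hypothetical helper.)
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# safe_mode_off: succeeds when the NameNode reports "Safe mode is OFF".
# (Hypothetical helper; needs the hdfs command on the PATH.)
safe_mode_off() {
  hdfs dfsadmin -safemode get | grep -q 'Safe mode is OFF'
}

# Usage on the master node: poll every 10 seconds, up to 30 times,
# and start HBase only once HDFS has left safe mode.
#   wait_until 30 10 safe_mode_off && start-hbase.sh
```

hdfs dfsadmin -safemode wait offers a built-in blocking variant, without the attempt limit used here.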
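3.8 Hive installation and environment configuration
Hive only needs to be installed on the master node; it does not have to be deployed on every machine.
3.8.1 Installing MySQL
This setup keeps the Hive metastore in MySQL rather than in Hive's default embedded Derby database, so install MySQL first.
# check whether MySQL is already installed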
[root@master ~]# rpm -qa | grep -i mysql
mysql-community-libs-5.6.42-2.el7.x86_64
mysql-community-server-5.6.42-2.el7.x86_64
mysql-community-release-el7-5.noarch
mysql-community-client-5.6.42-2.el7.x86_64
mysql-community-common-5.6.42-2.el7.x86_64
# uninstall if necessary
[root@master ~]# rpm -e --nodeps mysql-community-libs-5.6.42-2.el7.x86_64
[root@master ~]# rpm -e --nodeps mysql-community-server-5.6.42-2.el7.x86_64
[root@master ~]# rpm -e --nodeps mysql-community-release-el7-5.noarch
[root@master ~]# rpm -e --nodeps mysql-community-client-5.6.42-2.el7.x86_64
[root@master ~]# rpm -e --nodeps mysql-community-common-5.6.42-2.el7.x86_64
# clean up the leftover directories; the uninstall is only complete after this step
[root@master ~]# find / -name mysql
/etc/selinux/targeted/active/modules/100/mysql
/etc/selinux/targeted/tmp/modules/100/mysql
/var/lib/mysql
/var/lib/mysql/mysql
/usr/lib64/mysql
/usr/share/mysql
/opt/hive/apache-hive-2.3.4-bin/scripts/metastore/upgrade/mysql
[root@master ~]# rm -rf /usr/share/mysql
[root@master ~]# rm -rf /var/lib/mysql
[root@master ~]# rm -rf /var/lib/mysql/mysql
# install MySQL
[root@master ~]# rpm -qa | grep mysql
[root@master ~]# wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
--2018-12-06 16:46:05-- http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
Resolving repo.mysql.com (repo.mysql.com)... 23.220.145.218
Connecting to repo.mysql.com (repo.mysql.com)|23.220.145.218|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6140 (6.0K) [application/x-redhat-package-manager]
Saving to: “mysql-community-release-el7-5.noarch.rpm”
100%[===============================================================================================================================================================================================>] 6,140 --.-K/s 用时 0.003s
2018-12-06 16:46:06 (1.81 MB/s) - “mysql-community-release-el7-5.noarch.rpm” saved [6140/6140]
[root@master ~]# sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm
Preparing... ################################# [100%]
Updating / installing...
1:mysql-community-release-el7-5 ################################# [100%]
[root@master ~]# sudo yum install mysql-server
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* extras: mirrors.163.com
* updates: mirrors.aliyun.com
base | 3.6 kB 00:00:00
extras | 3.4 kB 00:00:00
mysql-connectors-community | 2.5 kB 00:00:00
mysql-tools-community | 2.5 kB 00:00:00
mysql56-community | 2.5 kB 00:00:00
updates | 3.4 kB 00:00:00
(1/3): mysql-connectors-community/x86_64/primary_db | 29 kB 00:00:00
(2/3): mysql-tools-community/x86_64/primary_db | 48 kB 00:00:00
(3/3): mysql56-community/x86_64/primary_db | 209 kB 00:00:01
Resolving Dependencies
--> Running transaction check
---> Package mysql-community-server.x86_64 0:5.6.42-2.el7 will be installed
--> Processing Dependency: mysql-community-common(x86-64) = 5.6.42-2.el7 for package: mysql-community-server-5.6.42-2.el7.x86_64
--> Processing Dependency: mysql-community-client(x86-64) >= 5.6.10 for package: mysql-community-server-5.6.42-2.el7.x86_64
--> Processing Dependency: perl(DBI) for package: mysql-community-server-5.6.42-2.el7.x86_64
--> Running transaction check
---> Package mysql-community-client.x86_64 0:5.6.42-2.el7 will be installed
--> Processing Dependency: mysql-community-libs(x86-64) >= 5.6.10 for package: mysql-community-client-5.6.42-2.el7.x86_64
---> Package mysql-community-common.x86_64 0:5.6.42-2.el7 will be installed
---> Package perl-DBI.x86_64 0:1.627-4.el7 will be installed
--> Processing Dependency: perl(RPC::PlServer) >= 0.2001 for package: perl-DBI-1.627-4.el7.x86_64
--> Processing Dependency: perl(RPC::PlClient) >= 0.2000 for package: perl-DBI-1.627-4.el7.x86_64
--> Running transaction check
---> Package mariadb-libs.x86_64 1:5.5.56-2.el7 will be obsoleted
---> Package mysql-community-libs.x86_64 0:5.6.42-2.el7 will be obsoleting
---> Package perl-PlRPC.noarch 0:0.2020-14.el7 will be installed
--> Processing Dependency: perl(Net::Daemon) >= 0.13 for package: perl-PlRPC-0.2020-14.el7.noarch
--> Processing Dependency: perl(Net::Daemon::Test) for package: perl-PlRPC-0.2020-14.el7.noarch
--> Processing Dependency: perl(Net::Daemon::Log) for package: perl-PlRPC-0.2020-14.el7.noarch
--> Processing Dependency: perl(Compress::Zlib) for package: perl-PlRPC-0.2020-14.el7.noarch
--> Running transaction check
---> Package perl-IO-Compress.noarch 0:2.061-2.el7 will be installed
--> Processing Dependency: perl(Compress::Raw::Zlib) >= 2.061 for package: perl-IO-Compress-2.061-2.el7.noarch
--> Processing Dependency: perl(Compress::Raw::Bzip2) >= 2.061 for package: perl-IO-Compress-2.061-2.el7.noarch
---> Package perl-Net-Daemon.noarch 0:0.48-5.el7 will be installed
--> Running transaction check
---> Package perl-Compress-Raw-Bzip2.x86_64 0:2.061-3.el7 will be installed
---> Package perl-Compress-Raw-Zlib.x86_64 1:2.061-4.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=========================================================================================================================================================================================================================================
Package 架构 版本 源 大小
=========================================================================================================================================================================================================================================
正在安装:
mysql-community-libs x86_64 5.6.42-2.el7 mysql56-community 2.0 M
替换 mariadb-libs.x86_64 1:5.5.56-2.el7
mysql-community-server x86_64 5.6.42-2.el7 mysql56-community 59 M
为依赖而安装:
mysql-community-client x86_64 5.6.42-2.el7 mysql56-community 20 M
mysql-community-common x86_64 5.6.42-2.el7 mysql56-community 257 k
perl-Compress-Raw-Bzip2 x86_64 2.061-3.el7 base 32 k
perl-Compress-Raw-Zlib x86_64 1:2.061-4.el7 base 57 k
perl-DBI x86_64 1.627-4.el7 base 802 k
perl-IO-Compress noarch 2.061-2.el7 base 260 k
perl-Net-Daemon noarch 0.48-5.el7 base 51 k
perl-PlRPC noarch 0.2020-14.el7 base 36 k
事务概要
=========================================================================================================================================================================================================================================
安装 2 软件包 (+8 依赖软件包)
总下载量:82 M
Is this ok [y/d/N]: y
Downloading packages:
警告:/var/cache/yum/x86_64/7/mysql56-community/packages/mysql-community-common-5.6.42-2.el7.x86_64.rpm: 头V3 DSA/SHA1 Signature, 密钥 ID 5072e1f5: NOKEY
mysql-community-common-5.6.42-2.el7.x86_64.rpm 的公钥尚未安装
(1/10): mysql-community-common-5.6.42-2.el7.x86_64.rpm | 257 kB 00:00:01
(2/10): mysql-community-client-5.6.42-2.el7.x86_64.rpm | 20 MB 00:00:07
警告:/var/cache/yum/x86_64/7/base/packages/perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64.rpm: 头V3 RSA/SHA256 Signature, 密钥 ID f4a80eb5: NOKEY
perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64.rpm 的公钥尚未安装
(3/10): perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64.rpm | 32 kB 00:00:00
(4/10): perl-Compress-Raw-Zlib-2.061-4.el7.x86_64.rpm | 57 kB 00:00:00
(5/10): perl-IO-Compress-2.061-2.el7.noarch.rpm | 260 kB 00:00:00
(6/10): perl-Net-Daemon-0.48-5.el7.noarch.rpm | 51 kB 00:00:00
(7/10): perl-PlRPC-0.2020-14.el7.noarch.rpm | 36 kB 00:00:00
(8/10): perl-DBI-1.627-4.el7.x86_64.rpm | 802 kB 00:00:00
(9/10): mysql-community-libs-5.6.42-2.el7.x86_64.rpm | 2.0 MB 00:00:24
(10/10): mysql-community-server-5.6.42-2.el7.x86_64.rpm | 59 MB 00:02:21
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
总计 567 kB/s | 82 MB 00:02:28
从 file:/etc/pki/rpm-gpg/RPM-GPG-KEY-mysql 检索密钥
导入 GPG key 0x5072E1F5:
用户ID : "MySQL Release Engineering <mysql-build@oss.oracle.com>"
指纹 : a4a9 4068 76fc bd3c 4567 70c8 8c71 8d3b 5072 e1f5
软件包 : mysql-community-release-el7-5.noarch (installed)
来自 : file:/etc/pki/rpm-gpg/RPM-GPG-KEY-mysql
是否继续?[y/N]:y
从 file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 检索密钥
导入 GPG key 0xF4A80EB5:
用户ID : "CentOS-7 Key (CentOS 7 Official Signing Key) <security@centos.org>"
指纹 : 6341 ab27 53d7 8a78 a7c2 7bb1 24c6 a8a7 f4a8 0eb5
软件包 : centos-release-7-5.1804.el7.centos.x86_64 (@anaconda)
来自 : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
是否继续?[y/N]:y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
警告:RPM 数据库已被非 yum 程序修改。
** 发现 2 个已存在的 RPM 数据库问题, 'yum check' 输出如下:
icedtea-web-1.7.1-1.el7.x86_64 有缺少的需求 java-1.8.0-openjdk
jline-1.0-8.el7.noarch 有缺少的需求 java >= ('0', '1.5', None)
正在安装 : mysql-community-common-5.6.42-2.el7.x86_64 1/11
正在安装 : mysql-community-libs-5.6.42-2.el7.x86_64 2/11
正在安装 : mysql-community-client-5.6.42-2.el7.x86_64 3/11
正在安装 : perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64 4/11
正在安装 : 1:perl-Compress-Raw-Zlib-2.061-4.el7.x86_64 5/11
正在安装 : perl-IO-Compress-2.061-2.el7.noarch 6/11
正在安装 : perl-Net-Daemon-0.48-5.el7.noarch 7/11
正在安装 : perl-PlRPC-0.2020-14.el7.noarch 8/11
正在安装 : perl-DBI-1.627-4.el7.x86_64 9/11
正在安装 : mysql-community-server-5.6.42-2.el7.x86_64 10/11
正在删除 : 1:mariadb-libs-5.5.56-2.el7.x86_64 11/11
warning: file /usr/lib64/mysql/plugin/mysql_clear_password.so: remove failed: No such file or directory
warning: file /usr/lib64/mysql/plugin/dialog.so: remove failed: No such file or directory
warning: file /usr/lib64/mysql/libmysqlclient.so.18.0.0: remove failed: No such file or directory
验证中 : mysql-community-libs-5.6.42-2.el7.x86_64 1/11
验证中 : mysql-community-common-5.6.42-2.el7.x86_64 2/11
验证中 : perl-Net-Daemon-0.48-5.el7.noarch 3/11
验证中 : mysql-community-server-5.6.42-2.el7.x86_64 4/11
验证中 : mysql-community-client-5.6.42-2.el7.x86_64 5/11
验证中 : perl-IO-Compress-2.061-2.el7.noarch 6/11
验证中 : 1:perl-Compress-Raw-Zlib-2.061-4.el7.x86_64 7/11
验证中 : perl-DBI-1.627-4.el7.x86_64 8/11
验证中 : perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64 9/11
验证中 : perl-PlRPC-0.2020-14.el7.noarch 10/11
验证中 : 1:mariadb-libs-5.5.56-2.el7.x86_64 11/11
已安装:
mysql-community-libs.x86_64 0:5.6.42-2.el7 mysql-community-server.x86_64 0:5.6.42-2.el7
作为依赖被安装:
mysql-community-client.x86_64 0:5.6.42-2.el7 mysql-community-common.x86_64 0:5.6.42-2.el7 perl-Compress-Raw-Bzip2.x86_64 0:2.061-3.el7 perl-Compress-Raw-Zlib.x86_64 1:2.061-4.el7 perl-DBI.x86_64 0:1.627-4.el7
perl-IO-Compress.noarch 0:2.061-2.el7 perl-Net-Daemon.noarch 0:0.48-5.el7 perl-PlRPC.noarch 0:0.2020-14.el7
替代:
mariadb-libs.x86_64 1:5.5.56-2.el7
完毕!
Start the MySQL service and check its status.
# Start the service
[root@master ~]# service mysqld start
Redirecting to /bin/systemctl start mysqld.service
# Check the status of the MySQL service
[root@master ~]# service mysqld status
Redirecting to /bin/systemctl status mysqld.service
● mysqld.service - MySQL Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
Active: active (running) since 五 2018-11-16 20:01:52 CST; 44s ago
Docs: man:mysqld(8)
http://dev.mysql.com/doc/refman/en/using-systemd.html
Process: 20743 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
Main PID: 20766 (mysqld)
Status: "SERVER_OPERATING"
Tasks: 38
CGroup: /system.slice/mysqld.service
└─20766 /usr/sbin/mysqld
11月 16 20:01:45 master systemd[1]: Starting MySQL Server...
11月 16 20:01:52 master systemd[1]: Started MySQL Server.
# Set the root password (the initial root password is empty)
[root@master ~]# mysqladmin -u root -p password '123456'
# Press Enter at the prompt, because no password has been set yet
Enter password:
Warning: Using a password on the command line interface can be insecure.
# Log in to MySQL
[root@master ~]# mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.6.42 MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
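Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Grant root remote-access privileges from any host
mysql> grant all privileges on *.* to 'root'@'%' identified by '123456';
# Reload the grant tables
mysql> flush privileges;
# Quit
mysql> exit
Verify the connection from an external client.
When the connection icon turns green, the connection has been established.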
3.8.2 Hive installation
Upload the Hive package to /opt/hive and extract it in that directory.
[root@master ~]# mkdir /opt/hive
[root@master ~]# cd /opt/hive
[root@master hive]# rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring apache-hive-2.3.4-bin...
100% 272180 KB 10820 KB/sec 00:00:25 0 Errors
[root@master hive]# tar -xvf apache-hive-2.3.4-bin.tar.gz
3.8.3 Hive environment configuration
Set the environment variables.
[root@master ~]# vi /etc/profile
export HIVE_HOME=/opt/hive/apache-hive-2.3.4-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export PATH=$PATH:${JAVA_HOME}/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:${SCALA_HOME}/bin:${SPARK_HOME}/bin:${HIVE_HOME}/bin:${ZK_HOME}/bin:${HBASE_HOME}/bin
[root@master ~]# source /etc/profile
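Before moving on, it can help to confirm that the new variables took effect. The following is a minimal sketch, not part of the original procedure; check_hive_env is a hypothetical helper name, and it only checks that HIVE_HOME is set, contains a bin directory, and that this directory is on PATH.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the Hive environment variables after `source /etc/profile`.
# check_hive_env is an illustrative helper, not a Hive or Hadoop command.

check_hive_env() {
    local home="${HIVE_HOME:-}"
    if [ -z "$home" ]; then
        echo "HIVE_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$home/bin" ]; then
        echo "no bin directory under $home" >&2
        return 1
    fi
    # Wrap PATH in colons so the first and last entries match as well.
    case ":$PATH:" in
        *":$home/bin:"*) echo "OK: $home/bin is on PATH" ;;
        *) echo "$home/bin is missing from PATH" >&2; return 1 ;;
    esac
}

check_hive_env || echo "check /etc/profile and run 'source /etc/profile' again"
```

If the check fails, the usual cause is a typo in /etc/profile or a shell session that was opened before the file was edited.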
Create the required directories.
# Create the local directories under /root
[root@master ~]# mkdir /root/hive
[root@master ~]# mkdir /root/hive/warehouse
# Create the same directories in HDFS
[root@master ~]# $HADOOP_HOME/bin/hadoop fs -mkdir -p /root/hive/
[root@master ~]# $HADOOP_HOME/bin/hadoop fs -mkdir -p /root/hive/warehouse
# Grant read and write permissions on the new HDFS directories
[root@master ~]# $HADOOP_HOME/bin/hadoop fs -chmod 777 /root/hive/
[root@master ~]# $HADOOP_HOME/bin/hadoop fs -chmod 777 /root/hive/warehouse
# Verify that the directories were created
[root@master ~]# $HADOOP_HOME/bin/hadoop fs -ls /root/
[root@master ~]# $HADOOP_HOME/bin/hadoop fs -ls /root/hive/
Edit hive-site.xml.
# Create hive-site.xml
[root@master ~]# cd /opt/hive/apache-hive-2.3.4-bin/conf
[root@master conf]# cp hive-default.xml.template hive-site.xml
[root@master conf]# vi hive-site.xml
Add the following content:
<!-- Hive warehouse location in HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/root/hive/warehouse</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/root/hive</value>
</property>
<!-- An empty value means embedded or local mode; otherwise remote mode is used -->
<property>
<name>hive.metastore.uris</name>
<value></value>
</property>
<!-- MySQL connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- User name -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- Password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
</description>
</property>
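Replace every occurrence of ${system:java.io.tmpdir} in the configuration file with /opt/hive/tmp (create this directory if it does not exist and grant it read and write permissions), and replace every ${system:user.name} with root.
<!-- The file is long, so search patiently or use a text-replacement command; only part of the result is shown here -->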
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/hive/tmp/root</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/hive/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
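The text-replacement command mentioned above could be sed. The sketch below is one possible approach rather than the author's exact method: replace_hive_placeholders is a hypothetical helper name, and the demo runs on a throwaway file instead of the real hive-site.xml so that it can be tried safely first.

```shell
#!/usr/bin/env bash
# Sketch: replace the ${system:...} placeholders with fixed values using sed.
# replace_hive_placeholders is an illustrative helper, not part of Hive.
set -euo pipefail

replace_hive_placeholders() {
    local file="$1" tmpdir="$2" user="$3"
    # -i.bak edits in place and keeps the original as <file>.bak.
    # The patterns are single-quoted so the shell does not expand ${...}.
    sed -i.bak \
        -e 's|\${system:java\.io\.tmpdir}|'"$tmpdir"'|g' \
        -e 's|\${system:user\.name}|'"$user"'|g' \
        "$file"
}

# Demo on a temporary file that mimics two values from the template.
demo_file="$(mktemp)"
cat > "$demo_file" <<'EOF'
<value>${system:java.io.tmpdir}/${system:user.name}</value>
<value>${system:java.io.tmpdir}/${hive.session.id}_resources</value>
EOF

replace_hive_placeholders "$demo_file" /opt/hive/tmp root
cat "$demo_file"
```

On the demo file this yields the same two values shown in the excerpt above. To apply it for real, pass the path of hive-site.xml as the first argument; the .bak copy allows a rollback if the result looks wrong.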
Edit hive-env.sh.
[root@master ~]# cd /opt/hive/apache-hive-2.3.4-bin/conf
[root@master conf]# cp hive-env.sh.template hive-env.sh
[root@master conf]# vi hive-env.sh
# Add the following lines
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.4-bin/conf
export HIVE_AUX_JARS_PATH=/opt/hive/apache-hive-2.3.4-bin/lib
Add the database driver package.
First download the matching JDBC driver package from the MySQL website.
[root@master ~]# cd /opt/hive/apache-hive-2.3.4-bin/lib
# Upload the MySQL JDBC driver jar into Hive's lib directory
[root@master lib]# rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring mysql-connector-java-5.1.47-bin.jar...
100% 983 KB 983 KB/sec 00:00:01 0 Errors
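A quick check that the driver jar actually landed in the lib directory can save a confusing failure later. This is a minimal sketch under the directory layout used in this guide; check_jdbc_driver is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a MySQL JDBC driver jar exists in a given lib directory.
# check_jdbc_driver is an illustrative helper, not a Hive command.

check_jdbc_driver() {
    local lib_dir="$1"
    local jar
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # If the glob matches nothing it stays literal, so test for a real file.
        if [ -f "$jar" ]; then
            echo "found: $jar"
            return 0
        fi
    done
    echo "no mysql-connector-java jar in $lib_dir" >&2
    return 1
}

# Path taken from this guide; adjust it if Hive is installed elsewhere.
check_jdbc_driver /opt/hive/apache-hive-2.3.4-bin/lib || true
```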
3.8.4 Starting Hive
# Change to the Hive bin directory
[root@master lib]# cd /opt/hive/apache-hive-2.3.4-bin/bin
[root@master bin]# pwd
/opt/hive/apache-hive-2.3.4-bin/bin
# Initialize the Hive metastore schema
[root@master bin]# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Access denied for user 'root'@'master' (using password: YES)
SQL Error code: 1045
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
bug01
A jar conflict is reported.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-
solution
[root@master bin]# rm /opt/hive/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar
rm:是否删除普通文件 "/opt/hive/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar"?y
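To see which SLF4J binding jars are present before removing one, a small find wrapper is enough. The sketch below assumes the paths used in this guide; list_slf4j_bindings is a hypothetical helper name, and its name patterns only cover the two bindings named in the message above.

```shell
#!/usr/bin/env bash
# Sketch: list SLF4J binding jars under one or more directories.
# list_slf4j_bindings is an illustrative helper, not part of Hive or Hadoop.

list_slf4j_bindings() {
    # Errors from missing directories are hidden so the function still
    # produces a (possibly empty) list.
    find "$@" -type f \
        \( -name 'log4j-slf4j-impl-*.jar' -o -name 'slf4j-log4j12-*.jar' \) \
        2>/dev/null | sort
}

# Paths taken from this guide; adjust them if your layout differs.
list_slf4j_bindings \
    /opt/hive/apache-hive-2.3.4-bin/lib \
    /opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib
```

A more cautious variant of the fix is to move the duplicate jar to a backup directory with mv instead of deleting it, so it can be restored if Hive logging stops working.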
bug02
Insufficient user privileges.
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Access denied for user 'root'@'master' (using password: YES)
SQL Error code: 1045
solution
[root@master bin]# mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.6.42 MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select host,user from user;
+-----------+------+
| host | user |
+-----------+------+
| % | root |
| 127.0.0.1 | root |
| ::1 | root |
| localhost | |
| master | |
| master | root |
+-----------+------+
6 rows in set (0.00 sec)
mysql> show grants for root;
ERROR 1290 (HY000): The MySQL server is running with the --skip-grant-tables option so it cannot execute this statement
mysql> flush privileges;
Query OK, 0 rows affected (0.06 sec)
mysql> show grants for root;
+--------------------------------------------------------------------------------------------------------------------------------+
| Grants for root@% |
+--------------------------------------------------------------------------------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY PASSWORD '*6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9' WITH GRANT OPTION |
+--------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;
Query OK, 0 rows affected (0.01 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)