Setting Up Hive on CentOS

Posted 漫步科技人生


Setting up Hive on CentOS (MySQL, Hadoop, JDK)

[20210128 longtao.wu]

MySQL


Complete script

Add the MySQL 8.0 community repository, install the server, and start it:

    wget https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
    rpm -ivh mysql80-community-release-el7-3.noarch.rpm
    yum install mysql-community-server -y
    systemctl enable mysqld
    systemctl start mysqld

Look up the temporary root password generated on first start:

    grep 'temporary password' /var/log/mysqld.log

Log in with that password, relax the password policy, and set a new root password (run the statements at the mysql> prompt):

    mysql -u root -p
    mysql> set global validate_password.policy=0;
    mysql> set global validate_password.length=1;
    mysql> alter user 'root'@'localhost' identified by 'password';

Open port 3306 in the firewall:

    firewall-cmd --zone=public --add-port=3306/tcp --permanent
    firewall-cmd --reload
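The temporary-password lookup above can be scripted. The following is a minimal sketch, assuming the matching log line ends with the password as its last whitespace-separated field (the usual `A temporary password is generated for root@localhost: ...` form). The function name `extract_temp_password` is made up for illustration and is not part of MySQL.

```shell
#!/bin/sh
# Hypothetical helper: print the temporary root password recorded in a mysqld log.
# Assumption: the password is the last whitespace-separated field of the line.
extract_temp_password() {
    log_file="${1:-/var/log/mysqld.log}"
    # If the server was initialised more than once, keep only the latest match.
    awk '/temporary password/ { pw = $NF } END { if (pw != "") print pw }' "$log_file"
}
```

Calling `extract_temp_password` with no argument reads /var/log/mysqld.log; the output can then be pasted at the `mysql -u root -p` prompt.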

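Hadoop installation

The core of Hadoop consists of three parts:

HDFS: Hadoop Distributed File System, the distributed file system. HDFS is further divided into the NameNode, SecondaryNameNode, and DataNode.

YARN: Yet Another Resource Negotiator, the resource management and scheduling system.

MapReduce: the distributed computation framework.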

1. Install JDK 7 or 8 (JDK 11 is not supported)

Download and unpack:

    wget -O jdk-8u131-linux-x64.tar.gz --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz
    tar zxvf jdk-8u131-linux-x64.tar.gz -C /usr/local/

Configure the environment by appending the three export lines to the end of /etc/profile:

    vi /etc/profile
    export JAVA_HOME=/usr/local/jdk1.8.0_131
    export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/
    export PATH=$PATH:$JAVA_HOME/bin

Reload the configuration:

    source /etc/profile

Verify:

    java -version
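Since only JDK 7 and 8 are supported here, it can help to check the major version before going further. This is a small sketch; the function names `java_major_version` and `installed_java_version` are hypothetical, and the parsing assumes the common `1.8.0_131` (legacy) and `11.0.2` (Java 9 and later) version formats.

```shell
#!/bin/sh
# Hypothetical helper: turn a Java version string into its major version number.
# "1.8.0_131" -> 8 (legacy 1.x scheme), "11.0.2" -> 11 (scheme used since Java 9).
java_major_version() {
    version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;   # drop the legacy "1." prefix
    esac
    echo "${version%%[._-]*}"             # keep the text before the first '.', '_' or '-'
}

# Hypothetical helper: read the quoted version string from `java -version`,
# which prints to stderr rather than stdout.
installed_java_version() {
    java -version 2>&1 | awk -F '"' '/version/ { print $2; exit }'
}
```

With a JDK on the PATH, `java_major_version "$(installed_java_version)"` should print 7 or 8 for a setup that matches this guide.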

 
   
   
 
Download and unpack Hadoop:

    # wget https://downloads.apache.org/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
    wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
    tar zxvf hadoop-2.10.1.tar.gz -C /usr/local/

Point Hadoop at the JDK by setting JAVA_HOME in hadoop-env.sh:

    vi /usr/local/hadoop-2.10.1/etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr/local/jdk1.8.0_131/

Set up pseudo-distributed mode (Pseudo-Distributed Operation). Edit etc/hadoop/core-site.xml and add the following configuration (fs.defaultFS: the name of the default file system):

    vi /usr/local/hadoop-2.10.1/etc/hadoop/core-site.xml

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

Edit etc/hadoop/hdfs-site.xml and add the following configuration (dfs.replication: the number of file replicas):

    vi /usr/local/hadoop-2.10.1/etc/hadoop/hdfs-site.xml

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

Format the file system:

    /usr/local/hadoop-2.10.1/bin/hdfs namenode -format

Start or stop HDFS:

    /usr/local/hadoop-2.10.1/sbin/start-dfs.sh   # start the NameNode and DataNode processes
    /usr/local/hadoop-2.10.1/sbin/stop-dfs.sh    # stop the NameNode and DataNode processes

Open the HDFS web UI to verify that it started:

    http://172.20.19.33:50070

Start or stop YARN:

    /usr/local/hadoop-2.10.1/sbin/start-yarn.sh
    /usr/local/hadoop-2.10.1/sbin/stop-yarn.sh

Open the YARN web UI:

    http://172.20.19.33:8088/

Configure the environment by appending the two export lines to the end of /etc/profile, then reload it:

    vi /etc/profile
    export HADOOP_HOME=/usr/local/hadoop-2.10.1
    export PATH=$PATH:$HADOOP_HOME/bin
    source /etc/profile
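Besides the web UIs, the JDK's `jps` tool lists running Java processes, which gives a quick command-line check that HDFS and YARN came up. The sketch below is one way to do that; `missing_daemons` is a hypothetical name, and the list of expected daemons assumes the pseudo-distributed setup described above with both HDFS and YARN started.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon that is not listed. Prints nothing when all five are running.
missing_daemons() {
    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <main class>"; compare the second field exactly so
        # that "SecondaryNameNode" is not mistaken for "NameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Typical use is `jps | missing_daemons`; any name it prints is a daemon worth checking in the Hadoop logs.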

Hive installation

Download and unpack (the download is slow; -c resumes an interrupted transfer):

    wget -c https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
    tar zxvf apache-hive-3.1.2-bin.tar.gz -C /usr/local/

Rename the directory to drop the -bin suffix:

    mv /usr/local/apache-hive-3.1.2-bin/ /usr/local/apache-hive-3.1.2/

Configure:

    cd /usr/local/apache-hive-3.1.2/conf/
    cp hive-log4j2.properties.template hive-log4j2.properties
    vi /usr/local/apache-hive-3.1.2/conf/hive-site.xml

Contents of hive-site.xml:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://172.20.19.33:3306/hive?createDatabaseIfNotExist=true</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>password</value>
      </property>
    </configuration>

Initialize the database. Download protobuf-java-3.6.1.jar and mysql-connector-java-8.0.17.jar into the $HIVE_HOME/lib directory and delete the existing protobuf-java-2.5.0.jar file. Also make sure the Hive and MySQL servers use the same time zone.

    cd /usr/local/apache-hive-3.1.2/lib/
    wget -c https://repo1.maven.org/maven2/com/google/protobuf/protobuf-java/3.6.1/protobuf-java-3.6.1.jar
    wget -c https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.17/mysql-connector-java-8.0.17.jar
    rm -f protobuf-java-2.5.0.jar
    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime   # run on both the Hive and MySQL servers; restart the MySQL service afterwards
    cd /usr/local/apache-hive-3.1.2/bin/
    /usr/local/apache-hive-3.1.2/bin/schematool -dbType mysql -initSchema   # if this reports an error, the MySQL access permissions need to be changed
    /usr/local/apache-hive-3.1.2/bin/hive

Starting HiveServer2

Edit $HADOOP_HOME/etc/hadoop/core-site.xml and add the following properties inside the <configuration> element:

    vi /usr/local/hadoop-2.10.1/etc/hadoop/core-site.xml

    <property>
      <name>hadoop.proxyuser.root.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.root.groups</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hadoop.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hadoop.groups</name>
      <value>*</value>
    </property>

Then add the following properties inside the <configuration> element of hive-site.xml:

    vi /usr/local/apache-hive-3.1.2/conf/hive-site.xml

    <property>
      <name>hive.server2.thrift.port</name>
      <value>10000</value>
    </property>
    <property>
      <name>hive.server2.thrift.bind.host</name>
      <value>172.20.19.33</value>
    </property>

Start HiveServer2 in the background:

    /usr/local/apache-hive-3.1.2/bin/hiveserver2 &
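Once HiveServer2 is running, clients reach it through a `jdbc:hive2://` URL built from the bind host and Thrift port configured above. A minimal sketch of assembling that URL follows; the function name `hs2_jdbc_url` and the `default` database fallback are illustrative assumptions, not something the original setup defines.

```shell
#!/bin/sh
# Hypothetical helper: build the JDBC URL for a HiveServer2 instance.
hs2_jdbc_url() {
    host="$1"
    port="${2:-10000}"         # matches hive.server2.thrift.port in hive-site.xml
    database="${3:-default}"   # assumption: connect to Hive's "default" database
    echo "jdbc:hive2://${host}:${port}/${database}"
}

# Example, assuming HiveServer2 is up and the root proxy user is configured:
#   /usr/local/apache-hive-3.1.2/bin/beeline -u "$(hs2_jdbc_url 172.20.19.33)" -n root
```

The commented `beeline` line uses the Beeline client shipped in Hive's bin directory, with `-u` for the URL and `-n` for the user name.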

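Changing MySQL access permissions

Table-edit method: allow root to connect from any host by updating the mysql.user table. Log in with the root password set earlier.

    mysql -u root -p
    mysql> use mysql;
    mysql> update user set host = '%' where user = 'root';
    mysql> flush privileges;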
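Opening root to every host works but is broad. As an alternative that the original setup does not use, a dedicated metastore account can be created instead. The sketch below only prints the SQL; the function name `hive_grant_sql`, the account name `hive`, and the default password are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that creates a dedicated metastore account
# reachable from any host and grants it full rights on one database only.
# Arguments: user name, password, database name (placeholder defaults).
hive_grant_sql() {
    user="${1:-hive}"
    password="${2:-password}"
    database="${3:-hive}"
    cat <<EOF
CREATE USER IF NOT EXISTS '${user}'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON ${database}.* TO '${user}'@'%';
FLUSH PRIVILEGES;
EOF
}
```

The output can be piped into the client, for example `hive_grant_sql hive 'secret' hive | mysql -u root -p`. If this route is taken, javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword in hive-site.xml would need to match the new account.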

Note

Hive Shell, the Hive Web UI, and JDBC are three mutually exclusive ways of connecting to Hive Server: only one of them can be used at any given time.

Port reference

| Component | Daemon | Port | Configuration | Description |
|---|---|---|---|---|
| HDFS | DataNode | 50010 | dfs.datanode.address | DataNode service port, used for data transfer |
| | | 50075 | dfs.datanode.http.address | HTTP service port |
| | | 50475 | dfs.datanode.https.address | HTTPS service port |
| | | 50020 | dfs.datanode.ipc.address | IPC service port |
| | NameNode | 50070 | dfs.namenode.http-address | HTTP service port |
| | | 50470 | dfs.namenode.https-address | HTTPS service port |
| | | 8020 | fs.defaultFS | RPC port that accepts client connections, used to fetch file system metadata |
| | JournalNode | 8485 | dfs.journalnode.rpc-address | RPC service |
| | | 8480 | dfs.journalnode.http-address | HTTP service |
| | ZKFC | 8019 | dfs.ha.zkfc.port | ZooKeeper FailoverController, used for NameNode HA |
| YARN | ResourceManager | 8032 | yarn.resourcemanager.address | Port of the RM applications manager (ASM) |
| | | 8030 | yarn.resourcemanager.scheduler.address | IPC port of the scheduler component |
| | | 8031 | yarn.resourcemanager.resource-tracker.address | IPC |
| | | 8033 | yarn.resourcemanager.admin.address | IPC |
| | | 8088 | yarn.resourcemanager.webapp.address | HTTP service port |
| | NodeManager | 8040 | yarn.nodemanager.localizer.address | Localizer IPC |
| | | 8042 | yarn.nodemanager.webapp.address | HTTP service port |
| | | 8041 | yarn.nodemanager.address | Port of the container manager in the NM |
| | JobHistory Server | 10020 | mapreduce.jobhistory.address | IPC |
| | | 19888 | mapreduce.jobhistory.webapp.address | HTTP service port |
| HBase | Master | 60000 | hbase.master.port | IPC |
| | | 60010 | hbase.master.info.port | HTTP service port |
| | RegionServer | 60020 | hbase.regionserver.port | IPC |
| | | 60030 | hbase.regionserver.info.port | HTTP service port |
| | HQuorumPeer | 2181 | hbase.zookeeper.property.clientPort | HBase-managed ZK mode; not enabled when a standalone ZooKeeper cluster is used |
| | | 2888 | hbase.zookeeper.peerport | HBase-managed ZK mode; not enabled when a standalone ZooKeeper cluster is used |
| | | 3888 | hbase.zookeeper.leaderport | HBase-managed ZK mode; not enabled when a standalone ZooKeeper cluster is used |
| Hive | Metastore | 9083 | export PORT= in /etc/default/hive-metastore | Overrides the default port |
| | HiveServer | 10000 | export HIVE_SERVER2_THRIFT_PORT= in /etc/hive/conf/hive-env.sh | Overrides the default port |
| ZooKeeper | Server | 2181 | clientPort= in /etc/zookeeper/conf/zoo.cfg | Port that serves clients |
| | | 2888 | First nnnnn in server.x=[hostname]:nnnnn[:nnnnn] in /etc/zookeeper/conf/zoo.cfg | Used by followers to connect to the leader; only the leader listens on this port |
| | | 3888 | Second nnnnn in server.x=[hostname]:nnnnn[:nnnnn] in /etc/zookeeper/conf/zoo.cfg | Used for leader election; only needed when electionAlg is 1, 2, or 3 (the default) |
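For the handful of services this guide actually starts, the ports above can be probed from the command line. The following is a rough sketch: `default_port` and `port_open` are hypothetical names, the service keys are made up, and the port values are the defaults listed in the table plus the 3306 opened for MySQL earlier. Note that this guide sets fs.defaultFS to port 9000 rather than the 8020 shown in the table.

```shell
#!/bin/sh
# Hypothetical helper: default port of the services used in this guide.
# Values are the Hadoop 2.x and Hive defaults from the port table, plus MySQL.
default_port() {
    case "$1" in
        namenode-http)        echo 50070 ;;
        resourcemanager-http) echo 8088 ;;
        hiveserver2)          echo 10000 ;;
        metastore)            echo 9083 ;;
        mysql)                echo 3306 ;;
        *) echo "unknown service: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 2 seconds. Relies on bash's /dev/tcp redirection and the coreutils
# `timeout` command, so both must be available.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
```

For example, `port_open 172.20.19.33 "$(default_port hiveserver2)" && echo reachable` reports whether HiveServer2 is accepting connections on the address used in this guide.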

www.longtao.fun

