Hadoop data warehouse tool: Hive
Posted by vettel0329
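1. Install MySQL
a. Download MySQL 8.0 (mysql-8.0.16-winx64.zip) from the official site and unzip it: https://dev.mysql.com/downloads/mysql/
b. In the MySQL root directory, create a my.ini file and an empty data folder. The contents of my.ini are as follows: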
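[mysqld]
# Listen on port 3306
port=3306
# MySQL installation directory
basedir=D:\\Tools\\mysql-8.0.16-winx64
# Directory where MySQL stores its data
datadir=D:\\Tools\\mysql-8.0.16-winx64\\data
# Maximum number of concurrent connections
max_connections=200
# Number of failed connection attempts allowed per host, to guard against attacks from that host
max_connect_errors=10
# Default character set used by the server
character-set-server=utf8
# Default storage engine for newly created tables
default-storage-engine=INNODB

[mysql]
# Default character set for the mysql client
default-character-set=utf8

[client]
# Default port the client uses when connecting to the server
port=3306
default-character-set=utf8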
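Before initializing MySQL, it can help to read the my.ini above back and confirm that the required keys are present and that the client and server ports agree. The following is only a sketch using Python's standard configparser: MY_INI is a shortened copy of the file above, and check_my_ini is a made-up helper name, not part of MySQL.

```python
import configparser

# A shortened copy of the my.ini shown above (assumption: same keys and values).
MY_INI = r"""
[mysqld]
# Listen on port 3306
port=3306
basedir=D:\\Tools\\mysql-8.0.16-winx64
datadir=D:\\Tools\\mysql-8.0.16-winx64\\data
max_connections=200

[client]
port=3306
default-character-set=utf8
"""


def check_my_ini(text):
    """Return a list of problems found in the my.ini text (hypothetical helper)."""
    # interpolation=None keeps any '%' in values from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]
    problems = []
    server = parser["mysqld"]
    for key in ("port", "basedir", "datadir"):
        if key not in server:
            problems.append("missing %s in [mysqld]" % key)
    # The client should connect to the port the server listens on.
    if parser.has_section("client"):
        if parser["client"].get("port") != server.get("port"):
            problems.append("[client] port differs from [mysqld] port")
    return problems


if __name__ == "__main__":
    print(check_my_ini(MY_INI) or "my.ini looks consistent")
```

An empty list means the checks passed; the checks themselves are a minimal selection and do not validate that the directories exist.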
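c. Add a system environment variable MYSQL_HOME with the value D:\Tools\mysql-8.0.16-winx64, and add %MYSQL_HOME%\bin to the Path variable
d. Open a cmd window as administrator and change to MySQL's bin directory
① Initialize the data directory: mysqld --initialize --user=mysql --console, and note the temporary root password it prints
② Install the Windows service: mysqld -install
③ Start the service: net start mysql
④ Log in: mysql -u root -p (enter the temporary password from ① when prompted)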
⑤ Change the password with this statement: ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
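The temporary password from step ① is printed in the middle of other log lines and is easy to miss. As a rough sketch, the snippet below pulls it out of captured console output with a regular expression. It assumes the message wording used by MySQL 8.0 ("A temporary password is generated for root@localhost: ..."); the sample line, including its timestamp and password, is made up, and find_temp_password is a hypothetical helper.

```python
import re

# Assumed wording of the line mysqld prints during --initialize.
TEMP_PASSWORD_RE = re.compile(
    r"A temporary password is generated for root@localhost: (\S+)"
)


def find_temp_password(console_output):
    """Return the temporary root password from mysqld output, or None."""
    match = TEMP_PASSWORD_RE.search(console_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up sample output; the timestamp and password are not real.
    sample = (
        "2019-06-01T08:00:00.000000Z 5 [Note] [Server] "
        "A temporary password is generated for root@localhost: Abc123!xyz\n"
    )
    print(find_temp_password(sample))
```

If the function returns None, the wording of your MySQL build probably differs and the pattern needs adjusting.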
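2. Install Hive
a. Download Hive (apache-hive-2.3.5-bin.tar.gz) from the official site and extract it: http://mirror.bit.edu.cn/apache/hive/
Note: do not pick a Hive version that is too new. With Hadoop 3.1.2, running a select statement on Hive 3.1.1 fails because of jar conflicts.
b. Add a system environment variable HIVE_HOME with the value D:\Tools\apache-hive-2.3.5-bin, and add %HIVE_HOME%\bin to the Path variable
c. Create hive-site.xml in Hive's conf directory with the following contents: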
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to you under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>D:/Tools/apache-hive-2.3.5-bin/scratch_dir</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>D:/Tools/apache-hive-2.3.5-bin/resources_dir/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>D:/Tools/apache-hive-2.3.5-bin/querylog_dir</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>D:/Tools/apache-hive-2.3.5-bin/operation_dir</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?serverTimezone=UTC&amp;createDatabaseIfNotExist=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
      Enforce metastore schema version consistency.
      True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic
            schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
            proper metastore schema migration. (Default)
      False: Warn if the version information stored in metastore doesn't match with one from Hive jars.
    </description>
  </property>
  <!-- Configure the user name and password -->
  <property>
    <name>hive.jdbc_passwd.auth.zhangweijin</name>
    <value>123456</value>
  </property>
</configuration>
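A frequent stumbling block in hive-site.xml is the JDBC URL: inside XML, a literal & between URL parameters has to be written as &amp;, otherwise the file is not well-formed. The sketch below uses Python's standard xml.etree.ElementTree to load a trimmed copy of the configuration into a dictionary and to check well-formedness. HIVE_SITE, load_properties, and is_well_formed are illustrative names, not part of Hive.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of hive-site.xml with only two metastore connection settings.
HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?serverTimezone=UTC&amp;createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Parse Hadoop-style configuration XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for name, value in load_properties(HIVE_SITE).items():
        print(name, "=", value)
    # The same file with a raw '&' in the URL is rejected by the parser.
    print(is_well_formed(HIVE_SITE.replace("&amp;", "&")))
```

The parsed value contains a plain &, which is what the JDBC driver eventually receives.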
d. In Hive's root directory, create four folders: scratch_dir, resources_dir, querylog_dir, and operation_dir
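The four folders can of course be created by hand; if you prefer to script it, a minimal sketch with pathlib follows. The function name create_hive_local_dirs is made up, and the Hive root you pass in is whatever directory you extracted Hive to (D:\Tools\apache-hive-2.3.5-bin in this article).

```python
from pathlib import Path

# Folder names from step d; they match the local paths used in hive-site.xml.
LOCAL_DIRS = ("scratch_dir", "resources_dir", "querylog_dir", "operation_dir")


def create_hive_local_dirs(hive_home):
    """Create the four local working folders under hive_home and return their paths."""
    created = []
    for name in LOCAL_DIRS:
        path = Path(hive_home) / name
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For example, create_hive_local_dirs(r"D:\Tools\apache-hive-2.3.5-bin") would create the folders under the install location used here.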
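e. Add Windows support: download apache-hive-1.2.2-src.tar.gz from the official site, extract it, and copy the .cmd files from its bin directory and subdirectories into the matching bin directory and subdirectories of your Hive installation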
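Copying the .cmd files by hand is error-prone because they are spread over several subdirectories. The following sketch walks the source bin tree and copies every .cmd file to the same relative location in the target bin tree. copy_cmd_files is a hypothetical helper, and both paths are arguments because the location of the extracted source archive is not given in this article.

```python
import shutil
from pathlib import Path


def copy_cmd_files(src_bin, dst_bin):
    """Copy every *.cmd file under src_bin to the same relative path under dst_bin."""
    src_bin, dst_bin = Path(src_bin), Path(dst_bin)
    copied = []
    # rglob also matches files directly inside src_bin, not only in subfolders.
    for src in sorted(src_bin.rglob("*.cmd")):
        dst = dst_bin / src.relative_to(src_bin)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Files without the .cmd extension, such as the Unix shell scripts in the same folders, are left alone.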
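f. Download the mysql-connector-java-8.0.16.jar driver and put it in Hive's lib directory
g. Open a new cmd window, change to Hive's bin directory, and initialize the metastore schema: hive --service schematool -dbType mysql -initSchema
h. Start the metastore from a cmd window: hive --service metastore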
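Once both services are started, a quick way to confirm they are reachable is to try a TCP connection to each port. This is a sketch rather than a full health check: port 3306 comes from the my.ini above, while 9083 is Hive's default metastore port and is an assumption here, since hive.metastore.port is not set in the configuration shown. is_port_open is an illustrative helper name.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: MySQL on 3306 (my.ini) and the metastore on its default 9083.
    for name, port in (("MySQL", 3306), ("Hive metastore", 9083)):
        state = "listening" if is_port_open("127.0.0.1", port) else "not reachable"
        print("%s on port %d: %s" % (name, port, state))
```

A successful connection only shows that something is listening on the port, not that the schema was initialized correctly.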
Reference: https://www.cnblogs.com/tangyb/p/8971658.html