Hive: a Data Warehouse Tool for Hadoop

Posted by vettel0329


1. Install MySQL

  a. Download MySQL 8.0 (mysql-8.0.16-winx64.zip) from the official site and extract it. Download page: https://dev.mysql.com/downloads/mysql/

  b. In the MySQL root directory, create a my.ini file and a data folder. The contents of my.ini are as follows:

[mysqld]
# Listen on port 3306
port=3306
# MySQL installation directory
basedir=D:\\Tools\\mysql-8.0.16-winx64
# Directory where MySQL stores its data
datadir=D:\\Tools\\mysql-8.0.16-winx64\\data
# Maximum number of connections
max_connections=200
# Maximum number of failed connection attempts, to guard against attacks on the database from a host
max_connect_errors=10
# Default server character set (UTF-8)
character-set-server=utf8
# Default storage engine for new tables
default-storage-engine=INNODB
[mysql]
# Default character set for the mysql client
default-character-set=utf8
[client]
# Default port the client uses when connecting to the server
port=3306
default-character-set=utf8

 

  c. Add a system environment variable MYSQL_HOME = D:\Tools\mysql-8.0.16-winx64, and append %MYSQL_HOME%\bin to the Path variable (a command-line sketch follows).
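If you prefer to set the variable from the command line, the sketch below uses Windows' built-in setx; the path is the assumed install location from above, and setx writes a user-level variable. Because setx can truncate a long Path, it is usually safer to add %MYSQL_HOME%\bin to Path through the System Properties dialog.

rem Sketch: register MYSQL_HOME for the current user (path is an assumption)
setx MYSQL_HOME "D:\Tools\mysql-8.0.16-winx64"
rem Then add %MYSQL_HOME%\bin to Path via System Properties -> Environment Variables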

  d. Open a cmd window as Administrator and change to MySQL's bin directory:

    ① Run the initialization command: mysqld --initialize --user=mysql --console, and note the temporary root password it prints

    ② Install the Windows service: mysqld --install

    ③ Start the service: net start mysql

    ④ Log in to MySQL: mysql -u root -p  (enter the temporary password from step ①)

    ⑤ Change the root password: ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';  (a quick verification sketch follows these steps)
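To confirm the new password took effect, a minimal check (assuming the 123456 password set in step ⑤) is to log back in and list the databases:

rem Sketch: verify the new root password (assumes password 123456 from step ⑤)
mysql -u root -p -e "SHOW DATABASES;"
rem After entering 123456 you should see the built-in schemas
rem (information_schema, mysql, performance_schema, sys)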


2. Install Hive

  a. Download Hive (apache-hive-2.3.5-bin.tar.gz) from the mirror and extract it. Download page: http://mirror.bit.edu.cn/apache/hive/

    Note: do not use too new a Hive release; with Hadoop 3.1.2, Hive 3.1.1 fails on SELECT statements because of conflicting jars.

  b. Add a system environment variable HIVE_HOME = D:\Tools\apache-hive-2.3.5-bin, and append %HIVE_HOME%\bin to the Path variable

  c. Create hive-site.xml in Hive's conf directory:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!--
    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.
-->

<configuration>

 <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>
    <property>
    <name>hive.exec.local.scratchdir</name>    
    <value>D:/Tools/apache-hive-2.3.5-bin/scratch_dir</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>    
    <value>D:/Tools/apache-hive-2.3.5-bin/resources_dir/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>D:/Tools/apache-hive-2.3.5-bin/querylog_dir</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>D:/Tools/apache-hive-2.3.5-bin/operation_dir</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?serverTimezone=UTC&amp;createDatabaseIfNotExist=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>
   <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
      Enforce metastore schema version consistency.
      True: Verify that version information stored in is compatible with one from Hive jars.  Also disable automatic
            schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
            proper metastore schema migration. (Default)
      False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
    </description>
  </property>
  <!-- Configure the username and password -->
  <property>
    <name>hive.jdbc_passwd.auth.zhangweijin</name>
    <value>123456</value>
  </property>

</configuration>

 

  d. Create four folders under the Hive root directory: scratch_dir, resources_dir, querylog_dir, operation_dir (see the sketch below)
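These are the local working directories referenced by hive-site.xml above; from a cmd window they can be created in one go, assuming the install path used throughout this post:

rem Sketch: create the local directories referenced in hive-site.xml (path is an assumption)
cd /d D:\Tools\apache-hive-2.3.5-bin
mkdir scratch_dir resources_dir querylog_dir operation_dir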

  e. Add Windows support: download apache-hive-1.2.2-src.tar.gz from the official site, extract it, and copy the .cmd files from its bin directory and subdirectories into the corresponding bin directory and subdirectories of the installed Hive (a copy sketch follows)
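A sketch of the copy, assuming the source archive was extracted under D:\Tools (adjust the paths to wherever you unpacked it); xcopy with /s keeps the subdirectory layout intact:

rem Sketch: copy the Windows .cmd wrappers from the Hive 1.2.2 source tree (both paths are assumptions)
xcopy /s /y D:\Tools\apache-hive-1.2.2-src\bin\*.cmd D:\Tools\apache-hive-2.3.5-bin\bin\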

  f. Download the mysql-connector-java-8.0.16.jar driver and place it in Hive's lib directory

  g. Open a new cmd window, change to Hive's bin directory, and run the schema initialization command: hive --service schematool -dbType mysql -initSchema  (a quick check follows)
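If initialization succeeds, the metastore tables now exist in the MySQL database named hive (the name comes from the ConnectionURL in hive-site.xml). A quick check, assuming the root/123456 credentials from section 1:

rem Sketch: confirm the metastore schema was created (assumes the hive database and root credentials above)
mysql -u root -p -e "USE hive; SHOW TABLES;"
rem Expect metastore tables such as DBS, TBLS and VERSION in the output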

  h. Start the metastore in a cmd window: hive --service metastore  (a smoke test follows)
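With HDFS/YARN running and the metastore started, a minimal smoke test from another cmd window is to open the Hive CLI and create, list and query a throwaway table (test_tbl is just an example name):

hive
hive> CREATE TABLE test_tbl (id INT, name STRING);
hive> SHOW TABLES;
hive> SELECT * FROM test_tbl;
hive> DROP TABLE test_tbl;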

 

Reference: https://www.cnblogs.com/tangyb/p/8971658.html

 
