Common Hive Commands

Posted 2022-10-12

1. Create the data (a text file with tab-separated fields)

vim test1_hive

2. Create a new table

CREATE TABLE t_hive (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

3. Load the data file test1_hive into the t_hive table

LOAD DATA LOCAL INPATH '/lost+found/test1_hive' OVERWRITE INTO TABLE t_hive;

4. View table data

List tables

show tables;

Match table names with a pattern (only the * and | wildcards are supported)

show tables '*i*';
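As a hedged illustration of the pattern syntax: Hive table-name patterns accept * (any run of characters) and | (alternatives), not full regular expressions. The patterns below are assumptions chosen to fit the table names used in this article.

```sql
-- Tables whose names start with t_
show tables 't_*';

-- Tables named either t_hive or t_wang
show tables 't_hive|t_wang';
```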

View the rows of a table

select * from t_hive;

View the table schema

desc t_hive;

5. Alter a table

Add a column

ALTER TABLE t_hive ADD COLUMNS (d String);

Rename the table

ALTER TABLE t_hive RENAME TO t_wang;

6. Drop a table

drop table t_wang;

7. Hive interactive mode

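quit, exit: leave the interactive shell
reset: reset the configuration to its default values
set <key>=<value>: set the value of a specific variable (a misspelled variable name does not raise an error)
set: print the Hive configuration variables overridden by the user
set -v: print all Hadoop and Hive configuration variables
add FILE[S] *, add JAR[S] *, add ARCHIVE[S] *: add one or more files, jars, or archives to the distributed cache
list FILE[S], list JAR[S], list ARCHIVE[S]: list the resources already added to the distributed cache
list FILE[S] *, list JAR[S] *, list ARCHIVE[S] *: check whether the given resources have been added to the distributed cache
delete FILE[S] *, delete JAR[S] *, delete ARCHIVE[S] *: remove the given resources from the distributed cache
! <command>: run a shell command from the Hive shell
dfs <dfs command>: run a dfs command from the Hive shell
<query string>: run a Hive query and print the result to standard output
source FILE <filepath>: run a Hive script file inside the CLI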
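A short session sketch using a few of the commands above. The property name, the directories, and the script path are illustrative assumptions, not taken from the original session.

```sql
-- Print the current value of one property, then override it for this session.
set hive.cli.print.header;
set hive.cli.print.header=true;

-- Run a shell command and a dfs command without leaving the Hive shell.
! ls /tmp;
dfs -ls /user/hive/warehouse;

-- Execute a script file from inside the CLI (hypothetical path).
source /tmp/test.sql;
```

Values changed with set apply only to the current session; reset restores the defaults.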

8. Importing data

The data was loaded earlier; now locate the loaded file in HDFS

hadoop fs -cat /user/hive/warehouse/t_wang/test1_hive

Import data from another table

-- create the target table
CREATE TABLE t_hive2 (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
INSERT OVERWRITE TABLE t_hive2 SELECT * FROM t_hive;

Copy only the table schema, without the data

CREATE TABLE t_hive3 LIKE t_hive;

9. Exporting data

Copy from HDFS to another location in HDFS

hadoop fs -cp /user/hive/warehouse/t_hive /

Check the copy

hadoop fs -ls /t_hive

hadoop fs -cat /t_hive/test1_hive

Export to the local file system through Hive

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/t_hive' SELECT * FROM t_hive;

View the exported file on the local operating system

! cat /tmp/t_hive/000000_0;

10. Hive queries with HiveQL

Basic queries: sorting, column aliases, nested subqueries
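A minimal sketch of these three features against the t_hive table (columns a, b, c) created earlier. The alias names and filter values are assumptions for illustration.

```sql
-- Sorting with a column alias: order rows by column a, largest first.
SELECT a AS key_a, b, c
FROM t_hive
ORDER BY key_a DESC
LIMIT 5;

-- Nested subquery: the inner query filters and computes, the outer query sorts.
-- Hive requires an alias (here t) for a subquery in the FROM clause.
SELECT t.a, t.total
FROM (
    SELECT a, b + c AS total
    FROM t_hive
    WHERE c > 3
) t
ORDER BY t.total;
```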

Join queries: JOIN
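A hedged sketch of two common join forms, assuming the t_hive and t_hive2 tables from the earlier steps and joining on column a (the join key is an assumption).

```sql
-- Inner join: pair rows from the two tables that share the same value of a.
SELECT t1.a, t1.b, t2.c
FROM t_hive t1
JOIN t_hive2 t2 ON t1.a = t2.a;

-- Left outer join: keep every row of t_hive, even without a match in t_hive2.
SELECT t1.a, t1.b, t2.c
FROM t_hive t1
LEFT OUTER JOIN t_hive2 t2 ON t1.a = t2.a;
```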

Aggregate queries 1: count, distinct
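For illustration, a few queries of this kind on t_hive; the choice of columns is an assumption.

```sql
-- Total number of rows in the table.
SELECT count(*) FROM t_hive;

-- Number of different values in column a.
SELECT count(DISTINCT a) FROM t_hive;

-- The distinct (a, b) combinations themselves.
SELECT DISTINCT a, b FROM t_hive;
```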

Aggregate queries 2: count, avg
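A small sketch combining the two functions on t_hive; the column choices and the filter are assumptions.

```sql
-- Row count and the average of column c over the whole table.
SELECT count(*) AS row_count, avg(c) AS avg_c
FROM t_hive;

-- The same aggregates restricted by a filter.
-- count(b) counts only rows where b is not NULL.
SELECT count(b) AS b_count, avg(b) AS avg_b
FROM t_hive
WHERE c > 3;
```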

Aggregate queries 3: GROUP BY, HAVING
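A hedged sketch of grouping and group-level filtering on t_hive; the grouping column and the threshold are assumptions.

```sql
-- One output row per distinct value of a.
SELECT a, count(*) AS cnt, sum(b) AS sum_b
FROM t_hive
GROUP BY a;

-- HAVING filters whole groups after aggregation,
-- whereas WHERE filters individual rows before it.
SELECT a, sum(b) AS sum_b
FROM t_hive
GROUP BY a
HAVING sum(b) > 10;
```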

11. Hive views

create view v_hive as select a,b from t_hive where c>30;
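Once created, the view can be queried like a table. A minimal sketch:

```sql
-- The view exposes only columns a and b of the rows where c > 30.
SELECT * FROM v_hive;

-- Inspect the columns of the view.
DESC v_hive;
```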

Drop the view

drop view v_hive;

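12. Hive partitioned tables

Partitioned tables are a basic database concept, but when the data volume is small they are often unnecessary. Hive is OLAP data warehouse software that works with very large data volumes, so partitioned tables matter a great deal in this setting.

Next we define a new table structure: t_hft

Create the data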

vim /lost+found/t_hft_1

     000001,092023,9.76
     000002,091947,8.99
     000004,092002,9.79
     000005,091514,2.2
     000001,092008,9.70
     000001,092059,9.45

vim /lost+found/t_hft_2

     000001,092023,9.76
     000002,091947,8.99
     000004,092002,9.79
     000005,091514,2.2
     000001,092008,9.70
     000001,092059,9.45

Create the table

     DROP TABLE IF EXISTS t_hft;

     CREATE TABLE t_hft(
     SecurityID STRING,
     tradeTime STRING,
     PreClosePx DOUBLE
     ) PARTITIONED BY (tradeDate INT)
     ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

Load the data

LOAD DATA LOCAL INPATH '/lost+found/t_hft_1' OVERWRITE INTO TABLE t_hft PARTITION (tradeDate=20160220);
LOAD DATA LOCAL INPATH '/lost+found/t_hft_2' OVERWRITE INTO TABLE t_hft PARTITION (tradeDate=20160221);

View the partitions

SHOW PARTITIONS t_hft;
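As a hedged follow-up, the sketch below checks a single partition and looks at how partitions are laid out in HDFS. The warehouse path assumes the default database and the default warehouse location used elsewhere in this article.

```sql
-- Show only the partition that matches a given partition spec.
SHOW PARTITIONS t_hft PARTITION (tradeDate=20160220);

-- Each partition is typically stored as its own subdirectory under the table
-- directory, named like tradedate=20160220 (path is an assumption).
dfs -ls /user/hive/warehouse/t_hft;
```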

Query the data

select * from t_hft where securityid='000001';

select * from t_hft where tradedate=20160220 and preclosepx<9;

13. Writing to partitioned tables dynamically

Create the partitioned table

Load the data

Bulk-insert into the newly created partitioned table
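The steps in this section can be sketched as follows. This is an illustration under stated assumptions, not the original session: the target table name t_hft_day is hypothetical, and the already loaded t_hft table from the previous section is assumed as the data source.

```sql
-- Create the target partitioned table (hypothetical name t_hft_day).
DROP TABLE IF EXISTS t_hft_day;
CREATE TABLE t_hft_day (
    SecurityID STRING,
    tradeTime STRING,
    PreClosePx DOUBLE
) PARTITIONED BY (tradeDate INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Enable dynamic partitioning. nonstrict mode allows every partition
-- column to be determined from the data instead of being given explicitly.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Bulk insert. The partition column must come last in the SELECT list;
-- Hive creates one partition per distinct tradeDate value it encounters.
INSERT OVERWRITE TABLE t_hft_day PARTITION (tradeDate)
SELECT SecurityID, tradeTime, PreClosePx, tradeDate
FROM t_hft;
```

Compared with one LOAD DATA statement per partition, a dynamic insert writes all partitions in a single statement, which is convenient when the partition values are not known in advance.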

14. Export table data to files

export table dept_count to '/lost+found/hive_dept_count_o';
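EXPORT writes both the table data and its metadata to the target directory, so the counterpart IMPORT statement can rebuild a table from it. A minimal sketch, where the new table name dept_count_copy is a hypothetical choice:

```sql
-- Re-create a table from the exported directory under a new name.
IMPORT TABLE dept_count_copy FROM '/lost+found/hive_dept_count_o';
```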

15. Run a query directly, without opening the interactive command line

hive -e "use test; select * from t_hive;"

16. When several queries need to be run, put them in a file and execute the file

hive -f test.sql
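A minimal sketch of what a script such as test.sql might contain. The statements are assumptions for illustration, reusing the test database and the t_hive table from the earlier steps.

```sql
-- test.sql: statements run in order, each terminated by a semicolon.
USE test;

SELECT count(*) FROM t_hive;

SELECT a, sum(b) AS sum_b
FROM t_hive
GROUP BY a;
```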

17. View files in HDFS directly from within Hive

dfs -ls /;

18. View an existing database

describe database test;
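Two related statements, shown as a hedged sketch that assumes the same test database:

```sql
-- List all databases known to the metastore.
SHOW DATABASES;

-- Like DESCRIBE DATABASE, but also prints the database properties.
DESCRIBE DATABASE EXTENDED test;
```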

19. View a table's detailed information and comments

desc formatted student;
