Hive - Create a view using two existing Hive tables
【Posted】: 2015-05-28 19:45:19
【Question】: I am a beginner in Hive. I have two Hive tables as follows:
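Table A contains the columns date, name, age.
The values of the date column in Table A range from 20150406 to 20150408.
Table B is a copy of Table A with one additional column: date, name, **dept**, age
The values of the date column in Table B range from 20150409 to 20150411.
I want to create a view from tables A and B such that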
View A =
Table A(date, name, dept as NULL, age) //for dates 20150406 to 20150408
UNION
Table B(date, name, dept, age) //for dates 20150409 to 20150411
Example:
Table A
date | name | age
20150406 | John | 21
20150407 | Jane | 23
20150408 | Mary | 20
Table B
date | name | dept | age
20150409 | Claire | CSE | 25
20150410 | Cindy | Maths | 27
20150408 | Tom | Biology | 30
View A
date | name | dept | age
20150406 | John | NULL | 21
20150407 | Jane | NULL | 23
20150408 | Mary | NULL | 20
20150409 | Claire | CSE | 25
20150410 | Cindy | Maths | 27
20150408 | Tom | Biology | 30
Is this feasible? How can it be done?
Thanks in advance!
【Answer 1】: You're almost there:
create view viewA
as
select date, name, NULL as dept, age
from tableA
where date between '20150406' and '20150408'
union all
select date, name, dept, age
from tableB
where date between '20150409' and '20150411'
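On newer Hive releases the statement above may need two small adjustments. The following is a sketch under those assumptions, not a tested drop-in. `date` is quoted with backticks because it is a reserved keyword by default from Hive 1.2.0 onward. The placeholder column is written as CAST(NULL AS STRING) so that both branches of the union expose the same type, since some Hive versions reject an untyped NULL here. The view name viewA_typed is hypothetical and chosen only to avoid clashing with viewA.

```sql
-- Sketch only: assumes Hive 0.13 or later and the tableA/tableB
-- definitions used in this thread. viewA_typed is a hypothetical name.
CREATE VIEW IF NOT EXISTS viewA_typed AS
SELECT `date`, name, CAST(NULL AS STRING) AS dept, age  -- tableA has no dept column
FROM tableA
WHERE `date` BETWEEN '20150406' AND '20150408'
UNION ALL
SELECT `date`, name, dept, age
FROM tableB
WHERE `date` BETWEEN '20150409' AND '20150411';
```

Note that the sample Table B in the question contains a row dated 20150408 (Tom), which the second WHERE clause filters out. Drop both WHERE clauses if every row from both tables should appear in the view.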
【Answer 2】: Here is a detailed solution:
hive> create table tableA(date String,name string,age int) row format delimited fields terminated by '\t' stored as textfile;
OK
Time taken: 0.084 seconds
hive> create table tableB(date String,name string,dept String,age int) row format delimited fields terminated by '\t' stored as textfile;
OK
Time taken: 0.103 seconds
Then load the data from the local filesystem into Hive:
hive> load data local inpath '/home/hastimal/PracticeTableData/tableB' into table tableB;
Loading data to table default.tableb
Table default.tableb stats: [numFiles=1, totalSize=71]
OK
Time taken: 0.291 seconds
hive> load data local inpath '/home/hastimal/PracticeTableData/tableA' into table tableA;
Loading data to table default.tablea
Table default.tablea stats: [numFiles=1, totalSize=51]
OK
Then confirm that the data is available in Hive:
hive> select * from tableA;
OK
20150406 John 21
20150407 Jane 23
20150408 Mary 20
Time taken: 0.126 seconds, Fetched: 3 row(s)
hive> select * from tableB;
OK
20150409 Claire CSE 25
20150410 Cindy Maths 27
20150408 Tom Biology 30
Time taken: 0.11 seconds, Fetched: 3 row(s)
Final solution :)
SELECT tbA.date AS a ,tbA.name AS b ,NULL AS c,tbA.age AS d FROM tableA tbA
UNION ALL
SELECT tbB.date AS a ,tbB.name AS b ,tbB.dept AS c,tbB.age AS d FROM tableB tbB
Output:
OK
20150409 Claire CSE 25
20150410 Cindy Maths 27
20150408 Tom Biology 30
20150406 John NULL 21
20150407 Jane NULL 23
20150408 Mary NULL 20
Time taken: 43.462 seconds, Fetched: 6 row(s)
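The output above lists the tableB rows before the tableA rows. UNION ALL on its own does not guarantee any row order, so if the order matters it is safer to sort explicitly. A minimal sketch, assuming the same tableA and tableB as above (the subquery alias u is hypothetical, and `date` is backtick-quoted in case the Hive version in use treats it as a reserved keyword):

```sql
-- Sketch only: wrap the union in a subquery and sort the combined rows.
SELECT u.a, u.b, u.c, u.d
FROM (
  SELECT tbA.`date` AS a, tbA.name AS b, CAST(NULL AS STRING) AS c, tbA.age AS d
  FROM tableA tbA
  UNION ALL
  SELECT tbB.`date` AS a, tbB.name AS b, tbB.dept AS c, tbB.age AS d
  FROM tableB tbB
) u
ORDER BY u.a, u.b;  -- sort by date, then by name
```

Sorting in the query that reads the data keeps the result deterministic regardless of which branch of the union Hive happens to process first.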
【Comments】:
@activelearner For your data, you can create the table like this:
hive> create table tableA(date String,name string,age int) row format delimited fields terminated by '|' stored as textfile;
@activelearner Also, put tableA above tableB in the UNION ALL so that the result matches your requirement exactly :)