Hive Lateral View Explode Internal Mechanism

【Title】: Hive Lateral View Explode Internal Mechanism 【Posted】: 2017-05-14 07:03:27 【Question】:

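I am using lateral view explode multiple times (about 9 times) in a single query on a single table (about 12 GB in size). This generates a huge amount of map-side data (100Pb+). I cannot understand how that much data comes out of 12 GB.

Can someone explain how lateral view explode works internally?

Thanks in advance.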

【Answer 1】:

Demo

create table mytable (a1 array<int>,a2 array<int>,a3 array<int>);
insert into mytable select array(1,2),array(3,4,5),array(6,7,8,9);

select  *
from    mytable
        lateral view explode (a1) e1 as a1_val
        lateral view explode (a2) e2 as a2_val
        lateral view explode (a3) e3 as a3_val
;

+-------+---------+-----------+--------+--------+--------+
|  a1   |   a2    |    a3     | a1_val | a2_val | a3_val |
+-------+---------+-----------+--------+--------+--------+
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      9 |
+-------+---------+-----------+--------+--------+--------+    
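
The 24 rows above can be reproduced outside Hive. What follows is a minimal Python sketch, not Hive's actual implementation: it models each chained `lateral view explode` as one more nested loop over an array column, which is what `itertools.product` computes. The function name `explode_all` is made up for this illustration.

```python
from itertools import product

def explode_all(row):
    """Model chained lateral view explodes over ONE input row.

    `row` is a tuple of lists (the array columns). Every output row keeps
    the original arrays and appends one element taken from each array.
    """
    return [row + combo for combo in product(*row)]

# Same data as the single row inserted into mytable.
row = ([1, 2], [3, 4, 5], [6, 7, 8, 9])
rows = explode_all(row)

print(len(rows))     # 2 * 3 * 4 = 24
print(rows[0][3:])   # (1, 3, 6)
print(rows[-1][3:])  # (2, 5, 9)
```

Note that every output row still carries the full original arrays, so the data volume grows faster than the row count alone suggests.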

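【Comments】:

Hi Dudu, thanks for your answer. I would like to know the internal mechanism of lateral view explode and how Hive executes the query internally.

The internals are irrelevant to your issue. Exploding an array generates one record per element. Exploding multiple arrays amounts to a product join.

Sorry Dudu, I forgot. Thanks.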
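
The product-join remark gives a quick way to estimate the blow-up before running a query: the number of rows produced from one input row is the product of the sizes of the exploded arrays. A small Python sketch of that arithmetic follows. The helper `exploded_rows` and the array sizes in the second call are hypothetical, since the question does not state the real sizes.

```python
from math import prod  # available in Python 3.8+

def exploded_rows(array_sizes):
    """Rows produced from ONE input row by chained explodes."""
    return prod(array_sizes)

# The demo row: arrays of size 2, 3 and 4.
print(exploded_rows([2, 3, 4]))   # 24

# Hypothetical: nine arrays of 10 elements each (made-up sizes).
print(exploded_rows([10] * 9))    # 1000000000
```

With nine explodes, even modest arrays multiply each input row into a very large number of output rows, which is consistent with a 12 GB table producing far more map-side data than its own size.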
