Inspiration from Apache HAWQ
Posted 张包峰
Aspects
Interconnect
UDP (User Datagram Protocol) with additional packet verification: reliability is equivalent to TCP (Transmission Control Protocol), while performance and scalability exceed those of TCP.
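To make "additional packet verification" concrete, here is a minimal sketch of reliable delivery layered on an unreliable datagram channel: sequence numbers, a CRC32 checksum, acknowledgements, and retransmission. It is not HAWQ's actual interconnect protocol. The names `Packet`, `LossyLink`, `Receiver`, and `send_all` are hypothetical, the link is simulated in-process rather than using real sockets, and lost acknowledgements are not modeled.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    payload: bytes
    crc: int


def checksum(seq: int, payload: bytes) -> int:
    # The checksum covers the sequence number and the payload.
    return zlib.crc32(seq.to_bytes(4, "big") + payload)


def make_packet(seq: int, payload: bytes) -> Packet:
    return Packet(seq, payload, checksum(seq, payload))


class LossyLink:
    """Simulated datagram link (assumption: stands in for a UDP socket).

    `script` lists what happens to each transmission in order:
    "ok", "drop", or "corrupt". Once the script is exhausted, packets pass.
    """

    def __init__(self, script):
        self.script = list(script)

    def transmit(self, pkt):
        action = self.script.pop(0) if self.script else "ok"
        if action == "drop":
            return None
        if action == "corrupt":
            # Flip one bit of the payload but keep the old checksum.
            damaged = bytes([pkt.payload[0] ^ 1]) + pkt.payload[1:]
            return Packet(pkt.seq, damaged, pkt.crc)
        return pkt


class Receiver:
    def __init__(self):
        self.expected = 0
        self.delivered = []

    def on_packet(self, pkt):
        """Return the sequence number to acknowledge, or None if rejected."""
        if pkt is None or pkt.crc != checksum(pkt.seq, pkt.payload):
            return None
        if pkt.seq == self.expected:
            self.delivered.append(pkt.payload)
            self.expected += 1
        # Duplicates are acknowledged again but not delivered twice.
        return pkt.seq


def send_all(chunks, link, receiver, max_retries=5):
    """Stop-and-wait sender: retransmit each packet until it is acknowledged."""
    attempts = 0
    for seq, payload in enumerate(chunks):
        for _ in range(max_retries):
            attempts += 1
            ack = receiver.on_packet(link.transmit(make_packet(seq, payload)))
            if ack == seq:
                break
        else:
            raise RuntimeError(f"packet {seq} was never acknowledged")
    return attempts
```

A real interconnect would pipeline many packets in flight instead of waiting for each acknowledgement, which is where the performance advantage over a per-connection TCP design would have to come from.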
Execution Runtime
dynamic DOP (degree of parallelism) on each segment
small point: splittable data files are consumed concurrently
virtual segment allocation policy: several factors decide the allocation per query
Maybe we pay too much attention to master-side scheduling and slave-side parallelism for each query.
There is another way: each slave decides its own DOP. Once a task is running, it is not forced to launch or set up whatever the master decided for execution. Each slave should maintain its own resources (containers) and assign them to queries dynamically.
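The slave-side idea can be sketched as follows: a node owns a fixed pool of execution slots and picks the DOP for an incoming query from its own free capacity and the number of splits it has to read. This is only an illustration of the idea, not HAWQ's virtual segment allocation policy. `SegmentNode`, `decide_dop`, and the `max_share` cap are hypothetical names and parameters.

```python
class SegmentNode:
    """A worker node that owns its slots and decides its own DOP."""

    def __init__(self, total_slots: int):
        self.total_slots = total_slots
        self.in_use = 0

    def decide_dop(self, num_splits: int, max_share: float = 0.5) -> int:
        """Pick a DOP from local state only; the master is not consulted.

        max_share (assumption) caps how much of the node one query may take,
        so that a single large query cannot starve the others.
        """
        free = self.total_slots - self.in_use
        if free <= 0 or num_splits <= 0:
            return 0
        cap = max(1, int(self.total_slots * max_share))
        return min(free, num_splits, cap)

    def acquire(self, num_splits: int) -> int:
        """Reserve slots for a query and return the DOP it was granted."""
        dop = self.decide_dop(num_splits)
        self.in_use += dop
        return dop

    def release(self, dop: int) -> None:
        """Give slots back when the query's tasks finish."""
        self.in_use = max(0, self.in_use - dop)
```

For example, on an 8-slot node a query with 10 splits gets a DOP of 4 under the default cap, and a second query with 3 splits arriving right after it gets 3.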
Resource Management
fine-grained resource management above YARN.
necessary for online systems, especially OLAP.
Query Processing
A motion occurs when tuples need to move between segments.
A slice occurs wherever the query involves a motion.
To do a hash join, a Redistribute Motion is involved: one table is redistributed by the join key.
This is a typical problem solved by an MPP system. A DAG engine can also solve this problem for OLAP, but this is the key distinction between MPP and DAG.
The approach behind this looks largely different in MPP and DAG. But to be honest, it is not that different; maybe a hybrid runtime can cover both.
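The Redistribute Motion for a hash join can be illustrated with a small in-memory sketch: rows of one table are moved to the segment chosen by the join key, after which every segment joins only its local rows. This is a simplified model, not HAWQ's executor. It assumes integer join keys, uses `key % num_segments` as a stand-in for a real hash function, and assumes the other table is already distributed by the join key with the same function. All function names are hypothetical.

```python
from collections import defaultdict


def segment_for(key: int, num_segments: int) -> int:
    # Stand-in for a real hash function (assumption: integer keys only).
    return key % num_segments


def redistribute(segments, key_index, num_segments):
    """Redistribute Motion: send each row to the segment owning its join key."""
    out = [[] for _ in range(num_segments)]
    for rows in segments:
        for row in rows:
            out[segment_for(row[key_index], num_segments)].append(row)
    return out


def local_hash_join(build_rows, build_key, probe_rows, probe_key):
    """Plain hash join over rows that live on one segment."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    return [b + p for p in probe_rows for b in table.get(p[probe_key], [])]


def distributed_join(left_segments, left_key, right_segments, right_key):
    """Join two tables; only the right one is moved between segments.

    Assumes left_segments is already distributed by its join key
    using segment_for.
    """
    n = len(left_segments)
    moved = redistribute(right_segments, right_key, n)
    result = []
    for seg in range(n):
        result.extend(
            local_hash_join(left_segments[seg], left_key, moved[seg], right_key)
        )
    return result
```

In plan terms, the `redistribute` call is the motion, and the work on either side of it (scanning the right table, then joining locally) corresponds to separate slices.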
GPORCA
sth.
Runaway Query Termination
necessary for online systems, especially OLAP.