impala查询优化和explain学习

Posted cclovezbf

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了impala查询优化和explain学习相关的知识,希望对你有一定的参考价值。

是因为看到了这篇文章,想自己再继续研究下。

Impala Join策略与执行计划生成作为铺垫,本文首先对Broadcast Join和Partitioned Join进行简要介绍。 Broadcast Join 顾名思义,Broadcast Join就是广播的方式进行Join。以下图为例,假设Join操作为SELECT A JOIN B ON A.idB.id,Broadcast Join就是将B表&#…https://betheme.net/a/5260296.html?action=onClickimpala版本 3.4.0

官网地址

https://impala.apache.org/docs/build/impala-3.4.pdfhttps://impala.apache.org/docs/build/impala-3.4.pdf 

The EXPLAIN output reminds you if table or column statistics are missing from any table involved in the query. These statistics are important for optimizing queries involving large tables or multi-table joins. See COMPUTE STATS Statement on page 226 for how to gather statistics, and Table and Column Statistics on page 636 for how to use this information for query tuning

翻译:  COMPUTE STATS 对于查询优化非常重要!!。

那就学习 compute stats

COMPUTE STATS [db_name.]table_name [ ( column_list ) ] [TABLESAMPLE SYSTEM(percentage) [REPEATABLE(seed)]] column_list ::= column_name [ , column_name, ... ]

COMPUTE INCREMENTAL STATS [db_name.]table_name [PARTITION (partition_spec)] partition_spec ::= simple_partition_spec | complex_partition_spec simple_partition_spec ::= partition_col=constant_value complex_partition_spec ::= comparison_expression_on_partition_col

这里多说点为什么这种统计信息有好处,因为提前知道了你有多少行,数据量大概有多少,,join的列最大值null值distinc等一系列信息,对于后面是broadcast还是 partition join 就省略了步骤

说下hive也可以分析,这些数据都是存储在mysql的。但是impala好像不能用hive分析的分区数据。

impala常见的用法

compute stats  odserpjdata_kd.gl_balances 

compute INCREMENTAL  stats  odserpjdata_kd.gl_balances 

drop stats xxxx

show table stats odserpjdata_kd.gl_balances 

show column stats odserpjdata_kd.gl_balances 

接着学习explain

Read the EXPLAIN plan from bottom to top:

• The last part of the plan shows the low-level details such as the expected amount of data that will be read, where you can judge the effectiveness of your partitioning strategy and estimate how long it will take to scan a table based on total data size and the size of the cluster.

• As you work your way up, next you see the operations that will be parallelized and performed on each Impala node.

• At the higher levels, you see how data flows when intermediate result sets are combined and transmitted from one node to another.

• See EXPLAIN_LEVEL Query Option on page 346 for details about the EXPLAIN_LEVEL query option, which lets you customize how much detail to show in the EXPLAIN plan depending on whether you are doing high-level or low-level tuning, dealing with logical or physical aspects of the query 

我们explain的时候,要从下往上去看

例如下面的explain。可以看到00和02 04 属于同一级别 01和03属于同一级别

所以我们先看 01 扫描gcc表,然后03 把该表广播出去 然后看00扫描gb表,再看02 此时两个开始inner join 再看04 

注意文章里提到一个set 属性 就是explain level 我们显示的计划级别 默认是1 

Default: 1
Arguments:
The allowed range of numeric values for this option is 0 to 3:
 0 or MINIMAL: A barebones list, one line per operation. Primarily useful for checking the join order in very long
queries where the regular EXPLAIN output is too long to read easily.
 1 or STANDARD: The default level of detail, showing the logical way that work is split up for the distributed
query.
 2 or EXTENDED: Includes additional detail about how the query planner uses statistics in its decision-making
process, to understand how a query could be tuned by gathering statistics, using query hints, adding or removing
predicates, and so on. In Impala 3.2 and higher, the output also includes the analyzed query with the cast
information in the output header, and the implicit cast info in the Predicate section.
 3 or VERBOSE: The maximum level of detail, showing how work is split up within each node into “query
fragments” that are connected in a pipeline. This extra detail is primarily useful for low-level performance testing
and tuning within Impala itself, rather than for rewriting the SQL code at the user level

 个人试了下 1就够用了。2 3 很详细 也很复杂,不多说了。 

直接来实战吧。

select count(1) from odserpjdata_kd.gl_code_combinations -- 11005694  1000w

select count(1) from odserpjdata_kd.gl_balances --107250223  1E

 我们开始explain,两个表inner join 没啥好说的

explain
select * 
from odserpjdata_kd.gl_balances  gb ,
odserpjdata_kd.gl_code_combinations  gcc 
where gb.code_combination_id =gcc.code_combination_id 

-- 这两个表join count(1) =3297320

Max Per-Host Resource Reservation: Memory=35.00MB Threads=5
Per-Host Resource Estimates: Memory=2.18GB
WARNING: The following tables are missing relevant table and/or column statistics.
odserpjdata_kd.gl_balances, odserpjdata_kd.gl_code_combinations

PLAN-ROOT SINK
|
04:EXCHANGE [UNPARTITIONED]
|
02:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: gb.code_combination_id = gcc.code_combination_id
|  runtime filters: RF000 <- gcc.code_combination_id, RF001 <- gcc.code_combination_id
|  row-size=2.42KB cardinality=unavailable
|
|--03:EXCHANGE [BROADCAST]
|  |
|  01:SCAN KUDU [odserpjdata_kd.gl_code_combinations gcc]
|     row-size=1.80KB cardinality=unavailable
|
00:SCAN KUDU [odserpjdata_kd.gl_balances gb]
   runtime filters: RF000 -> gb.code_combination_id, RF001 -> gb.code_combination_id
   row-size=640B cardinality=unavailable

 

前面我也解释过了,先是两个scan kudu表,然后 其中一个broadcast广播到各个节点,最后合并数据,问题来了。我这个表一个1千万 一个一个E,join后的结果有3000w,我就想问哪个表是大表,哪个表是小表??这也能broadcast?

但是聪明的我已经发现这个warning了。就是没有统计表的信息

compute  stats  odserpjdata_kd.gl_balances 
compute  stats  odserpjdata_kd.gl_code_combinations

继续explain

Max Per-Host Resource Reservation: Memory=42.00MB Threads=6
Per-Host Resource Estimates: Memory=2.47GB

PLAN-ROOT SINK
|
05:EXCHANGE [UNPARTITIONED]
|
02:HASH JOIN [INNER JOIN, PARTITIONED]
|  hash predicates: gb.code_combination_id = gcc.code_combination_id
|  runtime filters: RF000 <- gcc.code_combination_id, RF001 <- gcc.code_combination_id
|  row-size=2.67KB cardinality=107.25M
|
|--04:EXCHANGE [HASH(gcc.code_combination_id)]
|  |
|  01:SCAN KUDU [odserpjdata_kd.gl_code_combinations gcc]
|     row-size=1.95KB cardinality=11.01M
|
03:EXCHANGE [HASH(gb.code_combination_id)]
|
00:SCAN KUDU [odserpjdata_kd.gl_balances gb]
   runtime filters: RF000 -> gb.code_combination_id, RF001 -> gb.code_combination_id
   row-size=745B cardinality=107.25M

突然发现就是分区join了。 

其中cardinality(基数) 也有值了就是我compute

而且多了一步exchange。

突然想到一般广播都有大小限制的,那么impala的限制是多少呢? 

BROADCAST_BYTES_LIMIT    34359738368    ADVANCED

没有看错 是32G!所以前面没有统计的时候能够广播。

怎么说呢?两钟都能查出数据,但是谁更好一点呢? 我到cm的控制页面看到如下。

broadcast(drop stats)=3.9 GiB   partition(compute stats)=674.8M,哪个好一目了然。

 

 又看到了一个参数 

DEFAULT_JOIN_DISTRIBUTION_MODE    BROADCAST    ADVANCED

默认就是广播

DEFAULT_JOIN_DISTRIBUTION_MODE Query Option

DEFAULT_JOIN_DISTRIBUTION_MODE Query Option

This option determines the join distribution that Impala uses when any of the tables involved in a join query is missing statistics.

Impala optimizes join queries based on the presence of table statistics, which are produced by the Impala COMPUTE STATS statement. By default, when a table involved in the join query does not have statistics, Impala uses the "broadcast" technique that transmits the entire contents of the table to all executor nodes participating in the query. If one table involved in a join has statistics and the other does not, the table without statistics is broadcast. If both tables are missing statistics, the table that is referenced second in the join order is broadcast. This behavior is appropriate when the table involved is relatively small, but can lead to excessive network, memory, and CPU overhead if the table being broadcast is large.

Because Impala queries frequently involve very large tables, and suboptimal joins for such tables could result in spilling or out-of-memory errors, the setting DEFAULT_JOIN_DISTRIBUTION_MODE=SHUFFLE lets you override the default behavior. The shuffle join mechanism divides the corresponding rows of each table involved in a join query using a hashing algorithm, and transmits subsets of the rows to other nodes for processing. Typically, this kind of join is more efficient for joins between large tables of similar size.

The setting DEFAULT_JOIN_DISTRIBUTION_MODE=SHUFFLE is recommended when setting up and deploying new clusters, because it is less likely to result in serious consequences such as spilling or out-of-memory errors if the query plan is based on incomplete information. This setting is not the default, to avoid changing the performance characteristics of join queries for clusters that are already tuned for their existing workloads.

 

翻译下。 

当连接查询中有个表缺少了统计信息的时候就采用的默认join方式(和我们上面的例子一样,最开始没有compute 结果就是默认的broadcast)

如果联接中涉及的一个表具有统计信息,而另一个没有,则广播不具有统计信息的表。

如果两个表都缺少统计信息,则广播联接顺序中第二个引用的表。当涉及的表相对较小时,这种行为是合适的,但如果正在广播的表较大,则可能导致过多的网络、内存和CPU开销。

select 1 from A,B where A.id=B.id 广播右边的B

select 1 from B,A where A.id=B.id 广播右边的A

select 1 from B,A where A.id=B.id + compute stats A 广播B(因为B没有统计信息)

一般来说我没见过其他人compute,所以查询的时候注意了大表放前面小表放后面。

因为impala一般表都比较大 动则千万上亿,所以很容易内存溢出,此处你就要注意了。 

DEFAULT_JOIN_DISTRIBUTION_MODE=SHUFFLE  这个就是我们之前说的分区join,这样耗费的内存小点。但是怎么说呢 如果是大表join小表我觉得还是broadcast好点。官方也这么建议。

总结下我们学习到了什么?

1. compute stats 语法

2.explain 用法和查看

3.broadcast和shuffle的区别,

4.join的内存溢出如何解决?(shuffle)

5.set 参数的设置BROADCAST_BYTES_LIMIT DEFAULT_JOIN_DISTRIBUTION_MODE

MySQL查询语句过程和EXPLAIN语句的基本概念及其优化

今日,要跟大家介绍的是MySQL查询语句过程和explain语句的基本概念及优化,小杜觉得很实用,因此,整理了一下并分享给大家做个参考,由于内容简单清晰,易于学习和理解,感兴趣的朋友不妨来看看!

网站或服务的性能关键点很大程度在于数据库的设计(假设你选择了合适的语言开发框架)以及如何查询数据上。

我们知道MySQL的性能优化方法,一般有建立索引、规避复杂联合查询、设置冗余字段、建立中间表、查询缓存等,也知道用EXPLAIN来查看执行计划。

但对MySQL复杂查询语句执行过程和内部机制,MySQL Optimizer本身所做优化以及查询语句调整对性能所产生的影响及其原因知之甚少。

本文试图对其中的一些关键概念如执行过程、索引使用等做比较深入的探讨,知其然,知其所以然,

这样可以避免在原本通过MySQL简单优化就能获得很好效果的情况下,盲目跟风转向NoSQL存储或者投入资金升级基础设施。

工欲善其事,必先利其器,这里首先介绍MySQL查询语句性能分析工具。

MySQL的EXPLAIN命令是用来分析查询性能的工具,EXPLAIN的输出每一行对应于查询语句中的一张表的执行计划说明,其输出列含义如下表:

上表中type列是表关联类型,常见的有如下类型(按关联查询效率从高到低排列):

const(常量连接),比如SELECT * FROM user WHERE id=1;
eq_ref(等值引用),比如SELECT * FROM user,card WHERE user.id=card.userid;
ref(引用),用于非唯一索引,比如SELECT * FROM user,card WHERE user.last_name=’test’;
range(范围),比如SELECT * FROM tbl_name WHERE key_column > 10;
index(索引),根据索引来读取数据,如果索引已包含了查询数据,只需扫描索引树,否则执行全表扫描和All类似;
ALL(所有),全表扫描

key列代表索引(index),rows表示估计会扫描多少行记录,

Extra表示附加信息,常见的有如下几种(也按查询效率从高到低排列):

Using index:表示使用索引,如果同时出现Using where,代表使用索引来查找读取记录,如果没有Using where,表示索引包含查询数据,无需额外的查找;
Using where:表示条件查询,如果type列是ALL或index,而没有出现该信息,则你有可能在执行错误的查询:返回所有数据;
Using filesort:不是“使用文件索引”的含义!filesort是MySQL所实现的一种排序策略,通常在使用到排序语句ORDER BY的时候,会出现该信息;
Using temporary:表示为了得到结果,使用了临时表,这通常是出现在多表联合查询,结果排序的场合;

如果EXPLAIN出现后面两个信息(Using filesort,Using temporary),而rows又比较大,通常意味着你需要调整查询语句,或者需要添加索引,总之需要尽量消除这两个信息。

下面给出EXPLAIN结果实例(从用户档案表中找出昵称和性别,按用户表中的用户关注者数量排序):

到此,关于“MySQL查询语句过程和EXPLAIN语句的基本概念及其优化”的内容就介绍完毕,相信大家看完之后有更深的了解,如果大家还想学习更多相关知识,可以关注摩杜云行业资讯频道,更多相关知识等着你来学习,感谢各位的阅读!

以上是关于impala查询优化和explain学习的主要内容,如果未能解决你的问题,请参考以下文章

Parquet 支持的 Hive 表:Impala 中不可查询的数组列

Hive优化

jdbc连接impala

MySQL查询语句过程和EXPLAIN语句的基本概念及其优化

Explain 执行计划 和 SQL优化

PostgreSQL学习系列—EXPLAIN ANALYZE查询计划解读