Hive queries are running with too many reducers

Posted 2023-04-14

【Title】Hive Queries are running with too many reducers 【Posted】2015-06-08 12:59:05 【Question】

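We recently upgraded from Hadoop 2.0.0-cdh4.2.1 to Hadoop 2.6.0-cdh5.4.2, and we are now running Hive 1.1.0-cdh5.4.2.

When I run a simple Hive query, it uses far too many reducers: on the previous version it needed 120 reducers, while on the new version it needs 1100.

Can anyone tell me why this happens?

Thanks in advance.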

【Comments on the question】

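Please add the query you are running.

Query: select id1, day, seq, count(1) from table_name where 1=1 and concat(day,hour)>='2015-05-3004' and concat(day,hour)

【Answer 1】

The number of reducers is chosen by Hive. It depends on the number of bytes you assign to each reducer and on the type of query you run (for example, a query that uses count versus a plain select *). See here for more information.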

【Comments】

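When I ran the same query before the upgrade it used 120 reducers; after the upgrade it uses 1100 reducers.

Is the configuration parameter (hive.exec.reducers.bytes.per.reducer) set to the same value before and after the upgrade?

This post may also help you understand more. Happy learning and coding.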
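To make the point about hive.exec.reducers.bytes.per.reducer concrete, here is a rough sketch in Python of how the reducer count is commonly estimated when it is not set explicitly: roughly one reducer per bytes-per-reducer of input, capped by hive.exec.reducers.max. This is a simplified model, not Hive's actual code. The function name and the 120 GB input size are made up for illustration, and the numeric defaults are the ones I recall from the Apache Hive documentation (1 GB and 999 before Hive 0.14, 256 MB and 1009 from 0.14 on). A vendor distribution may ship different values, so check what your cluster really uses.

```python
def estimate_reducers(input_bytes: int, bytes_per_reducer: int, max_reducers: int) -> int:
    """Simplified model of the reducer estimate (hypothetical helper, not Hive code).

    bytes_per_reducer stands in for hive.exec.reducers.bytes.per.reducer,
    max_reducers stands in for hive.exec.reducers.max.
    """
    # Integer ceiling division: one reducer per started chunk of input.
    wanted = -(-input_bytes // bytes_per_reducer)
    # Never fewer than 1 reducer, never more than the configured cap.
    return max(1, min(max_reducers, wanted))


# Hypothetical input size, chosen only to illustrate the effect.
input_bytes = 120 * 10**9

# Assumed older Apache Hive defaults: 1 GB per reducer, cap of 999.
print(estimate_reducers(input_bytes, 1_000_000_000, 999))

# Assumed Apache Hive 0.14+ defaults: 256 MB per reducer, cap of 1009.
print(estimate_reducers(input_bytes, 256_000_000, 1009))
```

The same input produces several times more reducers once the bytes-per-reducer value shrinks, so comparing the output of `SET hive.exec.reducers.bytes.per.reducer;` and `SET hive.exec.reducers.max;` on the old and new clusters is a reasonable first check.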
