Hive data types: INT overflow and converting INT to BIGINT
Posted by 小基基o_O
Hive version
ll $HIVE_HOME/lib | grep hive
The jar names show that the Hive version is 3.1.2.
INT range
INT range: -2,147,483,648 to 2,147,483,647, where $2147483648 = 2^{31}$.
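As a quick check of these bounds, here is a minimal Python sketch. Python is used only for illustration and is not part of the original Hive session; the constant names are hypothetical.

```python
# Bounds of a signed 32-bit integer, which is the width of Hive's INT type.
INT_BITS = 32
INT_MIN = -(2 ** (INT_BITS - 1))     # -2^31
INT_MAX = 2 ** (INT_BITS - 1) - 1    #  2^31 - 1

print(INT_MIN, INT_MAX)  # -2147483648 2147483647
```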
INT overflow wraps around to a negative number
SELECT 2147483647+1;
Result: -2147483648
A negative result like this is usually not what we want. The fix is to cast INT to BIGINT before doing the arithmetic:
SELECT CAST(2147483647 AS BIGINT)+1;
Result: 2147483648
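The two results above are consistent with two's-complement wraparound at 32 bits versus 64 bits. The following Python sketch models that behavior. `wrap_signed`, `add_int` and `add_bigint` are hypothetical helper names introduced for illustration, and the sketch is a model of the arithmetic, not Hive's actual implementation.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Reduce an unbounded Python int to a signed two's-complement value of `bits` bits."""
    modulus = 1 << bits
    half = 1 << (bits - 1)
    return (value + half) % modulus - half


def add_int(x: int, y: int) -> int:
    """Model of INT + INT: the sum is kept in 32 bits, so it can wrap."""
    return wrap_signed(x + y, 32)


def add_bigint(x: int, y: int) -> int:
    """Model of BIGINT + INT: the sum is kept in 64 bits."""
    return wrap_signed(x + y, 64)


print(add_int(2147483647, 1))     # -2147483648
print(add_bigint(2147483647, 1))  # 2147483648
```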
Hive table example
echo 2147483647,2 > /home/yellow/a.txt
echo 2147483648,2 >> /home/yellow/a.txt
echo -2147483648,2 >> /home/yellow/a.txt
echo -2147483649,2 >> /home/yellow/a.txt
cat /home/yellow/a.txt
-- Create the table
CREATE TABLE t7(a INT,b INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
-- Load the data
LOAD DATA LOCAL INPATH '/home/yellow/a.txt' OVERWRITE INTO TABLE t7;
-- Query
SELECT
a
,b
,a+b
-- ,a-b
,CAST(a AS BIGINT)+b
,CAST(a+b AS BIGINT)
,a*b
,CAST(a AS BIGINT)*b
-- ,a/b
FROM t7;
First two rows of the result
a | b | a+b | CAST(a AS BIGINT)+b | CAST(a+b AS BIGINT) | a*b | CAST(a AS BIGINT)*b |
---|---|---|---|---|---|---|
2147483647 | 2 | -2147483647 | 2147483649 | -2147483647 | -2 | 4294967294 |
NULL | 2 | NULL | NULL | NULL | NULL | NULL |
Conclusion
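CAST(a+b AS BIGINT) does not help: a+b is evaluated as INT and has already overflowed before the cast is applied.
CAST(a AS BIGINT)+b works: the operand is widened first, so the addition is done in BIGINT.
2147483648 is outside the INT range, so the column value is NULL and every expression computed from it is NULL as well.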
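The difference between casting after and before the addition can be sketched in Python as follows. This is a model under stated assumptions, not Hive code: `wrap_signed`, `parse_int_field`, `cast_after_add` and `cast_before_add` are hypothetical names, and `None` stands in for SQL NULL. Only the two input values whose results appear in the table above are used.

```python
from typing import Optional


def wrap_signed(value: int, bits: int) -> int:
    """Reduce an unbounded Python int to a signed two's-complement value of `bits` bits."""
    modulus = 1 << bits
    half = 1 << (bits - 1)
    return (value + half) % modulus - half


def parse_int_field(text: str) -> Optional[int]:
    """Model of reading a text field into an INT column: out-of-range input becomes NULL."""
    try:
        value = int(text)
    except ValueError:
        return None
    return value if -(2 ** 31) <= value <= 2 ** 31 - 1 else None


def cast_after_add(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Model of CAST(a+b AS BIGINT): add in 32 bits, then widen the wrapped result."""
    if a is None or b is None:
        return None
    return wrap_signed(wrap_signed(a + b, 32), 64)


def cast_before_add(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Model of CAST(a AS BIGINT)+b: widen first, then add in 64 bits."""
    if a is None or b is None:
        return None
    return wrap_signed(a + b, 64)


for raw in ["2147483647", "2147483648"]:
    a = parse_int_field(raw)
    print(raw, a, cast_after_add(a, 2), cast_before_add(a, 2))
# 2147483647 2147483647 -2147483647 2147483649
# 2147483648 None None None
```

The point of the sketch is the order of operations: widening the already-wrapped sum cannot recover the lost high bits, while widening an operand first keeps them.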
Addendum
SUM does not overflow the INT range: Hive automatically converts the INT input to BIGINT for the aggregation.
echo 2147483647,2 > /home/yellow/a.txt
echo 2147483647,2 >> /home/yellow/a.txt
cat /home/yellow/a.txt
-- Load the data
LOAD DATA LOCAL INPATH '/home/yellow/a.txt' OVERWRITE INTO TABLE t7;
-- Query
SELECT SUM(a) FROM t7;
-- Execution plan
EXPLAIN SELECT SUM(a) FROM t7;
Query result: 4294967294
EXPLAIN output
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
STAGE PLANS:
  Stage: Stage-1
    Spark
      Edges:
        Reducer 2 <- Map 1 (GROUP, 1)
      DagName: yellow_20220409151330_33ff47f5-4501-48fe-85b4-187ca4cb985c:8
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: t7
                  Statistics: Num rows: 1 Data size: 260 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: a (type: int)
                    outputColumnNames: a
                    Statistics: Num rows: 1 Data size: 260 Basic stats: COMPLETE Column stats: NONE
                    Group By Operator
                      aggregations: sum(a)
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        sort order:
                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                        value expressions: _col0 (type: bigint)
            Execution mode: vectorized
        Reducer 2
            Execution mode: vectorized
            Reduce Operator Tree:
              Group By Operator
                aggregations: sum(VALUE._col0)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
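In the plan, the input expression is `a (type: int)` while the aggregate output is `_col0 (type: bigint)`, which matches the observed result of 4294967294. The Python sketch below models a sum accumulated in 64 bits and, for contrast, a purely hypothetical 32-bit accumulator that Hive does not use here. The function names are assumptions made for illustration, and `None` stands in for SQL NULL.

```python
from typing import Iterable, Optional


def wrap_signed(value: int, bits: int) -> int:
    """Reduce an unbounded Python int to a signed two's-complement value of `bits` bits."""
    modulus = 1 << bits
    half = 1 << (bits - 1)
    return (value + half) % modulus - half


def sum_with_width(values: Iterable[Optional[int]], bits: int) -> Optional[int]:
    """Sum non-NULL values in an accumulator of `bits` bits; NULL if there are none."""
    total = None
    for v in values:
        if v is None:
            continue  # SQL SUM ignores NULL inputs
        total = wrap_signed((total or 0) + v, bits)
    return total


rows = [2147483647, 2147483647]
print(sum_with_width(rows, 64))  # 4294967294  (BIGINT accumulator, as in the plan)
print(sum_with_width(rows, 32))  # -2          (hypothetical INT accumulator)
```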