Hive Data Type INT Overflow: Converting INT to BIGINT

Posted by 小基基o_O




Hive Version

ll $HIVE_HOME/lib | grep hive

The Hive version found is 3.1.2.
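
As a cross-check sketch from inside a Hive session (assuming the built-in version() UDF, available since Hive 2.1), the version can also be queried directly:

-- Returns the Hive build version string
SELECT version();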

INT Range

4-byte signed integer
-2,147,483,648~2,147,483,647
2,147,483,648 = 2^31
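
A minimal sketch to check these boundaries from a Hive session; values outside the INT range are expected to come back as NULL (the same behavior the load example further below runs into):

SELECT
  CAST('2147483647' AS INT)     -- max INT
  ,CAST('-2147483648' AS INT)   -- min INT
  ,CAST('2147483648' AS INT)    -- one past max, expected NULL
  ,CAST('-2147483649' AS INT);  -- one past min, expected NULL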

INT Overflows to a Negative Number

SELECT 2147483647+1;

Result: -2147483648

The negative result above is usually not what we want. The fix is to cast the INT to BIGINT before doing the arithmetic:

SELECT CAST(2147483647 AS BIGINT)+1;

Result: 2147483648
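
An alternative sketch: Hive's integral-literal postfix L marks a literal as BIGINT, so the addition itself runs in 64-bit without an explicit CAST:

-- The L postfix makes 2147483647 a BIGINT literal
SELECT 2147483647L + 1;   -- expected 2147483648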

INT Overflows to NULL

echo 2147483647,2 > /home/yellow/a.txt
echo 2147483648,2 >> /home/yellow/a.txt
echo -2147483648,2 >> /home/yellow/a.txt
echo -2147483649,2 >> /home/yellow/a.txt
cat /home/yellow/a.txt
-- Create the table
CREATE TABLE t7(a INT,b INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
-- Load the data
LOAD DATA LOCAL INPATH  '/home/yellow/a.txt' OVERWRITE INTO TABLE t7;
-- Query
SELECT
  a
  ,b
  ,a+b
  -- ,a-b
  ,CAST(a AS BIGINT)+b
  ,CAST(a+b AS BIGINT)
  ,a*b
  ,CAST(a AS BIGINT)*b
  -- ,a/b
FROM t7;

Results for the first two rows:

| a | b | a+b | CAST(a AS BIGINT)+b | CAST(a+b AS BIGINT) | a*b | CAST(a AS BIGINT)*b |
|---|---|-----|---------------------|---------------------|-----|---------------------|
| 2147483647 | 2 | -2147483647 | 2147483649 | -2147483647 | -2 | 4294967294 |
| NULL | 2 | NULL | NULL | NULL | NULL | NULL |

Conclusions
CAST(a+b AS BIGINT) does not help: a+b is still computed in INT, so it has already overflowed before the cast is applied.
CAST(a AS BIGINT)+b works: the operand is widened first, so the addition is done in BIGINT.
2147483648 is outside the INT range, so the value stored in column a is NULL, and every expression on it returns NULL.
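
Applying that conclusion to the query above, a sketch that casts the operands first so every operation runs in BIGINT (rows whose a is NULL still return NULL):

SELECT
  CAST(a AS BIGINT) + CAST(b AS BIGINT)   -- addition in BIGINT, no wrap-around
  ,CAST(a AS BIGINT) * CAST(b AS BIGINT)  -- multiplication in BIGINT
FROM t7;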


Addendum

SUM does not overflow; Hive automatically accumulates the result as BIGINT:

echo 2147483647,2 > /home/yellow/a.txt
echo 2147483647,2 >> /home/yellow/a.txt
cat /home/yellow/a.txt
-- Load the data
LOAD DATA LOCAL INPATH  '/home/yellow/a.txt' OVERWRITE INTO TABLE t7;
-- Query
SELECT SUM(a) FROM t7;
-- Execution plan
EXPLAIN SELECT SUM(a) FROM t7;

Query result: 4294967294

Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Spark
      Edges:
        Reducer 2 <- Map 1 (GROUP, 1)
      DagName: <username>_<yyyyMMddHHmmss>_xxxxxxxxxx:<submit count>
      Vertices:
        Map 1 
            Map Operator Tree:
                TableScan
                  alias: t7
                  Statistics: Num rows: 1 Data size: 260 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: a (type: int)
                    outputColumnNames: a
                    Statistics: Num rows: 1 Data size: 260 Basic stats: COMPLETE Column stats: NONE
                    Group By Operator
                      aggregations: sum(a)
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        sort order: 
                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                        value expressions: _col0 (type: bigint)
            Execution mode: vectorized
        Reducer 2 
            Execution mode: vectorized
            Reduce Operator Tree:
              Group By Operator
                aggregations: sum(VALUE._col0)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
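
One caveat worth noting (a sketch against the same t7 table): the automatic widening applies to SUM's accumulator, not to expressions inside it, so per-row INT arithmetic such as a*b can still overflow before it is aggregated:

SELECT
  SUM(a * b)                    -- a*b is evaluated per row in INT and may wrap around
  ,SUM(CAST(a AS BIGINT) * b)   -- widen the operand first, then aggregate
FROM t7;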
