Notes on an OOM in ODPS (Hive)

The job failed with the following error:

FAILED: ODPS-0123144: Fuxi job failed - WorkerRestart errCode:9,errMsg:SigKill(OOM), usually caused by OOM(out of memory).

Logview shows that the OOM occurred in the map stage. The query that triggered it:

```
WITH userdaystat AS (
    SELECT * FROM sync_mongo_box.extract_source__userdaystat
    WHERE  pt = '$bdate' AND (total_cash > 0 OR total_flash > 0)
)
INSERT OVERWRITE TABLE odps_product_box_subsidy_detail_top100 PARTITION(pt='$bdate')
SELECT  user_id
        ,DOUBLE(extract_cash_back)
        ,'提现3元奖励'
        ,'extract_cash_back'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY extract_cash_back DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(family_flash)
        ,'家族奖励'
        ,'family_flash'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY family_flash DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(master_recall_flash)
        ,'唤醒徒弟奖励'
        ,'master_recall_flash'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY master_recall_flash DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(recall_apprentice_flash)
        ,'被唤醒奖励'
        ,'recall_apprentice_flash'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY recall_apprentice_flash DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(master_benefit_flash)
        ,'师父奖励'
        ,'master_benefit_flash'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY master_benefit_flash DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(share_income_flash)
        ,'晒收入奖励'
        ,'share_income_flash'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY share_income_flash DESC
.... etc.....
UNION ALL
SELECT  user_id
        ,DOUBLE(total_cash)
        ,'总补贴现金'
        ,'total_cash'
FROM    userdaystat
WHERE   pt = '$bdate'
ORDER BY total_cash DESC
LIMIT   100
```

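The query has a large number of UNION ALL branches, and each subquery reads too much data. The fix is to reduce the amount of data each subquery reads: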

```
WITH userdaystat AS (
    SELECT * FROM sync_mongo_box.extract_source__userdaystat
    WHERE  pt = '$bdate'
)
INSERT OVERWRITE TABLE odps_product_box_subsidy_detail_top100 PARTITION(pt='$bdate')
SELECT  user_id
        ,DOUBLE(extract_cash_back)
        ,'提现3元奖励'
        ,'extract_cash_back'
FROM    userdaystat
WHERE   pt = '$bdate' AND extract_cash_back IS NOT NULL
ORDER BY extract_cash_back DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(family_flash)
        ,'家族奖励'
        ,'family_flash'
FROM    userdaystat
WHERE   pt = '$bdate' AND family_flash IS NOT NULL
ORDER BY family_flash DESC
LIMIT   100
UNION ALL
....etc....
FROM    userdaystat
WHERE   pt = '$bdate' AND total_flash IS NOT NULL
ORDER BY total_flash DESC
LIMIT   100
UNION ALL
SELECT  user_id
        ,DOUBLE(total_cash)
        ,'总补贴现金'
        ,'total_cash'
FROM    userdaystat
WHERE   pt = '$bdate' AND total_cash IS NOT NULL
ORDER BY total_cash DESC
LIMIT   100
```

In short: to cut the data each subquery reads, add an IS NOT NULL condition on the column that subquery sorts by.
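The effect of the IS NOT NULL filter can be checked on a toy dataset. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for ODPS, and the table contents, the `TOP_N` value, and the `top_n` helper are hypothetical. It assumes NULLs sort last under ORDER BY ... DESC, which holds in SQLite; confirm the same ordering on your own engine before relying on the equivalence.

```python
import sqlite3

# Hypothetical stand-in for the userdaystat CTE: in-memory SQLite, not ODPS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userdaystat (user_id INTEGER, family_flash REAL)")

# Only every 10th user has a value, mimicking a sparse reward column.
rows = [(i, float(i) if i % 10 == 0 else None) for i in range(1, 1001)]
conn.executemany("INSERT INTO userdaystat VALUES (?, ?)", rows)

TOP_N = 5  # the original query uses LIMIT 100; kept small here for readability


def top_n(where_clause: str):
    """Return the top TOP_N rows by family_flash, with an optional WHERE clause."""
    sql = (
        "SELECT user_id, family_flash FROM userdaystat "
        f"{where_clause} ORDER BY family_flash DESC LIMIT {TOP_N}"
    )
    return conn.execute(sql).fetchall()


unfiltered = top_n("")
filtered = top_n("WHERE family_flash IS NOT NULL")

# Count how many rows each variant has to feed into the sort.
total_rows = conn.execute("SELECT COUNT(*) FROM userdaystat").fetchone()[0]
sorted_rows = conn.execute(
    "SELECT COUNT(*) FROM userdaystat WHERE family_flash IS NOT NULL"
).fetchone()[0]

print("same top rows:", unfiltered == filtered)
print(f"rows entering the sort: {sorted_rows} of {total_rows}")
```

On this toy data the filtered branch returns the same top rows while sorting a tenth of the input, which is the saving the fix relies on. How large the saving is on a real table depends on how sparse each reward column is.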
