hive 1.1.0 动态分区实现

Posted jiangxiaoxian

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了hive 1.1.0 动态分区实现相关的知识,希望对你有一定的参考价值。

hive (public)> explain insert overwrite table public_t_par partition(delivery_datekey) select * from public_oi_fact_partition;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
  Stage-2 depends on stages: Stage-0
  Stage-3
  Stage-5
  Stage-6 depends on stages: Stage-5
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: public_oi_fact_partition
            Statistics: Num rows: 110000 Data size: 35162211 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: order_datekey (type: int), oiid (type: bigint), custom_order_id (type: bigint), ciid (type: int), bi_name (type: string), siid (type: int), si_name (type: string), classify (type: string), status (type: int), status_text (type: string), class1_id (type: int), class1_name (type: string), class2_id (type: int), class2_name (type: string), city_id (type: int), city_name (type: string), operate_area (type: int), company_id (type: int), standard_item_num (type: int), package_num (type: double), expect_num (type: decimal(30,6)), price (type: decimal(30,6)), order_weight (type: decimal(30,6)), order_amount (type: decimal(30,6)), order_money (type: decimal(30,6)), ci_weight (type: decimal(30,6)), c_t (type: string), u_t (type: string), real_num (type: decimal(30,6)), real_weight (type: decimal(30,6)), real_money (type: decimal(30,6)), cost_price (type: decimal(30,6)), cost_money (type: decimal(30,6)), price_unit (type: string), order_money_coupon (type: decimal(30,6)), real_money_coupon (type: decimal(30,6)), real_price (type: decimal(30,6)), f_activity (type: int), activity_type (type: tinyint), is_activity (type: tinyint), original_price (type: decimal(30,6)), car_group_id (type: bigint), driver_id (type: string), expect_pay_way (type: int), desc (type: string), coupon_score_amount (type: decimal(30,6)), sale_area_id (type: int), delivery_area_id (type: int), tag (type: int), promote_tag_id (type: bigint), promote_tag_name (type: string), pop_id (type: bigint), delivery_datekey (type: string)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24, _col25, _col26, _col27, _col28, _col29, _col30, _col31, _col32, _col33, _col34, _col35, _col36, _col37, _col38, _col39, _col40, _col41, _col42, _col43, _col44, _col45, _col46, _col47, _col48, _col49, _col50, _col51, _col52
              Statistics: Num rows: 110000 Data size: 35162211 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 110000 Data size: 35162211 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    name: public.public_t_par
  Stage: Stage-7
    Conditional Operator
  Stage: Stage-4
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://ns1/user/hive/warehouse/public.db/public_t_par/.hive-staging_hive_2018-06-08_15-41-18_222_4176438830382881060-1/-ext-10000
  Stage: Stage-0
    Move Operator
      tables:
          partition:
            delivery_datekey 
          replace: true
          table:
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: public.public_t_par
  Stage: Stage-2
    Stats-Aggr Operator
  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            File Output Operator
              compressed: false
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                  name: public.public_t_par
  Stage: Stage-5
    Map Reduce
      Map Operator Tree:
          TableScan
            File Output Operator
              compressed: false
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                  name: public.public_t_par
  Stage: Stage-6
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://ns1/user/hive/warehouse/public.db/public_t_par/.hive-staging_hive_2018-06-08_15-41-18_222_4176438830382881060-1/-ext-10000

uploading-image-422377.png
uploading-image-83430.png

目录都完成后_tmp.-ext-1000会变成-ext-1000 并参见stage-6


以上是关于hive 1.1.0 动态分区实现的主要内容,如果未能解决你的问题,请参考以下文章

使用Presto实现Hive动态分区

使用Presto实现Hive动态分区

Hive动态分区与静态分区,数据插入,区别

使用Hive SQL插入动态分区的Parquet表OOM异常分析

Hive的分区表与分桶表&内部表外部表

Hive 动态分区