Submitting Spark jobs with spark-submit wrapped in shell scripts; creating an HBase table from a shell script; batch-exporting Hive tables

Posted by 岁月的眸


Wrapping spark-submit Spark job submission in a shell script

# ===== local mode (run on the cluster node) =====
#!/bin/bash

if [ $# -eq 1 ];then
        spark-submit --master local[4] --class hx.com.Ods2DwdFilterSql --files /home/etl_admin/spark/config.properties sparkDwdFilter-1.0-SNAPSHOT.jar "$1"
else
  echo "Please input command. eg: $0 filename.sql(hql)"
fi
# ===== yarn-client mode =====

#!/bin/bash

if [ $# -eq 1 ];then
        spark-submit \
        --master yarn \
        --deploy-mode client \
        --queue default \
        --driver-memory 2g \
        --num-executors 3 \
        --executor-memory 2g \
        --executor-cores 2 \
        --class hx.com.Ods2DwdFilterSql \
        --files /home/etl_admin/spark/config.properties \
        sparkDwdFilter-1.0-SNAPSHOT.jar "/opt/etl/sqlFiles/$1"
else
  echo "Please input command. eg: $0 filename.sql(hql)"
fi

# ===== yarn-cluster mode =====

#!/bin/bash

if [ $# -eq 1 ];then
        spark-submit \
        --master yarn \
        --deploy-mode cluster \
        --queue default \
        --driver-memory 2g \
        --num-executors 3 \
        --executor-memory 2g \
        --executor-cores 2 \
        --class hx.com.Ods2DwdFilterSql \
        --files /home/etl_admin/spark/config.properties \
        sparkDwdFilter-1.0-SNAPSHOT.jar "/opt/etl/sqlFiles/$1"
else
  echo "Please input command. eg: $0 filename.sql(hql)"
fi
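The yarn scripts above pass the bare file name through to a fixed directory, so a typo in the argument is only discovered after the cluster job has already started. A minimal pre-flight check could be added before the spark-submit call; the function below is a hypothetical sketch (the directory matches the one the yarn scripts assume):

```shell
#!/bin/bash
# Hypothetical pre-flight check for the wrappers above: verify that the SQL
# file actually exists under the fixed directory before calling spark-submit.
SQL_DIR=/opt/etl/sqlFiles

check_sql_file() {
  local f="$SQL_DIR/$1"
  if [ ! -f "$f" ]; then
    echo "sql file not found: $f"
    return 1
  fi
  echo "ok: $f"
}
```

Calling `check_sql_file "$1" || exit 1` at the top of the wrapper fails fast on the client instead of burning a YARN container on a missing file.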

Creating an HBase table from a shell script

  • create_hbase_table.sh
#!/bin/bash
# 1. Check that the script was called with an argument
[ $# -lt 1 ] && echo "Please supply an HBase table name!" && exit 1

# 2. Define the HBase table and column family (single quotes are kept
#    because the hbase shell expects quoted names)
#hbase_table="'test111'"
hbase_table="'$1'"
cf="'f'"
echo "HBase table to create ==> $hbase_table $cf"

# 3. Check whether the table already exists.
# "exists" prints "Table xxx does exist" or "Table xxx does not exist";
# grep -q exits 0 only for the first form.
hbase shell <<EOF | grep -q 'does exist'
exists $hbase_table
EOF

status=$?
echo "exists ? ==> status=$status"
if [ $status -eq 0 ];then
        echo "table exists..."; exit 0
fi
# 4. Create the table
echo "table not exists ! ===> start creating...."
hbase shell <<EOF
create $hbase_table, $cf
EOF

# 5. Check the result (note: hbase shell may still exit 0 when the DDL fails,
#    so this only catches hard failures of the shell itself)
status=$?
if [ $status -eq 0 ];then
   echo "create succeed !"
else
   echo "error, create not done !"
fi
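The whole existence test boils down to scanning the hbase shell's text output for the phrase "does exist". That parsing can be isolated into a function and exercised without a cluster; the sketch below feeds it canned strings in place of real `hbase shell` output:

```shell
#!/bin/bash
# Isolated sketch of the exists check used above. In the real script the
# heredoc-driven `hbase shell` output is piped in; here the function simply
# scans whatever text it receives on stdin.
table_exists() {
  # "exists 't'" prints "Table t does exist" or "Table t does not exist".
  # The contiguous phrase "does exist" appears only in the first form,
  # so grep -q returns 0 exactly when the table is present.
  grep -q 'does exist'
}
```

Keeping the match phrase exact matters: a looser pattern such as `exist` would also match the "does not exist" line and make the script always skip creation.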

Batch-exporting Hive tables

#!/bin/bash

# Databases whose table DDL should be exported
#DATABASES='ods edw dws'
DATABASES='gmall'
for DATABASE in $DATABASES
do
# Braces are required: $DATABASE_tables.txt would expand the (unset)
# variable DATABASE_tables instead of DATABASE followed by "_tables"
hive -e "use $DATABASE; show tables;" > ${DATABASE}_tables.txt
sed -i '/WARN:/d' ${DATABASE}_tables.txt
#sleep 1
echo "use $DATABASE;" >> ${DATABASE}_repair_tables.sql
echo "set hive.msck.path.validation=ignore;" >> ${DATABASE}_repair_tables.sql
echo "use $DATABASE;" >> ${DATABASE}_count.sql

cat ${DATABASE}_tables.txt | while read eachline
do
hive -e "use $DATABASE; show create table $eachline;" >> ${DATABASE}_tables_ddl.sql

echo "msck repair table $DATABASE.$eachline;" >> ${DATABASE}_repair_tables.sql
echo "select count(1) from $DATABASE.$eachline union all " >> ${DATABASE}_count.sql

done
sed -i '/WARN:/d' ${DATABASE}_tables_ddl.sql
# Strip the backquotes that "show create table" puts around identifiers
sed -i "s/\`/ /g" ${DATABASE}_tables_ddl.sql
cat ${DATABASE}_tables.txt | while read eachtable
do
# Rename each table to <table>_bak and terminate the preceding statement
sed -i "s/CREATE EXTERNAL TABLE  $eachtable/;CREATE EXTERNAL TABLE $DATABASE.${eachtable}_bak/g" ${DATABASE}_tables_ddl.sql
done

echo ";" >> ${DATABASE}_tables_ddl.sql
# Close the trailing "union all" so the count script is valid SQL
echo "select 0;" >> ${DATABASE}_count.sql
done
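The braces in `${DATABASE}_tables.txt` are the easiest bug to miss in this script: bash allows underscores in variable names, so without braces it looks up a variable literally named `DATABASE_tables`, which is unset. A two-line demonstration:

```shell
#!/bin/bash
# Why the braces matter: without them, bash parses the longest valid
# variable name, i.e. DATABASE_tables (unset, expands to empty).
DATABASE=gmall
echo "$DATABASE_tables.txt"     # -> ".txt"  (empty variable + literal suffix)
echo "${DATABASE}_tables.txt"   # -> "gmall_tables.txt"
```

With the unbraced form, every database in the loop would read and write the same `.txt` and `.sql` files, silently mixing the output of all databases together.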
