Submitting Spark jobs with spark-submit wrapped in shell scripts, creating an HBase table from a shell script, and batch-exporting Hive tables
Wrapping spark-submit job submission in a shell script
#!/bin/bash
# ===== local mode (runs entirely on the submitting node) =====
if [ $# -eq 1 ];then
  spark-submit --master local[4] \
    --class hx.com.Ods2DwdFilterSql \
    --files /home/etl_admin/spark/config.properties \
    sparkDwdFilter-1.0-SNAPSHOT.jar "$1"
else
  echo "Usage: $0 filename.sql(hql)"
fi
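For reference, this is how the wrapper might be called; the script file name run_local.sh and the SQL file name are placeholders, not names from the original post:

chmod +x run_local.sh
# runs hx.com.Ods2DwdFilterSql locally with 4 cores against the given SQL file
./run_local.sh ods_to_dwd_filter.sql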
#!/bin/bash
# ===== yarn-client mode: the driver runs locally, executors run on YARN =====
if [ $# -eq 1 ];then
  spark-submit \
    --master yarn \
    --deploy-mode client \
    --queue default \
    --driver-memory 2g \
    --num-executors 3 \
    --executor-memory 2g \
    --executor-cores 2 \
    --class hx.com.Ods2DwdFilterSql \
    --files /home/etl_admin/spark/config.properties \
    sparkDwdFilter-1.0-SNAPSHOT.jar /opt/etl/sqlFiles/"$1"
else
  echo "Usage: $0 filename.sql(hql)"
fi
#!/bin/bash
# ===== yarn-cluster mode: the driver runs inside YARN's ApplicationMaster =====
# Note: in cluster mode the argument /opt/etl/sqlFiles/$1 is opened by the driver
# on a YARN node, so the file must be readable there (e.g. on shared storage).
if [ $# -eq 1 ];then
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --queue default \
    --driver-memory 2g \
    --num-executors 3 \
    --executor-memory 2g \
    --executor-cores 2 \
    --class hx.com.Ods2DwdFilterSql \
    --files /home/etl_admin/spark/config.properties \
    sparkDwdFilter-1.0-SNAPSHOT.jar /opt/etl/sqlFiles/"$1"
else
  echo "Usage: $0 filename.sql(hql)"
fi
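Because the yarn-cluster variant runs the driver inside the ApplicationMaster, driver output does not appear in the submitting terminal. A minimal sketch for retrieving it afterwards, assuming YARN log aggregation is enabled; the application id below is a made-up example:

# find the application id of the finished job
yarn application -list -appStates FINISHED | grep Ods2DwdFilterSql
# pull the aggregated container logs (this id is hypothetical)
yarn logs -applicationId application_1660000000000_0042 | less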
Creating an HBase table from a shell script
- create_hbase_table.sh
#!/bin/bash
# 1. make sure a table name was passed in
[[ $# -lt 1 ]] && echo "Please supply an HBase table name!" && exit 1

# 2. quote the table name and column family for the HBase shell
#hbase_table="'test111'"
hbase_table="'$1'"
cf="'f'"
echo "target HBase table ==> $hbase_table $cf"

# 3. check whether the table already exists
#echo "exists $hbase_table" | hbase shell | grep 'does exist'
hbase shell <<EOF | grep 'does exist'
exists $hbase_table
EOF
status=$?
echo "exists ? ==> status=$status"
if [ $status -eq 0 ];then
  echo "table exists..."; exit 0
fi

# 4. create the table
echo "table does not exist! ===> start creating...."
#echo "create $hbase_table, $cf" | hbase shell
hbase shell <<EOF
create $hbase_table, $cf
EOF

# 5. check the result (note: older hbase shell versions exit 0 even when the
# DDL fails; newer ones offer `hbase shell -n` for meaningful exit codes)
status=$?
if [ $status -eq 0 ];then
  echo "create succeeded!"
else
  echo "error: create failed!"
fi
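A quick usage example, assuming the script is saved as create_hbase_table.sh; the table name test111 mirrors the commented-out default above:

chmod +x create_hbase_table.sh
./create_hbase_table.sh test111
# verify the result from the HBase shell
echo "describe 'test111'" | hbase shell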
Batch-exporting Hive tables
#!/bin/bash
# databases whose table DDL should be exported in batch
#DATABASES='ods edw dws'
DATABASES='gmall'

for DATABASE in $DATABASES
do
  # dump the table list (note the braces: $DATABASE_tables would be read as one variable name)
  hive -e "use $DATABASE; show tables;" > ${DATABASE}_tables.txt
  sed -i '/WARN:/d' ${DATABASE}_tables.txt
  #sleep 1
  echo "use $DATABASE;" >> ${DATABASE}_repair_tables.sql
  echo "set hive.msck.path.validation=ignore;" >> ${DATABASE}_repair_tables.sql
  echo "use $DATABASE;" >> ${DATABASE}_count.sql
  cat ${DATABASE}_tables.txt | while read eachline
  do
    hive -e "use $DATABASE; show create table $eachline;" >> ${DATABASE}_tables_ddl.sql
    echo "msck repair table $DATABASE.$eachline;" >> ${DATABASE}_repair_tables.sql
    echo "select count(1) from $DATABASE.$eachline union all " >> ${DATABASE}_count.sql
  done
  # turn the trailing "union all" on the last line into a terminating semicolon
  sed -i '$ s/union all *$/;/' ${DATABASE}_count.sql
  sed -i '/WARN:/d' ${DATABASE}_tables_ddl.sql
  # strip the backticks so the rename below can match the plain table name
  sed -i 's/`//g' ${DATABASE}_tables_ddl.sql
  cat ${DATABASE}_tables.txt | while read eachtable
  do
    sed -i "s/CREATE EXTERNAL TABLE $eachtable/;CREATE EXTERNAL TABLE $DATABASE.${eachtable}_bak/g" ${DATABASE}_tables_ddl.sql
  done
  echo ";" >> ${DATABASE}_tables_ddl.sql
done
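Once the script finishes, the generated files can be replayed with hive -f, for example when rebuilding the warehouse on another cluster. A sketch; the file names follow from DATABASES='gmall' above:

hive -f gmall_tables_ddl.sql      # recreate every table under the *_bak names
hive -f gmall_repair_tables.sql   # msck repair each table to re-register partitions
hive -f gmall_count.sql           # one unioned row count across all tables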