Why is my Sqoop task in Azkaban stuck after the columns are selected?

Posted: 2017-08-07 03:39:53

I run the job as a shell command in Azkaban, with the Sqoop command placed inside a shell script.
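For context, the script essentially reduces to a single Sqoop import invocation. The sketch below is a reconstruction from the log further down (the --verbose, --driver, command-line password, and Hive-import flags match the warnings printed there); the connection URL, credentials, and Hive table name are placeholders, not the real values:

    # Hypothetical reconstruction of the import script; host, credentials and
    # Hive table are placeholders inferred from the log below.
    sqoop import \
      --verbose \
      --connect jdbc:mysql://<mysql-host>:3306/<source_db> \
      --driver com.mysql.jdbc.Driver \
      --username <user> \
      --password <password> \
      --table user_plan_record \
      --hive-import \
      --hive-table <hive_db>.user_plan_record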

Today one of the Sqoop tasks got stuck for no apparent reason; call it sqoop_task1.

The same thing happened a few days ago to another Sqoop task, which we'll call sqoop_task2.

Both sqoop_task1 and sqoop_task2 are import jobs from MySQL into Hive. Their source db.table and target db.table are completely different, but the symptom is the same. Here is the log:

07-08-2017 02:43:21 CST import_user_plan_record INFO - Starting job import_user_plan_record at 1502045001852
07-08-2017 02:43:21 CST import_user_plan_record INFO - azkaban.webserver.url property was not set
07-08-2017 02:43:21 CST import_user_plan_record INFO - job JVM args: -Dazkaban.flowid=m2h_done_xxx_20170807020506 -Dazkaban.execid=26987 -Dazkaban.jobid=import_user_plan_record
07-08-2017 02:43:21 CST import_user_plan_record INFO - Building command job executor. 
07-08-2017 02:43:21 CST import_user_plan_record INFO - 1 commands to execute.
07-08-2017 02:43:21 CST import_user_plan_record INFO - effective user is: azkaban
07-08-2017 02:43:21 CST import_user_plan_record INFO - Command: sh /var/azkaban-metamap/m2h-20170807020501-import_user_plan_record.m2h
07-08-2017 02:43:21 CST import_user_plan_record INFO - Environment variables: JOB_OUTPUT_PROP_FILE=/server/azkaban2.6.4/exec/executions/26987/tmp/m2h-20170807020501/import_user_plan_record_output_11695048929175505_tmp, JOB_PROP_FILE=/server/azkaban2.6.4/exec/executions/26987/tmp/m2h-20170807020501/import_user_plan_record_props_1410808489340719464_tmp, KRB5CCNAME=/tmp/krb5cc__xxx_m2h_day_m2h-20170807020501__m2h_done_xxx_20170807020506__import_user_plan_record__26987__azkaban, JOB_NAME=import_user_plan_record
07-08-2017 02:43:21 CST import_user_plan_record INFO - Working directory: /server/azkaban2.6.4/exec/executions/26987/tmp/m2h-20170807020501
07-08-2017 02:43:22 CST import_user_plan_record INFO - Warning: /usr/hdp/2.4.2.0-258/accumulo does not exist! Accumulo imports will fail.
07-08-2017 02:43:22 CST import_user_plan_record INFO - Please set $ACCUMULO_HOME to the root of your Accumulo installation.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 DEBUG tool.BaseSqoopTool: Enabled debug logging.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 INFO manager.SqlManager: Using default fetchSize of 1000
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 INFO tool.CodeGenTool: Beginning code generation
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM user_plan_record AS t WHERE 1=0
07-08-2017 02:43:30 CST import_user_plan_record INFO - 17/08/07 02:43:30 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM user_plan_record AS t WHERE 1=0
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column id of type [-5, 20, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column user_id of type [12, 50, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column plan_id of type [-5, 20, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column largeclass_id of type [-5, 20, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column subclass_name of type [12, 200, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column dream_amount of type [3, 14, 2]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column create_time of type [93, 19, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column update_time of type [93, 19, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column yn of type [4, 1, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column version of type [4, 11, 0]
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM user_plan_record AS t WHERE 1=0
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column user_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column plan_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column largeclass_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column subclass_name
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column dream_amount
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column create_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column update_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column yn
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column version
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: selected columns:
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   user_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   plan_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   largeclass_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   subclass_name
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   dream_amount
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   create_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   update_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   yn
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter:   version
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: Writing source file: /server/app/sqoop/vo/user_plan_record.java
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: Table name: user_plan_record
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: Columns: id:-5, user_id:12, plan_id:-5, largeclass_id:-5, subclass_name:12, dream_amount:3, create_time:93, update_time:93, yn:4, version:4, 
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: sourceFilename is user_plan_record.java
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Found existing /server/app/sqoop/vo/
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Returning jar file path /usr/hdp/2.4.2.0-258/hadoop-mapreduce/hadoop-mapreduce-client-core.jar:/usr/hdp/2.4.2.0-258/hadoop-mapreduce/hadoop-mapreduce-client-core-2.7.1.2.4.2.0-258.jar
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Current sqoop classpath = 。。。。。。。。。。。
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Adding source file: /server/app/sqoop/vo/user_plan_record.java
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Invoking javac with args:
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager:   -sourcepath
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager:   /server/app/sqoop/vo/
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager:   -d
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager:   /server/app/sqoop/vo/
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager:   -classpath
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager:   。。。。。。。。。。。。。。。。。。。
07-08-2017 08:43:01 CST import_user_plan_record ERROR - Kill has been called.
07-08-2017 08:43:01 CST import_user_plan_record INFO - Process completed unsuccessfully in 21579 seconds.
07-08-2017 08:43:01 CST import_user_plan_record ERROR - Job run failed!
java.lang.RuntimeException: azkaban.jobExecutor.utils.process.ProcessFailureException

It gets stuck while or right after the classpath is printed: nothing more is logged between 02:43:32 and the kill at 08:43:01.

Has anyone run into this problem before?


Answer 1:

I have run into this situation sometimes, where the Azkaban log gives no reason at all for the failure. What I do is check the YARN logs for the task; there I can usually find the cause of the failure.
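For example, assuming the Sqoop job got as far as submitting a MapReduce application, something along these lines pulls the aggregated container logs (the application ID is a placeholder; take the real one from the ResourceManager UI or from yarn application -list):

    # List applications and filter by table name (Sqoop usually names the MR job after the table)
    yarn application -list -appStates ALL | grep user_plan_record

    # Fetch the aggregated logs for the matching application (placeholder ID)
    yarn logs -applicationId application_1500000000000_0001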

