Why is my Sqoop task in Azkaban stuck after columns are selected?
Posted: 2017-08-07 03:39:53

I run shell commands in Azkaban, with the Sqoop command placed inside a shell script.
Today one of the Sqoop tasks got stuck for no apparent reason; call it sqoop_task1. The same thing happened to another Sqoop task a few days ago; call that one sqoop_task2. Both sqoop_task1 and sqoop_task2 are import jobs from MySQL to Hive, and their source db.table and target db.table are completely different, yet the symptom is identical. Here is the log:
07-08-2017 02:43:21 CST import_user_plan_record INFO - Starting job import_user_plan_record at 1502045001852
07-08-2017 02:43:21 CST import_user_plan_record INFO - azkaban.webserver.url property was not set
07-08-2017 02:43:21 CST import_user_plan_record INFO - job JVM args: -Dazkaban.flowid=m2h_done_xxx_20170807020506 -Dazkaban.execid=26987 -Dazkaban.jobid=import_user_plan_record
07-08-2017 02:43:21 CST import_user_plan_record INFO - Building command job executor.
07-08-2017 02:43:21 CST import_user_plan_record INFO - 1 commands to execute.
07-08-2017 02:43:21 CST import_user_plan_record INFO - effective user is: azkaban
07-08-2017 02:43:21 CST import_user_plan_record INFO - Command: sh /var/azkaban-metamap/m2h-20170807020501-import_user_plan_record.m2h
07-08-2017 02:43:21 CST import_user_plan_record INFO - Environment variables: JOB_OUTPUT_PROP_FILE=/server/azkaban2.6.4/exec/executions/26987/tmp/m2h-20170807020501/import_user_plan_record_output_11695048929175505_tmp, JOB_PROP_FILE=/server/azkaban2.6.4/exec/executions/26987/tmp/m2h-20170807020501/import_user_plan_record_props_1410808489340719464_tmp, KRB5CCNAME=/tmp/krb5cc__xxx_m2h_day_m2h-20170807020501__m2h_done_xxx_20170807020506__import_user_plan_record__26987__azkaban, JOB_NAME=import_user_plan_record
07-08-2017 02:43:21 CST import_user_plan_record INFO - Working directory: /server/azkaban2.6.4/exec/executions/26987/tmp/m2h-20170807020501
07-08-2017 02:43:22 CST import_user_plan_record INFO - Warning: /usr/hdp/2.4.2.0-258/accumulo does not exist! Accumulo imports will fail.
07-08-2017 02:43:22 CST import_user_plan_record INFO - Please set $ACCUMULO_HOME to the root of your Accumulo installation.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 DEBUG tool.BaseSqoopTool: Enabled debug logging.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
07-08-2017 02:43:28 CST import_user_plan_record INFO - 17/08/07 02:43:28 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 INFO manager.SqlManager: Using default fetchSize of 1000
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 INFO tool.CodeGenTool: Beginning code generation
07-08-2017 02:43:29 CST import_user_plan_record INFO - 17/08/07 02:43:29 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM user_plan_record AS t WHERE 1=0
07-08-2017 02:43:30 CST import_user_plan_record INFO - 17/08/07 02:43:30 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM user_plan_record AS t WHERE 1=0
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column id of type [-5, 20, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column user_id of type [12, 50, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column plan_id of type [-5, 20, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column largeclass_id of type [-5, 20, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column subclass_name of type [12, 200, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column dream_amount of type [3, 14, 2]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column create_time of type [93, 19, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column update_time of type [93, 19, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column yn of type [4, 1, 0]
07-08-2017 02:43:31 CST import_user_plan_record INFO - 17/08/07 02:43:31 DEBUG manager.SqlManager: Found column version of type [4, 11, 0]
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM user_plan_record AS t WHERE 1=0
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column user_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column plan_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column largeclass_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column subclass_name
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column dream_amount
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column create_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column update_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column yn
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG manager.SqlManager: Found column version
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: selected columns:
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: user_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: plan_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: largeclass_id
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: subclass_name
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: dream_amount
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: create_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: update_time
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: yn
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: version
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: Writing source file: /server/app/sqoop/vo/user_plan_record.java
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: Table name: user_plan_record
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: Columns: id:-5, user_id:12, plan_id:-5, largeclass_id:-5, subclass_name:12, dream_amount:3, create_time:93, update_time:93, yn:4, version:4,
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.ClassWriter: sourceFilename is user_plan_record.java
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Found existing /server/app/sqoop/vo/
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Returning jar file path /usr/hdp/2.4.2.0-258/hadoop-mapreduce/hadoop-mapreduce-client-core.jar:/usr/hdp/2.4.2.0-258/hadoop-mapreduce/hadoop-mapreduce-client-core-2.7.1.2.4.2.0-258.jar
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Current sqoop classpath = 。。。。。。。。。。。
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Adding source file: /server/app/sqoop/vo/user_plan_record.java
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: Invoking javac with args:
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: -sourcepath
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: /server/app/sqoop/vo/
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: -d
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: /server/app/sqoop/vo/
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: -classpath
07-08-2017 02:43:32 CST import_user_plan_record INFO - 17/08/07 02:43:32 DEBUG orm.CompilationManager: 。。。。。。。。。。。。。。。。。。。
07-08-2017 08:43:01 CST import_user_plan_record ERROR - Kill has been called.
07-08-2017 08:43:01 CST import_user_plan_record INFO - Process completed unsuccessfully in 21579 seconds.
07-08-2017 08:43:01 CST import_user_plan_record ERROR - Job run failed!
java.lang.RuntimeException: azkaban.jobExecutor.utils.process.ProcessFailureException
It got stuck during or right after its classpath was printed. Has anyone run into this problem before?
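One mitigation (not from the original script; the function and deadline below are illustrative): the log shows the job sat for roughly six hours before Azkaban's kill fired, so the Sqoop invocation inside the shell script can be wrapped in coreutils `timeout` to make a hang fail fast instead. A minimal sketch, demonstrated with a stand-in command in place of the real `sqoop import`:

```shell
#!/bin/sh
# Sketch: run a long-running command under a deadline so a hang
# fails with a clear exit code instead of blocking for hours.
run_with_deadline() {
  # $1 = deadline accepted by coreutils timeout (e.g. 30m), rest = command
  deadline="$1"; shift
  timeout "$deadline" "$@"
  status=$?
  # coreutils timeout exits with 124 when the deadline was hit
  if [ "$status" -eq 124 ]; then
    echo "command timed out after $deadline: $*" >&2
  fi
  return "$status"
}

# In the real job this would wrap the Sqoop import, e.g.
#   run_with_deadline 30m sqoop import --connect "$JDBC_URL" ...
# Demonstrated here with a stand-in that finishes in time:
run_with_deadline 5s sleep 1 && echo "finished within deadline"
```

With this in place, a stuck import surfaces as exit code 124 in the Azkaban log after the deadline rather than an unexplained multi-hour stall.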
Answer 1: I run into this from time to time, where the Azkaban log gives no reason for the failure. What I do is check the task's YARN logs, where I can usually find the cause of the failure.
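For reference, the YARN logs can be pulled with the YARN CLI (the application id below is a placeholder; look it up first, and note that in this particular hang the job stalled before a MapReduce application may have been submitted, so there may be nothing to find):

```shell
# List applications (including finished/killed ones) to find the id
yarn application -list -appStates ALL

# Fetch the aggregated container logs for that application
yarn logs -applicationId application_1501234567890_0042
```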