Sqoop fails during export to Oracle

【Title】: Sqoop fails during export to Oracle 【Posted】: 2018-08-29 08:04:22 【Question】:

I have a table in Oracle and an identical table in Hive (with corresponding data types). I am trying to export the Hive table to Oracle with this script:

sqoop export -D oraoop.disabled=true -Dmapred.job.queue.name=disco --connect jdbc:oracle:thin:@oracle:1521/tns  \
--username someuser  \
--password somepasswd \
--hcatalog-database hive_database  \
--hcatalog-table TABLE_ON_HIVE  \
--table TABLE_ON_ORACLE  \
--num-mappers 5

I get an error:

18/08/29 08:23:10 INFO mapreduce.Job: Job job_1535519043541_1004 running in uber mode : false
18/08/29 08:23:10 INFO mapreduce.Job:  map 0% reduce 0%
18/08/29 08:23:28 INFO mapreduce.Job:  map 100% reduce 0%
18/08/29 08:28:40 INFO mapreduce.Job: Task Id : attempt_1535519043541_1004_m_000000_0, Status : FAILED
AttemptID:attempt_1535519043541_1004_m_000000_0 Timed out after 300 secs
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

18/08/29 08:28:41 INFO mapreduce.Job:  map 0% reduce 0%
18/08/29 08:28:57 INFO mapreduce.Job:  map 100% reduce 0%
18/08/29 08:34:09 INFO mapreduce.Job: Task Id : attempt_1535519043541_1004_m_000000_1, Status : FAILED
AttemptID:attempt_1535519043541_1004_m_000000_1 Timed out after 300 secs
18/08/29 08:34:10 INFO mapreduce.Job:  map 0% reduce 0%
18/08/29 08:34:28 INFO mapreduce.Job:  map 100% reduce 0%
18/08/29 08:39:39 INFO mapreduce.Job: Task Id : attempt_1535519043541_1004_m_000000_2, Status : FAILED
AttemptID:attempt_1535519043541_1004_m_000000_2 Timed out after 300 secs
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

18/08/29 08:39:40 INFO mapreduce.Job:  map 0% reduce 0%
18/08/29 08:39:56 INFO mapreduce.Job:  map 100% reduce 0%
18/08/29 08:45:11 INFO mapreduce.Job: Job job_1535519043541_1004 failed with state FAILED due to: Task failed task_1535519043541_1004_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/08/29 08:45:11 INFO mapreduce.Job: Counters: 9
        Job Counters
                Failed map tasks=4
                Launched map tasks=4
                Other local map tasks=3
                Rack-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=1312647
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=1312647
                Total vcore-seconds taken by all map tasks=1312647
                Total megabyte-seconds taken by all map tasks=5376602112
18/08/29 08:45:11 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/08/29 08:45:11 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 1,359.6067 seconds (0 bytes/sec)
18/08/29 08:45:11 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/08/29 08:45:11 INFO mapreduce.ExportJobBase: Exported 0 records.
18/08/29 08:45:11 ERROR tool.ExportTool: Error during export: Export job failed!

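So I made two attempts, and the export failed without any explanation of the cause. Have you run into a problem like mine? Where can I find more detailed logs?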

Pavel
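
On the question of where to find more detailed logs: the console output above only carries the job summary, while each failed map attempt writes its own logs on the worker node. The sketch below assumes the cluster runs YARN (the "Container killed by the ApplicationMaster" lines suggest it does) and that log aggregation is enabled. It derives the application ID from the job ID in the log and prints the `yarn logs` command instead of running it.

```shell
#!/usr/bin/env bash
# Job ID taken from the console output above.
job_id="job_1535519043541_1004"

# A YARN application ID is the MapReduce job ID with the "job_" prefix
# replaced by "application_".
app_id="application_${job_id#job_}"

# Command that fetches the aggregated container logs (assumes log
# aggregation is enabled on the cluster). Printed here, not executed.
log_cmd="yarn logs -applicationId ${app_id}"
echo "${log_cmd}"
```

Running the printed command on a cluster node and searching its output for a failed attempt ID, for example attempt_1535519043541_1004_m_000000_0, should show what the mapper was doing before the 300-second timeout.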

【Comments】:

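You can find the reason in the job tracker logs. Why is num-mappers=5 set when no split column is specified?

My mistake: the sqoop command was copied from a previous script and should have been edited. But I don't think it causes the error.

Also turn on verbose mode; that might help. Without a split column and with num-mappers > 1 it should not work, because it does not know how to split the work among the mappers.

【Answer 1】: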

Please try the following command:

sqoop export -Dmapred.job.queue.name=disco \
--connect \
--username sqoop \
--password sqoop \
--table emp \
--update-mode allowinsert \
--update-key id \
--export-dir table_location \
--input-fields-terminated-by 'delimiter'

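Note: --update-mode accepts two values. "updateonly" only updates records: a row is updated when its update key matches. If you want an upsert (UPDATE if the row exists, otherwise INSERT), use "allowinsert" mode.

Example:

--update-mode updateonly \ --> for updates
--update-mode allowinsert \ --> for upserts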
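
The choice between the two modes can be sketched as a small wrapper that picks the `--update-mode` value. The function name `build_export_args` and its yes/no argument are hypothetical and used only for illustration; `emp`, `id`, and `table_location` come from the command above, and the connection options are left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the update-related sqoop export arguments.
# $1 = "yes" to also insert rows whose key is not found (upsert),
#      anything else to only update rows whose key already exists.
build_export_args() {
  local upsert="$1"
  local mode="updateonly"
  if [ "${upsert}" = "yes" ]; then
    mode="allowinsert"
  fi
  echo "--table emp --update-key id --update-mode ${mode} --export-dir table_location"
}

build_export_args yes
build_export_args no
```

The first call prints the arguments for an upsert, the second for an update-only export; either line would be appended to a `sqoop export` command that already carries the connection options.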
