SQOOP Export failure

[Title]: SQOOP Export failure [Posted]: 2017-03-29 08:24:00 [Question]:

I am trying to export a table from HDFS to MySQL using Sqoop, but I get a Java exception. The command I am using is:

sqoop export --connect jdbc:mysql://172.31.54.174/Database --driver com.mysql.jdbc.Driver --username user --password userpassword --table accounts --export-dir /user/pri/accounts

Running this command produces the following error:

17/03/29 07:54:26 INFO mapreduce.Job:  map 0% reduce 0%
17/03/29 07:54:30 INFO mapreduce.Job: Task Id : attempt_1489328678238_4886_m_000002_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check failed map task logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: Can't parse input data: '\N'
        at accounts.__loadFromFields(accounts.java:691)
        at accounts.parse(accounts.java:584)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more
Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
        at java.sql.Timestamp.valueOf(Timestamp.java:204)
        at accounts.__loadFromFields(accounts.java:643)
        ... 12 more

The file I am trying to export contains the following data:

1,2008-10-23 16:05:05.0,\N,Donald,Becton,2275 Washburn Street,Oakland,CA,94660,5100032418,2014-03-18 13:29:47.0,2014-03-18 13:29:47.0

2,2008-11-12 03:00:01.0,\N,Donna,Jones,3885 Elliott Street,San Francisco,CA,94171,4150835799,2014-03-18 13:29:47.0,2014-03-18 13:29:47.0 

I have also created the accounts table, with the following structure:

+----------------+-------------+------+-----+---------+-------+
| Field          | Type        | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| acct_num       | varchar(20) | NO   | PRI |         |       |
| acct_create_dt | datetime    | NO   |     | NULL    |       |
| acc_close_dt   | datetime    | YES  |     | NULL    |       |
| first_name     | varchar(20) | NO   |     | NULL    |       |
| last_name      | varchar(20) | NO   |     | NULL    |       |
| address        | varchar(30) | NO   |     | NULL    |       |
| city           | varchar(20) | NO   |     | NULL    |       |
| state          | varchar(20) | NO   |     | NULL    |       |
| zipcode        | varchar(20) | NO   |     | NULL    |       |
| phone_number   | varchar(20) | YES  |     | NULL    |       |
| created        | datetime    | NO   |     | NULL    |       |
| modified       | datetime    | NO   |     | NULL    |       |
+----------------+-------------+------+-----+---------+-------+

I have also attached a screenshot of the error.

[Answer 1]:

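As you can see from the logs, '\N' is treated like an escape character, so it does not fit a varchar column. I don't understand why you are adding that character. The log also points to a timestamp format problem. In addition, check the existing data for duplicates in any column you use as the primary key.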
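
One way to check which column the '\N' actually lands in is to line the sample row up against the column list from the question. The following is only a minimal sketch: it assumes the fields are comma-separated with no embedded commas (true for the two sample rows), and the variable names are made up for illustration.

```shell
# Column names copied from the table structure in the question, in table order.
cols='acct_num,acct_create_dt,acc_close_dt,first_name,last_name,address,city,state,zipcode,phone_number,created,modified'

# First sample row from the question; single quotes keep the backslash literal.
row='1,2008-10-23 16:05:05.0,\N,Donald,Becton,2275 Washburn Street,Oakland,CA,94660,5100032418,2014-03-18 13:29:47.0,2014-03-18 13:29:47.0'

# Print the name of every column whose field is exactly the two-character token \N.
null_cols=$(printf '%s\n%s\n' "$cols" "$row" | awk -F, '
  NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
  { for (i = 1; i <= NF; i++) if ($i == "\\N") print name[i] }
')
printf '%s\n' "$null_cols"
```

For the sample row this reports acc_close_dt, a nullable datetime column, which is consistent with the java.sql.Timestamp.valueOf frame in the stack trace.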

[Answer 2]:

Add --input-null-string '\\N' --input-null-non-string '\\N' to your sqoop export command.
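
Put together with the command from the question, that would look roughly like the sketch below. The wrapper function and its name export_accounts are hypothetical and only there so that extra flags can be appended; the connection details, table and export directory are the ones from the question, and nothing here has been run against the asker's cluster.

```shell
# Hypothetical wrapper (the name export_accounts is not from the thread).
# Connection details and paths are the ones given in the question.
export_accounts() {
  sqoop export \
    --connect jdbc:mysql://172.31.54.174/Database \
    --driver com.mysql.jdbc.Driver \
    --username user --password userpassword \
    --table accounts \
    --export-dir /user/pri/accounts \
    --input-null-string '\\N' \
    --input-null-non-string '\\N' \
    "$@"   # any extra sqoop flags passed to the wrapper
}
```

Calling export_accounts starts the export. The single quotes matter: they stop the shell from collapsing the doubled backslash, so Sqoop receives '\\N' as written and, as far as I understand the Sqoop documentation, treats fields containing \N as SQL NULL instead of trying to parse them as timestamps.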
