Sqoop import error - File does not exist:

Posted: 2015-11-24 11:52:15

Question:

I am trying to import data from MySQL into HDFS using Sqoop, but I get the error below.

How can I fix it?

Command:

sqoop import --connect jdbc:mysql://localhost/testDB --username root --password password --table student --m 1

Error:

 ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://localhost:54310/usr/lib/sqoop/lib/parquet-format-2.0.0.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:196)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:169)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:266)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)

Hadoop version: 2.6.0

Sqoop version: 1.4.6

Answer 1:

You can fix this by copying all of the jars from the local /usr/lib/sqoop/lib directory into HDFS at the same path:

First, create the directory in HDFS:

hdfs dfs -mkdir -p /usr/lib/sqoop/lib/

Then put all of the library jars into HDFS:

hdfs dfs -put /usr/lib/sqoop/lib/* /usr/lib/sqoop/lib/

Then check that the jars exist in HDFS:

hdfs dfs -ls /usr/lib/sqoop/lib

Finally, run the Sqoop import again:

sqoop import --connect jdbc:mysql://localhost/testDB --username root --password password --table student --m 1
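Why staging the jars at that exact path helps (an inference from the URI in the stack trace, not something the error states directly): when the job is submitted, the local Sqoop library path appears to be resolved against fs.defaultFS, so the client probes the same path inside HDFS. A minimal shell sketch of how the failing URI is composed; the variable names are hypothetical, and the values are taken from the error message above:

```shell
# fs.defaultFS from core-site.xml (value visible in the error message)
FS_DEFAULT="hdfs://localhost:54310"
# Local Sqoop library directory on the client machine
SQOOP_LIB="/usr/lib/sqoop/lib"
# The job submitter ends up probing this HDFS location for each jar,
# which is why mirroring the jars at the same path in HDFS fixes the error:
echo "${FS_DEFAULT}${SQOOP_LIB}/parquet-format-2.0.0.jar"
```

This prints the exact URI from the FileNotFoundException, which is how you can confirm the local lib path and the missing HDFS path line up on your own cluster.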
