Pig 0.7.0 ERROR 2118: Unable to create input splits on Hadoop 1.2.1
Posted: 2014-02-07 16:08:02
Question:
I have an output file from a MapReduce program (stored on HDFS). Now I am trying to load that file with Pig 0.7.0.
I get the error shown below. I have already tried copying the file to my local machine and running Pig in local mode, and that works fine. But I would like to skip that step and get it working in mapreduce mode.
Options I have tried:
LOAD 'file://log/part-00000',
LOAD '/log/part-00000',
LOAD 'hdfs:/log/part-00000',
LOAD 'hdfs://localhost:50070/log/part-00000',
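The four variants above differ in which filesystem and host they name. The following is a minimal sketch that splits each one with Python's generic URI parser; it is only an illustration, and the `describe` helper is hypothetical. Hadoop's own path handling is not identical to generic URI rules (the trace in this post reports the first variant as `file:///log/part-00000`), so treat the output as a reading aid, not as what Pig does internally.

```python
from urllib.parse import urlparse

# The four LOAD paths tried in the question.
CANDIDATES = [
    "file://log/part-00000",
    "/log/part-00000",
    "hdfs:/log/part-00000",
    "hdfs://localhost:50070/log/part-00000",
]


def describe(path: str) -> dict:
    """Split a LOAD path into scheme, authority and path using generic URI rules.

    Hypothetical helper for illustration; not part of Pig or Hadoop.
    """
    parts = urlparse(path)
    return {
        # No scheme means the client falls back to its configured default filesystem.
        "scheme": parts.scheme or "(default filesystem)",
        # The authority is the host[:port] part that follows "//".
        "authority": parts.netloc or "(none)",
        "path": parts.path,
    }


for candidate in CANDIDATES:
    print(candidate, "->", describe(candidate))
```

Under generic URI rules, `file://log/...` treats `log` as a host name, and any `file:` path points at the local filesystem, not HDFS. In the last variant, 50070 is the default NameNode web UI port in Hadoop 1.x; the filesystem port is whatever `fs.default.name` specifies, so that URI probably needs a different port on this cluster (an assumption, since the post does not show the configuration).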
hadoop dfs -ls /log/
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r-- 3 supergroup 0 2014-02-07 07:56 /log/_SUCCESS
drwxr-xr-x - supergroup 0 2014-02-07 07:55 /log/_logs
-rw-r--r-- 3 supergroup 10021 2014-02-07 07:56 /log/part-00000
pig (running in mapreduce mode)
grunt> REC = LOAD 'file://log/part-00000' as (CREATE_TMSTP:chararray, MESSAGE_TYPE:chararray, MESSAGE_FROM:chararray, MESSAGE_TEXT:chararray);
grunt> DUMP REC;
Backend error message during job submission
-------------------------------------------
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///log/part-00000
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:695)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/log/part-00000
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:258)
... 7 more
Pig Stack Trace
ERROR 2997: Unable to recreate exception from backend error:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///log/part-00000
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias REC
at org.apache.pig.PigServer.openIterator(PigServer.java:521)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:544)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///log/part-00000
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:169)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:268)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:308)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:835)
at org.apache.pig.PigServer.store(PigServer.java:569)
at org.apache.pig.PigServer.openIterator(PigServer.java:504)
... 6 more
Comments:
How about REC = LOAD '/log' as (CREATE_TMSTP:chararray, MESSAGE_TYPE:chararray, MESSAGE_FROM:chararray, MESSAGE_TEXT:chararray)?
Thanks for the quick reply. Same error... I will try upgrading to Pig 0.12.0 and report back with my findings. Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/log
For anyone who found this post while searching for ERROR 1066: Unable to open iterator for alias, here is a generic solution.
Answer 1:
You should try upgrading to a more recent version of Pig. 0.7.0 is several years old. 0.12.0 is the current stable release.
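Comments:
Thanks for the quick reply... I will try upgrading to Pig 0.12.0 and report back with my findings.
Thank you, your suggestion worked. Pig 0.7 only works with Hadoop 0.20. I tried Pig 0.12.0 and it worked! :) --> REC = LOAD 'hdfs:/log/part-00000' ... PigWithHadoop
I'm glad the newer version worked for you. Most of the Hadoop ecosystem is version-specific. If you are lucky enough to be able to run one of the distributions (Apache, Cloudera, HortonWorks), you can save yourself the trouble of making sure all the tools are version-compatible.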