Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set in JobConf
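I am new to Hadoop. My program is meant to skip bad records in MapReduce, but the skipping did not work. So for now I am not trying to skip any data; I first want to find the error that occurs. I added myCustomRunJob() to learn why the bad records cannot be skipped, and I have removed the skipping code for the moment. When I run the program I get an error, even though I have set the output path: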
import java.io.IOException;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.*;

public class SkipData
{
    public static class MapClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, LongWritable>
    {
        private final static LongWritable one = new LongWritable(1);
        private Text word = new Text("totalcount");

        public void map(LongWritable key, Text value, OutputCollector<Text, LongWritable> output, Reporter reporter) throws IOException
        {
            String line = value.toString();
            if (line.equals("skiptext"))
                throw new RuntimeException("Found skiptext");
            output.collect(word, one);
        }
    }

    public static RunningJob myCustomRunJob(JobConf job) throws Exception {
        JobClient jc = new JobClient(job);
        RunningJob rj = jc.submitJob(job);
        if (!jc.monitorAndPrintJob(job, rj)) {
            throw new IOException("Job failed with info: " + rj.getFailureInfo());
        }
        return rj;
    }

    public static void main(String[] args) throws Exception
    {
        System.setProperty("hadoop.home.dir", "/");
        Configuration config = new Configuration();
        JobConf conf = new JobConf(config, SkipData.class);
        RunningJob result = myCustomRunJob(conf);
        conf.setJobName("SkipData");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(MapClass.class);
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);
        FileInputFormat.setInputPaths(conf, args[0]);
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
I have tried many times to get past this error. I am using the old API. How can I solve this?
18/02/28 11:05:28 DEBUG security.UserGroupInformation: PrivilegedActionException as:saung (auth:SIMPLE) cause:org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set in JobConf.
18/02/28 11:05:28 DEBUG security.UserGroupInformation: PrivilegedActionException as:saung (auth:SIMPLE) cause:org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set in JobConf.
Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set in JobConf.
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:117)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:268)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at mapredpack.SkipData.myCustomRunJob(SkipData.java:90)
at mapredpack.SkipData.main(SkipData.java:140)
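You are trying to run the job twice. By calling
RunningJob result=myCustomRunJob(conf);
so early, your job will fail, because no configuration has been set at that stage. I would remove that line (and the myCustomRunJob(JobConf job) method). The JobClient.runJob(conf) call at the bottom will take care of running the job.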
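There are two problems in the code.
- The first time you submit the job, no input or output paths have been set yet.
- In addition, you then try to submit the job a second time, which is bound to fail (because every MR job needs a new output directory).
Change your main method like this: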
public static void main(String[] args) throws Exception
{
    System.setProperty("hadoop.home.dir", "/");
    Configuration config = new Configuration();
    JobConf conf = new JobConf(config, SkipData.class);
    conf.setJobName("SkipData");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(LongWritable.class);
    conf.setMapperClass(MapClass.class);
    conf.setCombinerClass(LongSumReducer.class);
    conf.setReducerClass(LongSumReducer.class);
    FileInputFormat.setInputPaths(conf, args[0]);
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    RunningJob result = myCustomRunJob(conf);
}
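Both failure modes come down to the configuration being validated at the moment of submission. The following is only a toy model in plain Java, with no Hadoop dependency, meant to make the ordering visible. The class SubmitOrderSketch, the key "output.dir", and the in-memory set of existing directories are all hypothetical stand-ins and not Hadoop APIs; Hadoop's real checks and exception types differ.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of submit-time validation; none of these names are Hadoop APIs. */
public class SubmitOrderSketch {
    // Hypothetical property key standing in for the job's output directory.
    static final String OUTPUT_DIR = "output.dir";

    // Stand-in for the file system: output directories that already exist.
    static final Set<String> existingDirs = new HashSet<>();

    /** Validates the configuration exactly as it is when submit() is called. */
    static void submit(Map<String, String> conf) {
        String out = conf.get(OUTPUT_DIR);
        if (out == null) {
            throw new IllegalStateException("Output directory not set");
        }
        // Set.add returns false when the directory was already recorded.
        if (!existingDirs.add(out)) {
            throw new IllegalStateException("Output directory " + out + " already exists");
        }
        // A real job would run here and write its results under 'out'.
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        try {
            submit(conf); // too early: nothing has been configured yet
        } catch (IllegalStateException e) {
            System.out.println("early submit failed: " + e.getMessage());
        }

        conf.put(OUTPUT_DIR, "/tmp/out");
        submit(conf); // configured first, then submitted once: accepted
        System.out.println("configured submit accepted");

        try {
            submit(conf); // same output directory again: rejected
        } catch (IllegalStateException e) {
            System.out.println("resubmit failed: " + e.getMessage());
        }
    }
}
```

In this model, as in the corrected main method, the fix is purely one of ordering: set every property first, then submit exactly once.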