Error when running a Spark project in IDEA: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0

Posted by weixin_43886470


Running a Spark WordCount project in IDEA fails with this error: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0
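For context, the job was an ordinary WordCount along these lines (a minimal sketch; the object name and input path are assumptions for illustration, not code from the original project):

    import org.apache.spark.sql.SparkSession

    object WordCount {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("WordCount")
          .master("local[2]")
          .getOrCreate()

        spark.sparkContext
          .textFile("data/input.txt")        // assumed input path
          .flatMap(_.split("\\s+"))          // split lines into words
          .map(word => (word, 1))
          .reduceByKey(_ + _)                // sum counts per word
          .collect()
          .foreach(println)

        spark.stop()
      }
    }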

The fix is as follows:

1. Make sure Hadoop is installed. I use Spark 3.3.2 and Hadoop 3.3.2; set the environment variable HADOOP_HOME=D:\hadoop-3.3.2, and make sure %HADOOP_HOME%\bin is added to the Path variable as well.

2. Install the winutils package for Hadoop on Windows. Ideally download the winutils build that matches your Hadoop version (I used winutils 3.3.1 here, which also works).

3. Copy everything from the winutils bin directory into Hadoop's bin folder, and copy hadoop.dll to C:\Windows\System32.

4. If the error still occurs, set the VM options of the run configuration in IDEA by adding -Djava.library.path="D:\hadoop-3.3.2". (Verified: none of the other tutorials fixed this error for me; it only ran successfully after this step.) A programmatic alternative is sketched after this list.
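If you would rather not edit the run configuration, hadoop.home.dir can also be set in code before the first Spark/Hadoop call, since Hadoop reads it lazily when resolving winutils.exe. The path below matches the setup above but is machine-specific, and the object name is only an illustration. Note that java.library.path itself is read once at JVM startup, so that setting has to stay a VM option (or keep hadoop.dll in System32).

    object NativeIoWorkaround {
      def main(args: Array[String]): Unit = {
        // Must run before any Spark/Hadoop class looks for winutils.exe.
        System.setProperty("hadoop.home.dir", "D:\\hadoop-3.3.2")

        // ... then build the SparkSession and run the job as usual
      }
    }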

 

Fixing the errors when running the official Spark GraphX example ComprehensiveExample.scala

Running ComprehensiveExample.scala directly in IDEA fails with an exception saying that a master URL must be specified.
Modify the source to set the master to local mode:

    val spark = SparkSession
      .builder
      .appName(s"${this.getClass.getSimpleName}")
      .master("local[2]")   // run locally with 2 worker threads
      .getOrCreate()

Run it again and the following error is thrown:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:312)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1158)
    at org.apache.spark.graphx.GraphLoader$.edgeListFile(GraphLoader.scala:94)
    at graphx.ComprehensiveExample$.main(ComprehensiveExample.scala:53)
    at graphx.ComprehensiveExample.main(ComprehensiveExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

com.google.common.base.Stopwatch lives in the guava jar. According to what I found, this is a guava version conflict: my test project was pulling in guava 18.0, while the Spark source uses 14.0.1 (Spark 2.2.1 depends on Hadoop 2.6.5). In newer guava versions (17+) the no-argument Stopwatch constructor is no longer public, so Hadoop 2.6.x's FileInputFormat, which still calls it, fails with the IllegalAccessError above. Pin guava to 14.0.1 in the project's pom.xml:

        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>14.0.1</version>
        </dependency>

After changing the guava dependency and rerunning the Spark job, the problem was resolved.
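To double-check which guava jar actually ends up on the runtime classpath, a small probe like the following (not from the original post) prints where the Stopwatch class that triggered the IllegalAccessError was loaded from:

    import com.google.common.base.Stopwatch

    object GuavaCheck {
      def main(args: Array[String]): Unit = {
        val src = classOf[Stopwatch].getProtectionDomain.getCodeSource
        // Expect a path ending in guava-14.0.1.jar once the override takes effect.
        println(if (src != null) src.getLocation else "loaded from the bootstrap classpath")
      }
    }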

That covers the IDEA/Spark error org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0. If it did not solve your problem, the following related articles may help:

Configuring IDEA not to collapse imports into import org.apache.* (wildcard imports keep causing conflicts when committing code)

Replacing the classpath of the default imported Spark library

Spark local project error: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.