java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState' while reading a csv file using a Spark session

Posted: 2019-10-21 05:52:45

I am getting an 'org.apache.spark.sql.hive.HiveSessionState' error while trying to read a csv file using a Spark session. I have tried all the solutions posted for similar errors, but none of them resolved my problem. Below is the code I use to create the Spark session and read the csv file.

import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.slf4j.{Logger, LoggerFactory}

object LocalTesting {

  private val logger: Logger = LoggerFactory.getLogger(getClass)

  lazy protected val (ss: SparkSession, local: Boolean, partCount: Int) = {
    val conf = new SparkConf()
      .set("hive.exec.orc.split.strategy", "ETL")
      .set("hive.exec.dynamic.partition", "true")
      .set("hive.exec.dynamic.partition.mode", "nonstrict")

    val (local, partCount) =
      if (!conf.contains("spark.master")) {
        logger.info("Running in local mode")
        conf.setMaster("local[2]")
        (true, 2)
      } else {
        logger.info("Running in cluster mode")
        (false, 16)
      }

    val ss = SparkSession.builder().enableHiveSupport().config(conf).appName("local job").getOrCreate()
    //ss.sparkContext.setLogLevel("WARN")
    (ss, local, partCount)
  }

  ss.read.format("com.databricks.spark.csv")
    .option("header", true)
    .option("delimiter", ",")
    .load(getFilePath("local_input_data.csv"))
}

Below is the error I am getting.

Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:989)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:116)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:115)
at org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:549)
at org.apache.spark.sql.SparkSession.read(SparkSession.scala:613)
at com.apple.ist.gbi.Psd2Trait$class.input_data(Psd2Trait.scala:30)
at com.apple.ist.gbi.Psd2DetailApp.input_data$lzycompute(Psd2DetailApp.scala:14)
at com.apple.ist.gbi.Psd2DetailApp.input_data(Psd2DetailApp.scala:14)
at com.apple.ist.gbi.Psd2LocalTesting$.testLocal(Psd2LocalTesting.scala:42)
at com.apple.ist.gbi.Psd2LocalTesting$.main(Psd2LocalTesting.scala:47)
at com.apple.ist.gbi.Psd2LocalTesting.main(Psd2LocalTesting.scala)

 Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:986)
... 10 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
... 15 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
... 23 more

Below is my pom file:

<properties>
    <scala.compat.version>2.11</scala.compat.version>
    <scala.version>2.11.8</scala.version>
    <spark.version>2.1.2</spark.version>
</properties>

     <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
<version>${scala.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-reflect -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-reflect</artifactId>
<version>${scala.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang.modules/scala-xml -->
    <dependency>
        <groupId>org.scala-lang.modules</groupId>
        <artifactId>scala-xml_2.11</artifactId>
        <version>1.0.6</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
    <dependency>
        <groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.databricks/spark-csv -->
    <!--dependency>
        <groupId>com.databricks</groupId>
<artifactId>spark-csv_${scala.compat.version}</artifactId>
        <version>1.5.0</version>
    </dependency-->
    <!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-api -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.25</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-simple -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-simple</artifactId>
        <version>1.7.25</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-log4j12 -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.25</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/commons-configuration/commons-configuration -->
    <dependency>
        <groupId>commons-configuration</groupId>
        <artifactId>commons-configuration</artifactId>
        <version>1.6</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/commons-lang/commons-lang -->
    <dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.6</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>2.6.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-annotations</artifactId>
        <version>2.6.5</version>
    </dependency>

I tried adding config("spark.sql.warehouse.dir", "./spark-warehouse") while creating the Spark session, but the problem is still not resolved.

val ss = SparkSession.builder().enableHiveSupport().config(conf)
  .config("spark.sql.warehouse.dir", "./spark-warehouse")
  .appName("local payments job").getOrCreate()
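One workaround worth trying (a sketch under the assumption that no working Hive metastore is reachable from the local machine, since the inner cause is cut off in the trace above) is to call enableHiveSupport() only when the job is actually submitted to the cluster, so that a plain in-memory catalog is used locally and HiveSessionState is never instantiated. The object name SessionFactory below is hypothetical:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Sketch: enable Hive support only when spark.master was set externally
// (i.e. the job was launched through spark-submit on a cluster); in local
// debugging runs, build a plain session that skips HiveSessionState.
object SessionFactory {
  def build(conf: SparkConf): SparkSession = {
    val builder = SparkSession.builder().config(conf).appName("local payments job")
    if (conf.contains("spark.master"))
      builder.enableHiveSupport().getOrCreate() // cluster: Hive metastore assumed reachable
    else
      builder.master("local[2]").getOrCreate()  // local: in-memory catalog, no Hive needed
  }
}
```

With this split, the csv read path is unaffected, because reading csv does not require the Hive catalog.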

Any help is much appreciated.


Answer 1:

The same code above works on my cluster. It seems to be a cluster configuration issue.

Comments:

I am running this code locally in IntelliJ to debug some issues. Locally I run it with spark-submit.
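A frequent local-mode cause of this exact exception (an assumption here, since the innermost cause is truncated in the trace above) is that Hive's scratch directory /tmp/hive is missing or not writable by the current user. A quick check before re-running in IntelliJ:

```shell
# Hypothetical local check: HiveSessionState instantiation often fails when
# the Hive scratch directory cannot be written. Create it, open up its
# permissions, and confirm the result.
mkdir -p /tmp/hive
chmod 777 /tmp/hive
ls -ld /tmp/hive
```

If the directory was the problem, the session should now initialize; otherwise the inner "Caused by" of the full stack trace will name the real culprit.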
