Spark Dataset and scala.ScalaReflectionException: type V is not a class


I have the following case classes:

case class S1(value: String, ws: Map[Int, String])
case class S2(value: String, ws: Map[Int, String], dep: BS)

As shown above, the two differ by a single field, dep of type BS.

The following code works fine:

sparkSQL.createDataset(Seq(S1("heloo", Map(0 -> "0")))).foreach(x => println(x))

The following also works fine, using the BS class on its own:

sparkSQL.createDataset(Seq(BS(List(0), List(Edge(0, 1, DepRelation("0-->1", "", "")))))).foreach(x => println(x))

Now, if I use S2, which is basically S1 plus a BS field, I get a runtime error:

sparkSQL.createDataset(Seq(S2("heloo", Map(0 -> "0"), BS(List(0), List(Edge(0, 1, DepRelation("0-->1", "", ""))))))).foreach(x => println(x))

Exception in thread "main" scala.ScalaReflectionException: type V is not a class
    at scala.reflect.api.Symbols$SymbolApi$class.asClass(Symbols.scala:275)
    at scala.reflect.internal.Symbols$SymbolContextApiImpl.asClass(Symbols.scala:84)
    at org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:689)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:66)
    at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:809)
    at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
    at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:65)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.toCatalystArray$1(ScalaReflection.scala:458)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:503)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:455)
    at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:809)
    at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
    at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:455)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$10.apply(ScalaReflection.scala:626)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$10.apply(ScalaReflection.scala:614)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.immutable.List.flatMap(List.scala:355)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:614)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:455)
    at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:809)
    at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
    at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:455)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$10.apply(ScalaReflection.scala:626)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$10.apply(ScalaReflection.scala:614)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.immutable.List.flatMap(List.scala:355)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:614)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:455)
    at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:809)
    at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
    at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:455)
    at org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:444)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:71)
    at org.apache.spark.sql.Encoders$.product(Encoders.scala:275)
    at org.apache.spark.sql.LowPrioritySQLImplicits$class.newProductEncoder(SQLImplicits.scala:233)
    at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:33)

Here is my environment:

val conf = new SparkConf()
    .setAppName("test")
    .setMaster("local[*]")

val spark = new SparkContext(conf)
val sparkSQL = new SQLContext(spark)

import sparkSQL.implicits._
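
As an aside (not part of the original question): on Spark 2.x, SQLContext is a legacy entry point, and the same setup is usually written with SparkSession instead:

import org.apache.spark.sql.SparkSession

// SparkSession replaces both SparkContext and SQLContext as the entry point;
// the underlying SparkContext is still available as spark.sparkContext.
val spark = SparkSession.builder()
  .appName("test")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._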

-- Edit 1, as requested in the comments: the Edge and DepRelation definitions. (Note: the calls above pass three arguments to DepRelation, so the actual definition presumably has more fields than shown here.) --

case class Edge[V, E](from: V, to: V, label: E)
case class DepRelation(vl: String)
Answer

As is evident from your error stack trace, it is failing with the following:

Exception in thread "main" scala.ScalaReflectionException: type V is not a class

So the main culprit here is the type parameter V, which you have used to define the following case class:

case class Edge[V, E](from: V, to: V, label: E)
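
For intuition, the same exception can be reproduced with plain Scala runtime reflection, which is what Catalyst's ScalaReflection uses under the hood. A minimal sketch, independent of Spark:

import scala.reflect.runtime.universe._

// With only a WeakTypeTag available, an unapplied type parameter V survives
// into the reflected type; asking for its class then fails exactly as in the
// Spark stack trace above.
def classOfV[V: WeakTypeTag]: ClassSymbol = weakTypeOf[V].typeSymbol.asClass

classOfV[Int]                     // fine: V is the concrete class scala.Int
def generic[V]: ClassSymbol = classOfV[V]
generic[Int]                      // scala.ScalaReflectionException: type V is not a class

Spark derives the encoder for S2 by reflecting over every field recursively; when it reaches a field whose type is still the abstract parameter V rather than a concrete class, the asClass call in getClassFromType throws.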

What are the types of V (and E)? For example, are they String / Int / Double? Can you show me that code here? (Because that is where the answer to your question lies.)

If you have pinned down the actual types of V and E, try rewriting the Edge case class definition with those concrete types substituted in. Going by the calls in your question (V = Int, E = DepRelation), for example:

case class Edge(from: Int, to: Int, label: DepRelation)
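
With concrete types in place, Catalyst can derive the encoder for S2 and the original snippet should run unchanged; a sketch assuming BS keeps the same shape:

// Same call as before, now backed by a fully concrete Edge.
sparkSQL.createDataset(Seq(
  S2("heloo", Map(0 -> "0"),
     BS(List(0), List(Edge(0, 1, DepRelation("0-->1", "", "")))))
)).foreach(x => println(x))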

That may solve your problem; if not, you will need to look further into where your type V is left unresolved.

Hope this helps.
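
If Edge genuinely needs to stay generic, one common workaround (not suggested in the original answer) is to bypass reflection-based encoder derivation with a Kryo encoder for the row type. The dataset then stores each S2 as a single binary column rather than a structured schema, so you lose columnar operations on its fields:

import org.apache.spark.sql.{Encoder, Encoders}

// A local implicit takes precedence over the implicitly derived product
// encoder, so Catalyst never tries to reflect over the type parameters.
implicit val s2Encoder: Encoder[S2] = Encoders.kryo[S2]

sparkSQL.createDataset(Seq(
  S2("heloo", Map(0 -> "0"),
     BS(List(0), List(Edge(0, 1, DepRelation("0-->1", "", "")))))
)).foreach(x => println(x))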
