[spark] Serialization error: object not serializable


java.io.NotSerializableException: DmpLogEntry
Serialization stack:
- object not serializable (class: dmp.entry.DmpLogEntry, value: [email protected])
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
at org.apache.spark.serializer.SerializationStream.writeValue(Serializer.scala:147)
at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:181)
at org.apache.spark.util.collection.WritablePartitionedPairCollection$$anon$1.writeNext(WritablePartitionedPairCollection.scala:55)
at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:300)
at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:90)
at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:83)
at org.apache.spark.util.collection.ExternalSorter.maybeSpill(ExternalSorter.scala:90)
at org.apache.spark.util.collection.ExternalSorter.maybeSpillCollection(ExternalSorter.scala:244)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:221)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
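The root cause: Spark's default JavaSerializer (visible in the stack trace as JavaSerializationStream) can only write objects that implement java.io.Serializable, and the shuffle spill path shown above tries to Java-serialize every record. A minimal sketch outside Spark illustrates the distinction; LogEntry here is a hypothetical stand-in for a class like DmpLogEntry that lacks the marker interface:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationDemo {
    // Stand-in for a class like DmpLogEntry that does NOT implement Serializable.
    static class LogEntry {
        String message = "hello";
    }

    // Same shape, but with the Serializable marker interface, so
    // Java serialization accepts it.
    static class SerializableLogEntry implements Serializable {
        String message = "hello";
    }

    // Returns true if Java serialization can write the object,
    // false if it throws NotSerializableException.
    static boolean canJavaSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canJavaSerialize(new LogEntry()));             // false
        System.out.println(canJavaSerialize(new SerializableLogEntry())); // true
    }
}
```

So the other fix, besides switching serializers, is simply to make DmpLogEntry implement java.io.Serializable.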

Running on Spark 1.4 with jobserver 0.5, the program did not fail.

On Spark 1.5 with jobserver 0.6, it failed with the error above.

So the environment is clearly a factor. Setting the serializer to org.apache.spark.serializer.KryoSerializer in spark-defaults.conf resolves it:

spark.serializer                 org.apache.spark.serializer.KryoSerializer
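The same setting can also be applied in code when constructing the SparkConf. A minimal sketch, assuming Spark is on the classpath (the app name is a placeholder); this works because Kryo, unlike the default JavaSerializer, does not require serialized classes to implement java.io.Serializable:

```java
import org.apache.spark.SparkConf;

public class KryoConfExample {
    public static SparkConf build() {
        // Same effect as the spark-defaults.conf line above.
        return new SparkConf()
                .setAppName("dmp-job") // placeholder name
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
    }
}
```

Optionally, frequently serialized classes can be registered with conf.registerKryoClasses(...) for better Kryo performance; registration is not required by default.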

