[spark] Serialization error: object not serializable
java.io.NotSerializableException: DmpLogEntry
Serialization stack:
    - object not serializable (class: dmp.entry.DmpLogEntry, value: dmp.entry.DmpLogEntry@…)
    at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
    at org.apache.spark.serializer.SerializationStream.writeValue(Serializer.scala:147)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:181)
    at org.apache.spark.util.collection.WritablePartitionedPairCollection$$anon$1.writeNext(WritablePartitionedPairCollection.scala:55)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:300)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:90)
    at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:83)
    at org.apache.spark.util.collection.ExternalSorter.maybeSpill(ExternalSorter.scala:90)
    at org.apache.spark.util.collection.ExternalSorter.maybeSpillCollection(ExternalSorter.scala:244)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:221)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
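The trace shows the default JavaSerializer failing while spilling shuffle data to disk (ExternalSorter.spill → DiskBlockObjectWriter.write), because dmp.entry.DmpLogEntry does not implement java.io.Serializable. Independent of the serializer switch described below, another fix is simply to make the class serializable; a minimal sketch, assuming DmpLogEntry is your own Scala class (the fields shown here are hypothetical, since the real class is not shown in this post):

// Hypothetical fields; the real DmpLogEntry is not shown in the post.
// Mixing in Serializable lets Java serialization write the object during shuffle spills.
class DmpLogEntry(val userId: String, val ts: Long) extends Serializable

// A case class is Serializable out of the box and is often the simpler option:
// case class DmpLogEntry(userId: String, ts: Long)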
On Spark 1.4 with spark-jobserver 0.5 the program ran without error, but on Spark 1.5 with spark-jobserver 0.6 it failed with the exception above, so the problem is clearly tied to the environment. Setting the serializer to org.apache.spark.serializer.KryoSerializer in spark-defaults.conf resolves it:
spark.serializer org.apache.spark.serializer.KryoSerializer
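The same serializer can also be set programmatically when building the SparkConf; a minimal sketch (the app name is a placeholder, and registering the class with Kryo is optional but avoids writing full class names into every serialized record):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("dmp-log-job")  // placeholder app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Optional: pre-register the class so Kryo writes a small numeric ID instead of the class name
  .registerKryoClasses(Array(classOf[dmp.entry.DmpLogEntry]))

val sc = new SparkContext(conf)

Under spark-jobserver the context is usually created by the server rather than by your job, so setting the property in spark-defaults.conf (or in the jobserver's context configuration) as shown above is the more practical route.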
That covers the Spark "object not serializable" serialization error. If it did not solve your problem, the following related articles may help:
HBase Error: connection object not serializable
PySpark UDF raises PythonException: 'TypeError: 'float' object is not subscriptable'
Running Spark from IDEA fails to import the sql package: object sql is not a member of package org.apache.spark
After integrating a Scala environment into Eclipse, importing an external Spark package fails: object apache is not a member of package org