HBase Error: connection object not serializable


I wanted to connect to an HBase database from a Spark driver program and insert data into it, but when the job was submitted to the Spark cluster it failed with: connection object not serializable.

The full error:

Exception in thread "main" java.io.NotSerializableException: DStream checkpointing has been enabled but the DStreams with their functions are not serializable
com.sae.model.HbaseHelper
Serialization stack:
        - object not serializable (class: com.sae.model.HbaseHelper, value: [email protected])
        - field (class: com.sae.demo.KafkaStreamingTest$$anonfun$main$1, name: hbHelper$1, type: class com.sae.model.HbaseHelper)
        - object (class com.sae.demo.KafkaStreamingTest$$anonfun$main$1, <function1>)
        - field (class: org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3, name: cleanedF$1, type: interface scala.Function1)
        - object (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3, <function2>)
        - writeObject data (class: org.apache.spark.streaming.dstream.DStream)
        - object (class org.apache.spark.streaming.dstream.ForEachDStream, [email protected])
        - element of array (index: 0)
        - array (class [Ljava.lang.Object;, size 16)
        - field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
        - object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer([email protected], [email protected]))
        - writeObject data (class: org.apache.spark.streaming.dstream.DStreamCheckpointData)
        - object (class org.apache.spark.streaming.dstream.DStreamCheckpointData, [
0 checkpoint files

])
        - writeObject data (class: org.apache.spark.streaming.dstream.DStream)
        - object (class org.apache.spark.streaming.kafka.KafkaInputDStream, [email protected])
        - element of array (index: 0)
        - array (class [Ljava.lang.Object;, size 16)
        - field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
        - object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer([email protected]))
        - writeObject data (class: org.apache.spark.streaming.DStreamGraph)
        - object (class org.apache.spark.streaming.DStreamGraph, [email protected])
        - field (class: org.apache.spark.streaming.Checkpoint, name: graph, type: class org.apache.spark.streaming.DStreamGraph)
        - object (class org.apache.spark.streaming.Checkpoint, [email protected])
        at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:557)
        at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:601)
        at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:600)
        at com.sae.demo.KafkaStreamingTest$.main(StreamingDataFromKafka.scala:225)
        at com.sae.demo.KafkaStreamingTest.main(StreamingDataFromKafka.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
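The trace points at the `HbaseHelper` instance captured by the `foreachRDD` closure in `KafkaStreamingTest.main`. A minimal sketch of the antipattern, assuming a hypothetical `HbaseHelper` wrapper with an `insert` method (the real class is not shown in the post), looks like:

```scala
// Antipattern: the helper holding the HBase connection is created on the
// driver and captured by the closure passed to foreachRDD. With DStream
// checkpointing enabled, Spark must serialize that closure, and the
// non-serializable connection inside the helper makes it fail.
val hbHelper = new HbaseHelper()        // hypothetical wrapper around an HBase Connection

dstream.foreachRDD { rdd =>
  rdd.foreach { record =>
    hbHelper.insert(record)             // closure captures the non-serializable helper
  }
}
```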

 

Solution:

See the official Spark Streaming programming guide, under "Design Patterns for using foreachRDD".

The code that opens the database connection should be moved inside foreachPartition, e.g.:

dstream.foreachRDD { rdd =>
  rdd.foreachPartition { partitionOfRecords =>
    // Create the connection on the executor, once per partition,
    // so it is never serialized as part of the closure.
    val connection = createNewConnection()
    partitionOfRecords.foreach(record => connection.send(record))
    connection.close()
  }
}
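Applied to HBase, the same pattern might look like the sketch below. It assumes the hbase-client artifact is on the classpath; the table name, column family, and the `record.key`/`record.value` fields are illustrative, not from the original code.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

dstream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // Runs on the executor: the connection is created where it is used,
    // so nothing non-serializable crosses the driver/executor boundary.
    val conf = HBaseConfiguration.create()
    val connection = ConnectionFactory.createConnection(conf)
    val table = connection.getTable(TableName.valueOf("my_table"))
    try {
      records.foreach { record =>
        val put = new Put(Bytes.toBytes(record.key))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                      Bytes.toBytes(record.value))
        table.put(put)
      }
    } finally {
      table.close()
      connection.close()
    }
  }
}
```

Creating a connection per partition rather than per record keeps the overhead bounded; for higher throughput, a per-executor connection pool (see the Spark guide's ConnectionPool variant) is the usual next step.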

 
