Spark pitfalls: connection errors when writing a DataFrame to HBase

Posted xing901022


Recently, in our test environment, the hbase-connector based on shc [https://github.com/hortonworks-spark/shc] kept failing to connect to ZooKeeper. Here is the error log:

18/06/20 10:45:02 INFO RecoverableZooKeeper: Process identifier=hconnection-0x5175ab05 connecting to ZooKeeper ensemble=localhost:2181
18/06/20 10:45:02 INFO RecoverableZooKeeper: Process identifier=hconnection-0x6399f976 connecting to ZooKeeper ensemble=localhost:2181
18/06/20 10:45:02 INFO ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
18/06/20 10:45:02 INFO ZooKeeper: Client environment:host.name=hnode8
18/06/20 10:45:02 INFO ZooKeeper: Client environment:java.version=1.8.0_66
18/06/20 10:45:02 INFO ZooKeeper: Client environment:java.vendor=Oracle Corporation
18/06/20 10:45:02 INFO ZooKeeper: Client environment:java.home=/usr/local/jdk1.8.0_66/jre
...
18/06/20 10:45:02 INFO ZooKeeper: Client environment:java.io.tmpdir=/data5/yarn/nm/usercache/hdfs/appcache/application_1527863132022_18903/container_e35_1527863132022_18903_01_000003/tmp
18/06/20 10:45:02 INFO ZooKeeper: Client environment:java.compiler=<NA>
18/06/20 10:45:02 INFO ZooKeeper: Client environment:os.name=Linux
18/06/20 10:45:02 INFO ZooKeeper: Client environment:os.arch=amd64
18/06/20 10:45:02 INFO ZooKeeper: Client environment:os.version=2.6.32-696.3.1.el6.x86_64
18/06/20 10:45:02 INFO ZooKeeper: Client environment:user.name=yarn
18/06/20 10:45:02 INFO ZooKeeper: Client environment:user.home=/var/lib/hadoop-yarn
18/06/20 10:45:02 INFO ZooKeeper: Client environment:user.dir=/data5/yarn/nm/usercache/hdfs/appcache/application_1527863132022_18903/container_e35_1527863132022_18903_01_000003
18/06/20 10:45:02 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x5175ab050x0, quorum=localhost:2181, baseZNode=/hbase
18/06/20 10:45:02 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x6399f9760x0, quorum=localhost:2181, baseZNode=/hbase
18/06/20 10:45:02 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/20 10:45:02 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/20 10:45:02 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
18/06/20 10:45:02 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
18/06/20 10:45:03 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/20 10:45:03 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/20 10:45:03 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
18/06/20 10:45:03 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
18/06/20 10:45:04 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/20 10:45:04 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)

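The log shows the hbase-connector connecting to localhost:2181 for no obvious reason, even though every configuration file I checked looked correct. The same code ran fine in production. Comparing the two environments showed that the test environment was missing the hbase-site.xml configuration file.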

Searching the shc issues showed that someone had already reported this problem:
https://github.com/hortonworks-spark/shc/issues/227

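The gist is that the connector connects to localhost:2181 by default. To connect to a remote HBase, you only need to copy hbase-site.xml onto the classpath. Because I also use an HDFS nameservice, configuration files such as hdfs-site.xml have to be packaged into the JAR as well.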
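One way to catch this situation before the job starts retrying against localhost is a small classpath check. The following is a minimal sketch in plain Java (standard library only), not something taken from shc or HBase: the class name `HBaseConfigCheck` and its two methods are hypothetical. It assumes the configuration files are looked up as classpath resources by name, and that hbase-site.xml uses the usual Hadoop-style `<property><name>/<value>` layout.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of shc or HBase.
class HBaseConfigCheck {

    /** Returns the subset of the given resource names that cannot be found on the classpath. */
    static List<String> missingResources(List<String> names) {
        ClassLoader loader = HBaseConfigCheck.class.getClassLoader();
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Returns the value of a property in a Hadoop-style XML config, or null if the key is absent. */
    static String readProperty(InputStream xml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() == 0 || values.getLength() == 0) {
                    continue; // skip malformed entries
                }
                if (key.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // The file list mirrors the files named in this post; extend it as needed.
        List<String> missing = missingResources(Arrays.asList("hbase-site.xml", "hdfs-site.xml"));
        System.out.println("Missing from classpath: " + missing);

        if (!missing.contains("hbase-site.xml")) {
            ClassLoader loader = HBaseConfigCheck.class.getClassLoader();
            try (InputStream in = loader.getResourceAsStream("hbase-site.xml")) {
                System.out.println("hbase.zookeeper.quorum = "
                        + readProperty(in, "hbase.zookeeper.quorum"));
            }
        }
    }
}
```

Running `main` with the same classpath as the Spark job (for example, from inside the packaged JAR) should show whether hbase-site.xml is visible and which ZooKeeper quorum it names. If the file is reported missing, the localhost:2181 connection attempts in the log above are the expected symptom.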
