Can't connect to HBase via Hive with JDBC


【Chinese title】无法使用 jdbc 通过 Hive 连接到 HBase 【English title】can't connect to HBase via Hive with jdbc 【Posted】2018-03-20 08:29:08 【Problem description】:

The environment is:

Hadoop 2.6.5, HBase 1.1.1, Hive 2.3.2

/etc/hosts:

127.0.0.1 localhost
120.55.x.x iZuf6istfz0***
Hive has a table named "hiveInnerTable" as an internal (managed) table. HBase has a table named "hiveExternalTable", which is mapped into Hive as an external table.
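For context, an HBase-backed external table like this is normally declared with Hive's HBase storage handler. The post does not show the actual DDL, so the column list and the HBase column family `cf` below are assumptions; a minimal sketch:

```sql
-- Hypothetical DDL for the HBase-mapped external table;
-- the columns and the column family "cf" are illustrative only.
CREATE EXTERNAL TABLE hiveExternalTable (
  id   string,
  name string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf:name"
)
TBLPROPERTIES ("hbase.table.name" = "hiveExternalTable");
```

With this mapping, a query against `hiveExternalTable` makes Hive's MapReduce tasks scan HBase directly, which is why HBase connectivity matters for the JDBC case below.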

The symptoms are:

    Using the Hive shell:

    (1). The SQL "select count(id) from hiveInnerTable" succeeds via the Hive shell.

    (2). The SQL "select count(id) from hiveExternalTable" succeeds via the Hive shell.

    Using Hive JDBC:

    (1). The SQL "select count(id) from hiveInnerTable" succeeds via Hive with JDBC.

    (2). The SQL "select count(id) from hiveExternalTable" fails via Hive with JDBC.
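The post does not include the JDBC client code, but the failing path presumably looks something like the sketch below. The port 10000 (HiveServer2's default), the database name, and the credentials are assumptions:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcCount {
    public static void main(String[] args) throws Exception {
        // Hive JDBC driver, from the hive-jdbc artifact
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Host, port, and user are assumptions; substitute your own.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://iZuf6istfz0***:10000/default", "root", "");
             Statement stmt = conn.createStatement();
             // This is the statement that fails with
             // RetriesExhaustedException in the scenario described here.
             ResultSet rs = stmt.executeQuery(
                 "select count(id) from hiveExternalTable")) {
            while (rs.next()) {
                System.out.println(rs.getLong(1));
            }
        }
    }
}
```

Note that over JDBC the query is planned and submitted by HiveServer2, not by the local Hive CLI process, which is the key difference between the succeeding and failing cases.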

The error message in the Hive log is:

Query ID = root_20180320104133_13318980-26ef-4320-a270-d546e6b94ccb
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Tue Mar 20 10:42:23 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'hiveExternalTable,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0

        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:223)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:811)
        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:193)
        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:89)
        at org.apache.hadoop.hbase.client.MetaScanner.allTableRegions(MetaScanner.java:324)
        at org.apache.hadoop.hbase.client.HRegionLocator.getAllRegionLocations(HRegionLocator.java:88)
        at org.apache.hadoop.hbase.util.RegionSizeCalculator.init(RegionSizeCalculator.java:94)
        at org.apache.hadoop.hbase.util.RegionSizeCalculator.<init>(RegionSizeCalculator.java:81)
        at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:256)
        at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplitsInternal(HiveHBaseTableInputFormat.java:526)
        at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:452)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:442)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:561)
        at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:357)
        at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:547)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
        at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:411)
        at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
        at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'hiveExternalTable,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
        ... 3 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to iZuf6istfz0***/120.55.X.X:47428 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to iZuf6istfz0***/120.55.X.X:47428 is closing. Call id=9, waitTime=2
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1239)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1210)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
        ... 4 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to iZuf6istfz0***/120.55.X.X:47428 is closing. Call id=9, waitTime=2
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.cleanupCalls(RpcClientImpl.java:1037)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:844)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:572)
Job Submission failed with exception 'org.apache.hadoop.hbase.client.RetriesExhaustedException(Failed after attempts=36, exceptions:
Tue Mar 20 10:42:23 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'logRequestRecord,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0
)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Failed after attempts=36, exceptions:
Tue Mar 20 10:42:23 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'logRequestRecord,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0

hive-site.xml:

 .......
 <property>
   <name>hive.metastore.schema.verification</name>
   <value>false</value>
 </property>

 <property>
   <name>javax.jdo.option.ConnectionURL</name>
   <value>jdbc:mysql://localhost:3306/hivemetastore?useUnicode=true&amp;characterEncoding=utf-8&amp;allowMultiQueries=true</value>
 </property>

 <property>
   <name>javax.jdo.option.ConnectionDriverName</name>
   <value>com.mysql.jdbc.Driver</value>
 </property>

 <property>
   <name>javax.jdo.option.ConnectionUserName</name>
   <value>dongpinyun</value>
 </property>
 <property>
   <name>javax.jdo.option.ConnectionPassword</name>
   <value>dongpinyun@hk2016</value>
 </property>

 <property>
   <name>hive.exec.mode.local.auto</name>
   <value>true</value>
 </property>

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>iZuf6is************</value>
  </property>

  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2182</value>
  </property>

  <property>
    <name>hive.optimize.cp</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.optimize.pruner</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.zookeeper.quorum</name>
    <value>iZuf6is************</value>
  </property>

  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
  .......

hbase-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://iZuf6i****:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>iZuf6i****:2182</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2182</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.maxClientCnxns</name>
        <value>50</value>
    </property>
    <property>
        <name>hbase.regionserver.handler.count</name>
        <value>50</value>
    </property>
</configuration>

core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/mnt/hadoop/hadoop-2.6.5/tmp</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://iZuf6i******:9000</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
</configuration>

【Comments】:

【Answer 1】:

I found the solution. In the configuration file hive-site.xml, set:

<property>  
    <name>hive.server2.enable.doAs</name>  
    <value>false</value>  
</property>

Set the value to false.

true means Hadoop jobs are executed as the user who logged in to HiveServer2.

false means Hadoop jobs are executed as the user who started the HiveServer2 process.

This presumably explains the asymmetry above: via the Hive shell the job ran as the cluster owner, while via JDBC (with doAs enabled) it ran as the login user, who could not reach the HBase region server.
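To verify the fix, one can restart HiveServer2 so the new setting takes effect and re-run the failing query over JDBC. A minimal check with beeline might look like the following (the host and default port 10000 are assumptions, not taken from the post):

```shell
# Restart HiveServer2 after editing hive-site.xml, then reconnect
# and re-run the query that previously failed.
beeline -u "jdbc:hive2://localhost:10000/default" \
        -e "select count(id) from hiveExternalTable"
```

If the count now returns instead of raising RetriesExhaustedException, the impersonation setting was the cause.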

【Discussion】:
