Can't connect to HBase via Hive with JDBC
Posted: 2018-03-20 08:29:08

The environment is:
Hadoop 2.6.5, HBase 1.1.1, Hive 2.3.2

/etc/hosts:
127.0.0.1 localhost
120.55.x.x iZuf6istfz0***
Hive has a table named "hiveInnerTable" as an internal (managed) table.
HBase has a table named "hiveExternalTable", which is mapped into Hive as an external table.
The symptoms are:
Using the Hive shell:
(1) Executing the SQL "select count(id) from hiveInnerTable" through the Hive shell succeeds.
(2) Executing the SQL "select count(id) from hiveExternalTable" through the Hive shell succeeds.
Using Hive JDBC:
(1) Executing the SQL "select count(id) from hiveInnerTable" through Hive with JDBC succeeds.
(2) Executing the SQL "select count(id) from hiveExternalTable" through Hive with JDBC fails (reproduced by the client sketch below).
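For reference, a minimal Java client sketch of the two JDBC calls above. The HiveServer2 host, port (10000 is the default), database, and credentials are assumptions, since the original post does not show the client code; adjust them to the actual deployment:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcCountTest {
    public static void main(String[] args) throws Exception {
        // Requires hive-jdbc (and its dependencies) on the classpath.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Assumed HiveServer2 endpoint and user -- not taken from the original post.
        String url = "jdbc:hive2://iZuf6istfz0***:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             Statement stmt = conn.createStatement()) {

            // (1) Succeeds: count on the Hive-managed (internal) table.
            try (ResultSet rs = stmt.executeQuery("select count(id) from hiveInnerTable")) {
                while (rs.next()) {
                    System.out.println("hiveInnerTable count = " + rs.getLong(1));
                }
            }

            // (2) Fails with RetriesExhaustedException: count on the external table, which is
            //     assumed to be mapped onto HBase via the HBaseStorageHandler (the mapping DDL
            //     is not shown in the original post).
            try (ResultSet rs = stmt.executeQuery("select count(id) from hiveExternalTable")) {
                while (rs.next()) {
                    System.out.println("hiveExternalTable count = " + rs.getLong(1));
                }
            }
        }
    }
}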
The error message in the Hive log is:
Query ID = root_20180320104133_13318980-26ef-4320-a270-d546e6b94ccb
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Tue Mar 20 10:42:23 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'hiveExternalTable,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:223)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:811)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:193)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:89)
at org.apache.hadoop.hbase.client.MetaScanner.allTableRegions(MetaScanner.java:324)
at org.apache.hadoop.hbase.client.HRegionLocator.getAllRegionLocations(HRegionLocator.java:88)
at org.apache.hadoop.hbase.util.RegionSizeCalculator.init(RegionSizeCalculator.java:94)
at org.apache.hadoop.hbase.util.RegionSizeCalculator.<init>(RegionSizeCalculator.java:81)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:256)
at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplitsInternal(HiveHBaseTableInputFormat.java:526)
at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:452)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:442)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:561)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:357)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:547)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:411)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'hiveExternalTable,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
... 3 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to iZuf6istfz0***/120.55.X.X:47428 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to iZuf6istfz0***/120.55.X.X:47428 is closing. Call id=9, waitTime=2
at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1239)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1210)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 4 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to iZuf6istfz0***/120.55.X.X:47428 is closing. Call id=9, waitTime=2
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.cleanupCalls(RpcClientImpl.java:1037)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:844)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:572)
Job Submission failed with exception 'org.apache.hadoop.hbase.client.RetriesExhaustedException(Failed after attempts=36, exceptions:
Tue Mar 20 10:42:23 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'logRequestRecord,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0
)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Failed after attempts=36, exceptions:
Tue Mar 20 10:42:23 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'logRequestRecord,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=iZuf6istfz0***,47428,1521453094275, seqNum=0
hive-site.xml:
.......
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hivemetastore?useUnicode=true&amp;characterEncoding=utf-8&amp;allowMultiQueries=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>dongpinyun</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>dongpinyun@hk2016</value>
</property>
<property>
<name>hive.exec.mode.local.auto</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>iZuf6is************</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2182</value>
</property>
<property>
<name>hive.optimize.cp</name>
<value>true</value>
</property>
<property>
<name>hive.optimize.pruner</name>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<value>iZuf6is************</value>
</property>
<property>
<name>hive.zookeeper.client.port</name>
<value>2181</value>
</property>
.......
hbase-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://iZuf6i****:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>false</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>iZuf6i****:2182</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2182</value>
</property>
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>50</value>
</property>
<property>
<name>hbase.regionserver.handler.count</name>
<value>50</value>
</property>
</configuration>
core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/hadoop/hadoop-2.6.5/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://iZuf6i******:9000</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
Answer 1:
I found the solution. In the configuration file hive-site.xml:
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
Set the value to false.
true means the Hadoop job is executed as the user who logged in to HiveServer2.
false means the Hadoop job is executed as the user who started HiveServer2.
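For intuition, a rough sketch (not Hive's actual code) of what this setting changes: with doAs enabled, HiveServer2 submits the job inside a proxy-user context roughly like the one below, so the HBase meta scan seen in the stack trace runs as the JDBC login user; with doAs disabled it runs as the HiveServer2 process user, which is what made the external-table query work here. The "root" user name is an assumption for illustration.

import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsSketch {
    public static void main(String[] args) throws Exception {
        // "root" stands in for whatever user logged in over JDBC -- an assumption
        // for illustration only, not code taken from Hive.
        UserGroupInformation jdbcUser = UserGroupInformation.createProxyUser(
                "root", UserGroupInformation.getLoginUser());

        // hive.server2.enable.doAs=true: job submission (and the HBase scan it triggers)
        // happens inside a doAs block like this, i.e. as the JDBC user.
        // hive.server2.enable.doAs=false: it runs directly as the user that started
        // HiveServer2, skipping the proxy-user step.
        jdbcUser.doAs((PrivilegedExceptionAction<Void>) () -> {
            // submit the MapReduce job / open the HBase scanner here
            return null;
        });
    }
}

Note that HiveServer2 has to be restarted after changing hive-site.xml for the new value to take effect.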