Spark 2.0.0 throws AlreadyExistsException(message:Database default already exists) when interacting with Hive 1.0.0

Posted: 2016-10-06 04:18:13

Problem description:

I am trying to connect to Hive from Spark using Java. Whenever I run a query against Hive through Spark, it throws an exception like this:

16/10/06 09:37:56 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)

My versions are:

Spark 2.0.0

Hive 1.0.0

Here is the full stack trace:

16/10/06 09:37:56 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:891)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
    at com.sun.proxy.$Proxy14.create_database(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:644)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
    at com.sun.proxy.$Proxy15.createDatabase(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:306)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply$mcV$sp(HiveClientImpl.scala:291)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:291)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:291)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:262)
    at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:209)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:208)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:251)
    at org.apache.spark.sql.hive.client.HiveClientImpl.createDatabase(HiveClientImpl.scala:290)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply$mcV$sp(HiveExternalCatalog.scala:99)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:72)
    at org.apache.spark.sql.hive.HiveExternalCatalog.createDatabase(HiveExternalCatalog.scala:98)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:147)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
    at org.apache.spark.sql.hive.HiveSessionCatalog.<init>(HiveSessionCatalog.scala:51)
    at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:49)
    at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
    at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
    at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
    at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at in.inndata.sparkjoinsexamples.SparkJoinExample.main(SparkJoinExample.java:10)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

16/10/06 09:37:57 INFO metastore.HiveMetaStore: 0: get_database: default
16/10/06 09:37:57 INFO HiveMetaStore.audit: ugi=karuturi    ip=unknown-ip-addr  cmd=get_database: default   
16/10/06 09:37:57 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=aaa
16/10/06 09:37:57 INFO HiveMetaStore.audit: ugi=karuturi    ip=unknown-ip-addr  cmd=get_table : db=default tbl=aaa  
Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: `default`.`aaa`; line 1 pos 14
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:71)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:125)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:125)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:58)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at in.inndata.sparkjoinsexamples.SparkJoinExample.main(SparkJoinExample.java:10)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/10/06 09:37:57 INFO spark.SparkContext: Invoking stop() from shutdown hook
16/10/06 09:37:57 INFO server.ServerConnector: Stopped ServerConnector@3f20e4fa{HTTP/1.1}{0.0.0.0:4040}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1bdf8190{/stages/stage/kill,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4f8969b0{/api,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6fefce9e{/,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@74cec793{/static,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@f9b7332{/executors/threadDump/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@18e7143f{/executors/threadDump,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@209775a9{/executors/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5db4c359{/executors,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2c177f9e{/environment/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@33617539{/environment,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@47874b25{/storage/rdd/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@290b1b2e{/storage/rdd,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1fc0053e{/storage/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@77307458{/storage,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@389adf1d{/stages/pool/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7bf9b098{/stages/pool,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@72e34f77{/stages/stage/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6e9319f{/stages/stage,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6fa590ba{/stages/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2416a51{/stages,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@293bb8a5{/jobs/job/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@37ebc9d8{/jobs/job,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5217f3d0{/jobs/json,null,UNAVAILABLE}
16/10/06 09:37:57 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@32232e55{/jobs,null,UNAVAILABLE}
16/10/06 09:37:57 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.1.131:4040
16/10/06 09:37:57 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/06 09:37:57 INFO memory.MemoryStore: MemoryStore cleared
16/10/06 09:37:57 INFO storage.BlockManager: BlockManager stopped
16/10/06 09:37:57 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/10/06 09:37:57 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/10/06 09:37:57 INFO spark.SparkContext: Successfully stopped SparkContext
16/10/06 09:37:57 INFO util.ShutdownHookManager: Shutdown hook called
16/10/06 09:37:57 INFO util.ShutdownHookManager: Deleting directory /private/var/folders/6n/nrvn14r50tvdvcfdds6jxyx40000gn/T/spark-b0f5733d-a475-4289-956d-c2650d9792d0

Here is my Spark code:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public static void main(String[] arr) {
        // Build a local SparkSession with Hive support and query an existing Hive table
        SparkSession session = new SparkSession.Builder().appName("SparkJoinExample").master("local").enableHiveSupport().getOrCreate();
        Dataset<Row> dset = session.sql("select * from test.results");
}

Comments:

What query are you trying to run? According to the logs, it says the table cannot be found.

@ArunakiranNulu, thanks for the reply. I just updated my question; please take a look.

This looks like a known issue; see the workaround given in the last comment at issues.apache.org/jira/browse/SPARK-15345.
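For reference, the workaround discussed in that JIRA thread amounts to setting spark.sql.warehouse.dir explicitly when building the session, rather than relying on the Spark 2.0.0 default. A minimal sketch, assuming a typical warehouse location (the /user/hive/warehouse path is an assumption, not from the original post):

    import org.apache.spark.sql.SparkSession;

    // Sketch of the SPARK-15345 workaround: pin the warehouse directory
    // explicitly instead of relying on the Spark 2.0.0 default path.
    // "/user/hive/warehouse" is an assumed path; point it at your own warehouse.
    SparkSession session = new SparkSession.Builder()
            .appName("SparkJoinExample")
            .master("local")
            .config("spark.sql.warehouse.dir", "/user/hive/warehouse")
            .enableHiveSupport()
            .getOrCreate();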

Answer 1:

It will create a metastore in your $HOME folder; inside the metastore, just delete dbex.lck.
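A minimal sketch of that cleanup in Java, assuming the embedded Derby metastore ended up under $HOME in the usual metastore_db folder (the folder name is an assumption; the lock file name comes from the answer above):

    import java.io.File;

    // Delete the stale Derby lock file so a new Spark session can
    // reopen the local metastore.
    File lock = new File(System.getProperty("user.home"), "metastore_db/dbex.lck");
    if (lock.exists() && !lock.delete()) {
        System.err.println("Could not delete " + lock.getAbsolutePath());
    }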


Answer 2:

You may have previously created a database with the same name. Delete the metastore_db folder to remove the metadata associated with it, then try again. That worked for me.
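A minimal sketch of that cleanup, assuming metastore_db sits in the directory the application was launched from (the usual spot when Spark falls back to an embedded Derby metastore):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Comparator;
    import java.util.stream.Stream;

    // Walk the tree bottom-up so files are deleted before their parent folders.
    Path metastore = Paths.get("metastore_db");
    if (Files.exists(metastore)) {
        try (Stream<Path> walk = Files.walk(metastore)) {
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> p.toFile().delete());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }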

