Spark 不读取 hive-site.xml?
Posted
技术标签:
【中文标题】Spark 不读取 hive-site.xml?【英文标题】:Spark not reading hive-site.xml? 【发布时间】:2017-08-02 11:49:18 【问题描述】:我正在尝试访问 hive 元存储,为此我正在使用 SparkSql。我已经设置了 sparksession ,但是当我运行我的程序并查看日志时,我看到了这个异常
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
... 61 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
... 62 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
... 68 more
Caused by: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details.
我正在运行一个访问以下代码的 servlet
public class HiveReadone extends HttpServlet
private static final long serialVersionUID = 1L;
/**
* @see HttpServlet#HttpServlet()
*/
public HiveReadone()
super();
// TODO Auto-generated constructor stub
/**
* @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
*/
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
// TODO Auto-generated method stub
response.getWriter().append("Served at: ").append(request.getContextPath());
SparkSession spark = SparkSession
.builder()
.appName("Java Spark SQL basic example")
.enableHiveSupport()
.config("spark.sql.warehouse.dir", "hdfs://saurab:9000/user/hive/warehouse")
.config("mapred.input.dir.recursive", true)
.config("hive.mapred.supports.subdirectories", true)
.config("hive.vectorized.execution.enabled", true)
.master("local")
.getOrCreate();
response.getWriter().println(spark);
浏览器接受来自response.getWriter().append("Served at:
").append(request.getContextPath());
(即Served at: /hiveServ
)的输出,不会打印任何内容
请看我的conf/hive-site.xml
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://saurab:3306/metastore_db?createDatabaseIfNotExist=true</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>hive.aux.jars.path</name>
<value>/home/saurab/hadoopec/hive/lib/hive-serde-2.1.1.jar</value>
</property>
<property>
<name>spark.sql.warehouse.dir</name>
<value>hdfs://saurab:9000/user/hive/warehouse</value>
</property>
<property>
<name>hive.metastore.uris</name>
<!--Make sure that <value> points to the Hive Metastore URI in your cluster -->
<value>thrift://saurab:9083</value>
<description>URI for client to contact metastore server</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10001</value>
<description>Port number of HiveServer2 Thrift interface.
Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
<description>user name for connecting to mysql server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepassword</value>
<description>password for connecting to mysql server</description>
</property>
据我所知,如果我们配置 hive.metastore.uris
,spark 将连接到 hive 元存储,但在我的情况下它不是并且给了我上述错误。
【问题讨论】:
【参考方案1】:要在 hive 上配置 spark,请尝试将 hive-site.xml 复制到 spark/conf 目录
【讨论】:
感谢您的输入,但正如我所提到的,我已经在 /conf 目录中创建了 hive-site.xml /hive/conf 中是否也有相同的 hive-site.xml?以上是关于Spark 不读取 hive-site.xml?的主要内容,如果未能解决你的问题,请参考以下文章
通过配置hive-site.xml文件实现Hive集成Spark
通过配置hive-site.xml文件实现Hive集成Spark