Connecting Spark to Hive
Posted by Shall潇
I. Configure hive-site.xml
Edit hive/conf/hive-site.xml as shown below, then copy the file into spark/conf so that Spark reads the same metastore settings.
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/opt/soft/hive/warehouse</value>
  </property>
  <!--<property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>-->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop100:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.mode.local.auto</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.thrift.client.user</name>
    <value>root</value>
    <description>Username to use against thrift client</description>
  </property>
  <property>
    <name>hive.server2.thrift.client.password</name>
    <value>root</value>
    <description>Password to use against thrift client</description>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.XXX.100:9083</value>
  </property>
</configuration>
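The copy step from the start of this section can be scripted. The following is a minimal sketch, not a definitive procedure. It assumes Hive's configuration directory is /opt/soft/hive/conf (consistent with the paths used in this article) and that Spark's is /opt/soft/spark/conf, which is a guess because the article does not state the Spark install path. The function name copy_hive_site is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: copy hive-site.xml from Hive's conf directory into Spark's conf directory.
# Both directory arguments are supplied by the caller; the example paths are assumptions.

copy_hive_site() {
  local hive_conf="$1"
  local spark_conf="$2"
  local src="$hive_conf/hive-site.xml"

  # Fail if the source file does not exist.
  if [ ! -f "$src" ]; then
    echo "missing $src" >&2
    return 1
  fi

  # Fail if the metastore URI property is not mentioned in the file at all.
  if ! grep -qF "hive.metastore.uris" "$src"; then
    echo "hive.metastore.uris not found in $src" >&2
    return 2
  fi

  mkdir -p "$spark_conf"
  cp "$src" "$spark_conf/hive-site.xml"
}

# Example call (the Spark path is an assumption):
# copy_hive_site /opt/soft/hive/conf /opt/soft/spark/conf
```

The check for hive.metastore.uris is only a plain text search, so it would also pass if the property were commented out. It is meant as a quick reminder rather than a validation of the XML.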
II. Start the services
- Start the Hive metastore service
- Start the HiveServer2 service
nohup /opt/soft/hive/bin/hive --service metastore &
nohup /opt/soft/hive/bin/hive --service hiveserver2 &
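Both services start in the background and can take a while to begin accepting connections, so it may help to wait for their ports before opening spark-shell. The following is a minimal sketch, assuming bash (it relies on bash's /dev/tcp pseudo-device). Port 9083 comes from the hive.metastore.uris setting in this article, and 10000 is HiveServer2's default Thrift port. The function name wait_for_port and the METASTORE_HOST placeholder are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: poll a TCP port once per second until it accepts a connection
# or the timeout (in seconds) is reached. Requires bash for /dev/tcp.

wait_for_port() {
  local host="$1"
  local port="$2"
  local timeout="${3:-30}"
  local waited=0

  while [ "$waited" -lt "$timeout" ]; do
    # Opening /dev/tcp/<host>/<port> succeeds only if something is listening.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Example (METASTORE_HOST is a placeholder for your metastore machine):
# wait_for_port "$METASTORE_HOST" 9083 60 && wait_for_port "$METASTORE_HOST" 10000 120
```

The function returns 0 as soon as a connection succeeds and 1 if the timeout passes without one, so it can be chained with && before later steps.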
Open the Spark shell:
spark-shell
spark.table("database_name.table_name").show // view the table contents
III. Connecting from IntelliJ IDEA
1. Add the dependencies
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.1.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>2.1.1</version>
  </dependency>
</dependencies>
2. Write the program
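package Hive

import org.apache.spark.sql.SparkSession

object SparkToHive {
  def main(args: Array[String]): Unit = {
    // Build a local SparkSession with Hive support, pointing at the remote metastore
    val spark = SparkSession.builder()
      .appName("sparkToHive")
      .master("local[*]")
      .config("hive.metastore.uris", "thrift://192.168.159.100:9083")
      .enableHiveSupport()
      .getOrCreate()

    // List the databases registered in the Hive metastore
    spark.sql("show databases").collect.foreach(println)

    // Query a Hive table (uncomment and adjust the table name)
    // val df = spark.sql("select * from emp.emp_basic")
    // df.show()

    spark.stop()
  }
}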