2.1 Using Hive from the spark-shell
Show all databases
scala> val df = spark.sql("show databases") df: org.apache.spark.sql.DataFrame = [databaseName: string] scala> df.show +------------+ |databaseName| +------------+ | bigdata| | default| | lx| +------------+
Switch to a database and list its tables
scala> spark.sql("use lx").show ++ || ++ ++ scala> spark.sql("show tables").show +--------+---------+-----------+ |database|tableName|isTemporary| +--------+---------+-----------+ | lx| cource| false| | lx| student| false| | lx| tmp| false| | lx| www| false| +--------+---------+-----------+
Query table data
scala> spark.sql("select * from sg").show(100,false) //100条记录全显示,不截断 +---+---+-----+ |sno|cno|grade| +---+---+-----+ |1 |5 |50 | |1 |3 |70 | |2 |1 |40 | |3 |6 |50 | |4 |5 |80 | |4 |5 |70 | |6 |5 |60 | |7 |2 |40 | |8 |4 |50 | +---+---+-----+
RDD -- DataFrame -- select API -- create a temporary view -- query
// Build an RDD
scala> val rdd1 = sc.parallelize(Array((1,"tom1",12),(2,"tom2",13),(3,"tom3",14)))
rdd1: org.apache.spark.rdd.RDD[(Int, String, Int)] = ParallelCollectionRDD[29] at parallelize at <console>:24

// Convert the RDD into a DataFrame
scala> val df = rdd1.toDF("id","name","age")
df: org.apache.spark.sql.DataFrame = [id: int, name: string ... 1 more field]

// The DataFrame select API covers the SELECT part of a SQL statement
scala> df.select("id","age").show()
+---+---+
| id|age|
+---+---+
|  1| 12|
|  2| 13|
|  3| 14|
+---+---+

scala> df.create
createGlobalTempView   createOrReplaceTempView   createTempView

// Create or replace a temporary view
scala> df.createOrReplaceTempView
   def createOrReplaceTempView(viewName: String): Unit

scala> df.createOrReplaceTempView("stuTable")

// Query the data through the temporary view
scala> spark.sql("select * from stuTable").show(100, false)
+---+----+---+
|id |name|age|
+---+----+---+
|1  |tom1|12 |
|2  |tom2|13 |
|3  |tom3|14 |
+---+----+---+
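A temporary view only lives for the current session. If the data should be kept in Hive, the DataFrame can be written back as a managed table; a minimal sketch (the table name lx.stu_bak is hypothetical):

// Persist the DataFrame as a Hive table instead of a session-scoped temp view
scala> df.write.mode("overwrite").saveAsTable("lx.stu_bak")

// The table is now visible to "show tables" and survives a spark-shell restart
scala> spark.sql("select * from lx.stu_bak").show(false)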