Spark2 DataFrame common operations: cube and rollup

Posted by 智能先行者


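// CUBE aggregates over every combination of the grouping columns:
// (gender, children), (gender), (children), and the grand total.
// In the output, a null in a grouping column marks a subtotal row.
val df6 = spark.sql("select gender,children,max(age),avg(age),count(age) from Affairs group by cube(gender,children) order by 1,2")
df6.show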
+------+--------+--------+--------+----------+                                  
|gender|children|max(age)|avg(age)|count(age)|
+------+--------+--------+--------+----------+
|  null|    null|    57.0|    34.0|        10|
|  null|      no|    37.0|    27.0|         6|
|  null|     yes|    57.0|    44.5|         4|
|female|    null|    32.0|    29.0|         5|
|female|      no|    32.0|    27.0|         3|
|female|     yes|    32.0|    32.0|         2|
|  male|    null|    57.0|    39.0|         5|
|  male|      no|    37.0|    27.0|         3|
|  male|     yes|    57.0|    57.0|         2|
+------+--------+--------+--------+----------+
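The same cube can also be written with the DataFrame API instead of SQL. The following is a minimal sketch, not the original author's code: the `affairs` rows are hypothetical sample data chosen only so that the aggregates match the table above, and the local `SparkSession` setup is an assumption (in spark-shell, `spark` already exists).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, count, max}

// Local session for experimentation; skip this in spark-shell, where `spark` is predefined.
val spark = SparkSession.builder().master("local[*]").appName("cube-sketch").getOrCreate()
import spark.implicits._

// Hypothetical rows (assumption): picked so the aggregates agree with the output above.
val affairs = Seq(
  ("female", "no", 32.0), ("female", "no", 27.0), ("female", "no", 22.0),
  ("female", "yes", 32.0), ("female", "yes", 32.0),
  ("male", "no", 37.0), ("male", "no", 22.0), ("male", "no", 22.0),
  ("male", "yes", 57.0), ("male", "yes", 57.0)
).toDF("gender", "children", "age")

// cube(...) returns a grouped dataset; agg(...) applies the aggregates to every grouping set.
val cubed = affairs
  .cube("gender", "children")
  .agg(max("age"), avg("age"), count("age"))
  .orderBy("gender", "children") // ascending sort places nulls (subtotals) first

cubed.show()
```

Replacing `cube` with `rollup` in this chain gives the DataFrame API form of the rollup query.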


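// ROLLUP aggregates hierarchically, left to right: (gender, children), (gender),
// and the grand total. Unlike CUBE, it produces no children-only subtotals.
val df7 = spark.sql("select gender,children,max(age),avg(age),count(age) from Affairs group by rollup(gender,children) order by 1,2")
df7.show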
+------+--------+--------+--------+----------+                                  
|gender|children|max(age)|avg(age)|count(age)|
+------+--------+--------+--------+----------+
|  null|    null|    57.0|    34.0|        10|
|female|    null|    32.0|    29.0|         5|
|female|      no|    32.0|    27.0|         3|
|female|     yes|    32.0|    32.0|         2|
|  male|    null|    57.0|    39.0|         5|
|  male|      no|    37.0|    27.0|         3|
|  male|     yes|    57.0|    57.0|         2|
+------+--------+--------+--------+----------+
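The cube output has 9 rows and the rollup output has 7 because the two operators expand to different grouping sets. The plain-Scala sketch below (no Spark required) enumerates those sets; `cubeSets` and `rollupSets` are hypothetical helper names introduced here for illustration, not Spark functions.

```scala
// Grouping sets produced by CUBE: every subset of the grouping columns (2^n sets).
def cubeSets(cols: List[String]): List[List[String]] =
  cols.foldRight(List(List.empty[String])) { (c, acc) =>
    acc.map(c :: _) ++ acc // each existing set, with and without column c
  }

// Grouping sets produced by ROLLUP: every prefix of the column list (n + 1 sets).
def rollupSets(cols: List[String]): List[List[String]] =
  cols.inits.toList

val cols = List("gender", "children")

println(cubeSets(cols))
// List(List(gender, children), List(gender), List(children), List())
println(rollupSets(cols))
// List(List(gender, children), List(gender), List())
```

With two values per column, the cube yields 4 + 2 + 2 + 1 = 9 rows, while the rollup drops the two children-only rows and yields 4 + 2 + 1 = 7.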

 
