Spark2 DataFrame common operations: cube and rollup
Posted by 智能先行者
val df6 = spark.sql("select gender,children,max(age),avg(age),count(age) from Affairs group by Cube(gender,children) order by 1,2")
df6.show
+------+--------+--------+--------+----------+
|gender|children|max(age)|avg(age)|count(age)|
+------+--------+--------+--------+----------+
|  null|    null|    57.0|    34.0|        10|
|  null|      no|    37.0|    27.0|         6|
|  null|     yes|    57.0|    44.5|         4|
|female|    null|    32.0|    29.0|         5|
|female|      no|    32.0|    27.0|         3|
|female|     yes|    32.0|    32.0|         2|
|  male|    null|    57.0|    39.0|         5|
|  male|      no|    37.0|    27.0|         3|
|  male|     yes|    57.0|    57.0|         2|
+------+--------+--------+--------+----------+

val df7 = spark.sql("select gender,children,max(age),avg(age),count(age) from Affairs group by rollup(gender,children) order by 1,2")
df7.show
+------+--------+--------+--------+----------+
|gender|children|max(age)|avg(age)|count(age)|
+------+--------+--------+--------+----------+
|  null|    null|    57.0|    34.0|        10|
|female|    null|    32.0|    29.0|         5|
|female|      no|    32.0|    27.0|         3|
|female|     yes|    32.0|    32.0|         2|
|  male|    null|    57.0|    39.0|         5|
|  male|      no|    37.0|    27.0|         3|
|  male|     yes|    57.0|    57.0|         2|
+------+--------+--------+--------+----------+
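The two queries above differ only in which subtotals they keep. cube aggregates over every combination of the grouping columns, so it also emits the rows where gender is null and children is set. rollup follows the column order as a hierarchy and keeps only (gender, children), (gender) and the grand total. The same thing can be written with the DataFrame API instead of SQL. The following is a minimal sketch, not the original author's code: the sample rows are hypothetical (they are not the real Affairs data, only chosen so that the aggregates agree with the tables above), and the names affairs, cubed and rolled are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, count, max}

// In spark-shell this returns the existing session; elsewhere it starts a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("cube-rollup-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows (NOT the real Affairs data), chosen so that the
// aggregates agree with the tables shown in the article.
val affairs = Seq(
  ("female", "no", 22.0), ("female", "no", 27.0), ("female", "no", 32.0),
  ("female", "yes", 32.0), ("female", "yes", 32.0),
  ("male", "no", 22.0), ("male", "no", 22.0), ("male", "no", 37.0),
  ("male", "yes", 57.0), ("male", "yes", 57.0)
).toDF("gender", "children", "age")

// cube: subtotals for every combination of gender and children.
val cubed = affairs
  .cube("gender", "children")
  .agg(max("age"), avg("age"), count("age"))
  .orderBy("gender", "children")

// rollup: subtotals along the hierarchy gender -> children only.
val rolled = affairs
  .rollup("gender", "children")
  .agg(max("age"), avg("age"), count("age"))
  .orderBy("gender", "children")

cubed.show()
rolled.show()
```

In the output, a null in a grouping column marks a subtotal row, so rollup(gender, children) and rollup(children, gender) give different results, while cube does not depend on the column order.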