Transformation of a Spark DataFrame

Posted 2023-04-15


【Title】Transformation of a Spark DataFrame 【Posted】2017-09-26 14:01:34 【Question】:

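I have a DataFrame with the schema below. The number of elements is unknown, but some of them (for example element1 and element3) are guaranteed to exist and to be unique.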

root
 |-- context: struct (nullable = true)
 |   |-- key: string (nullable = true)
 |   |-- data: struct (nullable = true)
 |   |    |-- dimensions: array (nullable = true)
 |   |    |    |-- element: struct (containsNull = true)
 |   |    |    |    |-- element1: string (nullable = true)
 |   |    |    |    |-- element2: string (nullable = true)
 |   |    |    |    |-- element3: string (nullable = true)
 |   |    |    |    |-- ***     : string (nullable = true)
 |   |    |    |    |-- elementN: string (nullable = true)

How can I transform it into a schema like this?

root
 |-- context: struct (nullable = true)
 |   |-- key: string (nullable = true)
 |   |-- element1: string (nullable = true)
 |   |-- element3: string (nullable = true)

Many thanks.

【Comments on the question】:

【Answer 1】:

Try the explode function. The following links cover it in more detail:

Extract columns in nested Spark DataFrame

Extract value from structure within an array of arrays in spark using scala

【Comments】:
