How to convert nested JSON to a DataFrame [duplicate]
Posted: 2019-06-17 17:36:21
Question: I have JSON data and I want to convert it into a DataFrame.
[
FlierNumber:,BaggageTypeReturn:,FirstName:K,Title:1,MiddleName:D,LastName:Gupta,MealTypeOnward:,DateOfBirth:,BaggageTypeOnward:,SeatTypeOnward:,MealTypeReturn:,FrequentAirline:null,Type:A,SeatTypeReturn:,
FlierNumber:,BaggageTypeReturn:,FirstName:Sweety,Title:2,MiddleName:,LastName:Gupta,MealTypeOnward:,DateOfBirth:,BaggageTypeOnward:,SeatTypeOnward:,MealTypeReturn:,FrequentAirline:null,Type:A,SeatTypeReturn:
]
Comments:
Please see ***.com/questions/44456076/…
Answer 1:
The JSON you provided above is not valid. Here is a syntactically correct version:
[{"FlierNumber":"","BaggageTypeReturn":"","FirstName":"K","Title":"1","MiddleName":"D","LastName":"Gupta","MealTypeOnward":"","DateOfBirth":"","BaggageTypeOnward":"","SeatTypeOnward":"","MealTypeReturn":"","FrequentAirline":"null","Type":"A","SeatTypeReturn":""},{"FlierNumber":"","BaggageTypeReturn":"","FirstName":"Sweety","Title":"2","MiddleName":"","LastName":"Gupta","MealTypeOnward":"","DateOfBirth":"","BaggageTypeOnward":"","SeatTypeOnward":"","MealTypeReturn":"","FrequentAirline":"null","Type":"A","SeatTypeReturn":""}]
If the JSON is stored in a file, you can read it directly in Spark:
val jsonDF = spark.read.json("filepath/sample.json")
jsonDF.printSchema()
jsonDF.show
The result is:
root
|-- BaggageTypeOnward: string (nullable = true)
|-- BaggageTypeReturn: string (nullable = true)
|-- DateOfBirth: string (nullable = true)
|-- FirstName: string (nullable = true)
|-- FlierNumber: string (nullable = true)
|-- FrequentAirline: string (nullable = true)
|-- LastName: string (nullable = true)
|-- MealTypeOnward: string (nullable = true)
|-- MealTypeReturn: string (nullable = true)
|-- MiddleName: string (nullable = true)
|-- SeatTypeOnward: string (nullable = true)
|-- SeatTypeReturn: string (nullable = true)
|-- Title: string (nullable = true)
|-- Type: string (nullable = true)
+-----------------+-----------------+-----------+---------+-----------+---------------+--------+--------------+--------------+----------+--------------+--------------+-----+----+
|BaggageTypeOnward|BaggageTypeReturn|DateOfBirth|FirstName|FlierNumber|FrequentAirline|LastName|MealTypeOnward|MealTypeReturn|MiddleName|SeatTypeOnward|SeatTypeReturn|Title|Type|
+-----------------+-----------------+-----------+---------+-----------+---------------+--------+--------------+--------------+----------+--------------+--------------+-----+----+
|                 |                 |           |        K|           |           null|   Gupta|              |              |         D|              |              |    1|   A|
|                 |                 |           |   Sweety|           |           null|   Gupta|              |              |          |              |              |    2|   A|
+-----------------+-----------------+-----------+---------+-----------+---------------+--------+--------------+--------------+----------+--------------+--------------+-----+----+
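The file-based read above works when the whole JSON array sits on a single line. Two common variations are sketched below, as an illustration rather than a definitive implementation: parsing JSON that is already held in a string, and reading a file whose array is pretty-printed across several lines. The object name `NestedJsonExample`, the helper names `fromString` and `fromMultiLineFile`, and the shortened three-field sample record are assumptions made for this sketch; it also assumes Spark 2.2 or later, where `spark.read.json` accepts a `Dataset[String]` and the `multiLine` option is available.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NestedJsonExample {

  // Hypothetical helper: parse a JSON document that is already in memory.
  // A top-level array of objects yields one row per object.
  def fromString(spark: SparkSession, json: String): DataFrame = {
    import spark.implicits._ // provides toDS() on Seq[String]
    spark.read.json(Seq(json).toDS())
  }

  // Hypothetical helper: read a file whose JSON array spans several lines.
  // Without multiLine, Spark expects each line to be a complete JSON value.
  def fromMultiLineFile(spark: SparkSession, path: String): DataFrame =
    spark.read.option("multiLine", true).json(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // local mode, for experimentation only
      .appName("nested-json-example")
      .getOrCreate()

    // Shortened sample: only three of the fields from the question.
    val sample =
      """[{"FirstName":"K","LastName":"Gupta","Title":"1"},
        |{"FirstName":"Sweety","LastName":"Gupta","Title":"2"}]""".stripMargin

    val df = fromString(spark, sample)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

In the spark-shell a `spark` session already exists, so the two helpers can be called directly without the `main` method.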