How to convert nested JSON to a DataFrame [duplicate]

Posted: 2019-06-17 17:36:21

Question:

I have JSON data and I want to convert it into a DataFrame.

[
FlierNumber:,BaggageTypeReturn:,FirstName:K,Title:1,MiddleName:D,LastName:Gupta,MealTypeOnward:,DateOfBirth:,BaggageTypeOnward:,SeatTypeOnward:,MealTypeReturn:,FrequentAirline:null,Type:A,SeatTypeReturn:,
FlierNumber:,BaggageTypeReturn:,FirstName:Sweety,Title:2,MiddleName:,LastName:Gupta,MealTypeOnward:,DateOfBirth:,BaggageTypeOnward:,SeatTypeOnward:,MealTypeReturn:,FrequentAirline:null,Type:A,SeatTypeReturn:
]

Comments:

Please see ***.com/questions/44456076/…

Answer 1:

The JSON you posted above is not valid. Here is a syntactically correct version:

[{"FlierNumber":"","BaggageTypeReturn":"","FirstName":"K","Title":"1","MiddleName":"D","LastName":"Gupta","MealTypeOnward":"","DateOfBirth":"","BaggageTypeOnward":"","SeatTypeOnward":"","MealTypeReturn":"","FrequentAirline":"null","Type":"A","SeatTypeReturn":""},{"FlierNumber":"","BaggageTypeReturn":"","FirstName":"Sweety","Title":"2","MiddleName":"","LastName":"Gupta","MealTypeOnward":"","DateOfBirth":"","BaggageTypeOnward":"","SeatTypeOnward":"","MealTypeReturn":"","FrequentAirline":"null","Type":"A","SeatTypeReturn":""}]

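If the JSON is stored in a file, you can read it directly in Spark:

  // Read the file into a DataFrame; Spark infers the schema from the data
  val jsonDF = spark.read.json("filepath/sample.json")
  jsonDF.printSchema()  // print the inferred schema
  jsonDF.show()         // print the rows as a table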

The result is:

root
 |-- BaggageTypeOnward: string (nullable = true)
 |-- BaggageTypeReturn: string (nullable = true)
 |-- DateOfBirth: string (nullable = true)
 |-- FirstName: string (nullable = true)
 |-- FlierNumber: string (nullable = true)
 |-- FrequentAirline: string (nullable = true)
 |-- LastName: string (nullable = true)
 |-- MealTypeOnward: string (nullable = true)
 |-- MealTypeReturn: string (nullable = true)
 |-- MiddleName: string (nullable = true)
 |-- SeatTypeOnward: string (nullable = true)
 |-- SeatTypeReturn: string (nullable = true)
 |-- Title: string (nullable = true)
 |-- Type: string (nullable = true)


+-----------------+-----------------+-----------+---------+-----------+---------------+--------+--------------+--------------+----------+--------------+--------------+-----+----+
|BaggageTypeOnward|BaggageTypeReturn|DateOfBirth|FirstName|FlierNumber|FrequentAirline|LastName|MealTypeOnward|MealTypeReturn|MiddleName|SeatTypeOnward|SeatTypeReturn|Title|Type|
+-----------------+-----------------+-----------+---------+-----------+---------------+--------+--------------+--------------+----------+--------------+--------------+-----+----+
|                 |                 |           |        K|           |           null|   Gupta|              |              |         D|              |              |    1|   A|
|                 |                 |           |   Sweety|           |           null|   Gupta|              |              |          |              |              |    2|   A|
+-----------------+-----------------+-----------+---------+-----------+---------------+--------+--------------+--------------+----------+--------------+--------------+-----+----+
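If the JSON is held in a string rather than in a file, or if the file is pretty-printed across several lines, the same reader can still be used. The following is a minimal sketch, assuming Spark 2.2 or later and a local master. The object name `JsonToDataFrameSketch`, the helpers `fromString` and `fromMultiLineFile`, and the shortened sample records are hypothetical and used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JsonToDataFrameSketch {
  // Parse JSON text that is already in memory. Each string in the Dataset is
  // parsed as one JSON document; a top-level array yields one row per element.
  def fromString(spark: SparkSession, json: String): DataFrame = {
    import spark.implicits._ // provides the Encoder needed by toDS()
    spark.read.json(Seq(json).toDS())
  }

  // Read a file whose JSON document spans several lines (e.g. pretty-printed).
  // Without the multiLine option Spark expects one complete document per line.
  def fromMultiLineFile(spark: SparkSession, path: String): DataFrame =
    spark.read.option("multiLine", true).json(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-to-dataframe-sketch")
      .master("local[*]") // assumption: running locally for illustration
      .getOrCreate()

    // Shortened version of the two records from the question
    val json =
      """[{"FirstName":"K","MiddleName":"D","LastName":"Gupta","Title":"1","Type":"A"},""" +
      """{"FirstName":"Sweety","MiddleName":"","LastName":"Gupta","Title":"2","Type":"A"}]"""

    val df = fromString(spark, json)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

In `spark-shell`, where `spark` is already defined, the body of `fromString` can be run directly without the surrounding object.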
