Bringing this JSON file into R as a data frame
How can I bring this JSON file (https://ix.cnn.io/data/novel-coronavirus-2019-ncov/us/historical.min.json) into R as a data frame?
I have tried several approaches without success.
Answer
b <- jsonlite::fromJSON('https://ix.cnn.io/data/novel-coronavirus-2019-ncov/us/historical.min.json')
tidyr::unnest(b$data, cols = "data")
# # A tibble: 2,233 x 6
# usps name fips date cases deaths
# <chr> <chr> <chr> <chr> <int> <int>
# 1 GU Guam 66 2020-03-16 3 0
# 2 GU Guam 66 2020-03-17 3 0
# 3 GU Guam 66 2020-03-18 5 0
# 4 GU Guam 66 2020-03-19 12 0
# 5 GU Guam 66 2020-03-20 14 0
# 6 GU Guam 66 2020-03-21 15 0
# 7 GU Guam 66 2020-03-22 27 1
# 8 GU Guam 66 2020-03-23 29 1
# 9 GU Guam 66 2020-03-24 32 1
# 10 GU Guam 66 2020-03-25 37 1
# # ... with 2,223 more rows
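Note that because `AS` has no data (see the structure further down: its nested frame has 0 observations), unnest() drops it from the result. The code below first confirms that the row is missing and then works around it: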
library(dplyr)  # provides %>% and filter()
library(tidyr)  # provides unnest()
unnest(b$data, cols = "data") %>%
  filter(usps == "AS")
# # A tibble: 0 x 6
# # ... with 6 variables: usps <chr>, name <chr>, fips <chr>, date <chr>, cases <int>,
# # deaths <int>
lengths(b$data$data)
# [1] 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
# [46] 3 3 3 3 3 3 3 3 3 3 3 3 3
onegood <- Filter(nrow, b$data$data)[[1]]
head(onegood)
# date cases deaths
# 1 2020-03-16 3 0
# 2 2020-03-17 3 0
# 3 2020-03-18 5 0
# 4 2020-03-19 12 0
# 5 2020-03-20 14 0
# 6 2020-03-21 15 0
onegood <- onegood[NA,][1,]
head(onegood)
# date cases deaths
# NA <NA> NA NA
hasnothing <- lengths(b$data$data) < 1
which(hasnothing)
# [1] 1
b$data$data[ hasnothing ] <- replicate(sum(hasnothing), onegood, simplify = FALSE)
### now prove that we see `AS` data
unnest(b$data, cols = "data") %>%
filter(usps == "AS")
# # A tibble: 1 x 6
# usps name fips date cases deaths
# <chr> <chr> <chr> <chr> <int> <int>
# 1 AS American Samoa 60 <NA> NA NA
unnest(b$data, cols = "data")
# # A tibble: 2,234 x 6
# usps name fips date cases deaths
# <chr> <chr> <chr> <chr> <int> <int>
# 1 AS American Samoa 60 <NA> NA NA
# 2 GU Guam 66 2020-03-16 3 0
# 3 GU Guam 66 2020-03-17 3 0
# 4 GU Guam 66 2020-03-18 5 0
# 5 GU Guam 66 2020-03-19 12 0
# 6 GU Guam 66 2020-03-20 14 0
# 7 GU Guam 66 2020-03-21 15 0
# 8 GU Guam 66 2020-03-22 27 1
# 9 GU Guam 66 2020-03-23 29 1
# 10 GU Guam 66 2020-03-24 32 1
# # ... with 2,224 more rows
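I created `onegood` so that a representative all-NA frame is built programmatically from the current data. Creating it by hand would certainly be easy, but I wanted to stay flexible in case more columns are added later.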
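As a possible shortcut to the manual NA frame above, tidyr 1.0.0 and later offer a `keep_empty` argument on `unnest()`. The sketch below uses a small hand-built tibble (`toy`, with made-up rows standing in for `b$data`) rather than the live feed, so it runs offline. It assumes the empty element is a zero-row data frame, as it is for `AS` in the structure shown in this answer.

```r
library(tidyr)

# Hypothetical stand-in for b$data: one empty nested frame (AS) and one
# populated nested frame (GU). Not the real feed.
toy <- tibble(
  usps = c("AS", "GU"),
  data = list(
    data.frame(),
    data.frame(
      date   = c("2020-03-16", "2020-03-17"),
      cases  = c(3L, 3L),
      deaths = c(0L, 0L)
    )
  )
)

# Default behaviour: the AS row disappears because its nested frame is empty
dropped <- unnest(toy, cols = data)

# keep_empty = TRUE keeps AS as a single row filled with NA
kept <- unnest(toy, cols = data, keep_empty = TRUE)

nrow(dropped)
nrow(kept)
```

On the real data the equivalent call would be `unnest(b$data, cols = "data", keep_empty = TRUE)`, provided the installed tidyr version supports the argument.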
Backfill, for reference, the structure of `b`:
str(b)
# List of 3
# $ lastUpdated : chr "2020-04-15T23:55:39Z"
# $ lastUpdatedStr: chr "April 15, 2020 at 7:55 p.m. ET"
# $ data :'data.frame': 58 obs. of 4 variables:
# ..$ usps: chr [1:58] "AS" "GU" "MP" "PR" ...
# ..$ name: chr [1:58] "American Samoa" "Guam" "Northern Mariana Islands" "Puerto Rico" ...
# ..$ fips: chr [1:58] "60" "66" "69" "72" ...
# ..$ data:List of 58
# .. ..$ :'data.frame': 0 obs. of 0 variables
# .. ..$ :'data.frame': 31 obs. of 3 variables:
# .. .. ..$ date : chr [1:31] "2020-03-16" "2020-03-17" "2020-03-18" "2020-03-19" ...
# .. .. ..$ cases : int [1:31] 3 3 5 12 14 15 27 29 32 37 ...
# .. .. ..$ deaths: int [1:31] 0 0 0 0 0 0 1 1 1 1 ...
# .. ..$ :'data.frame': 16 obs. of 3 variables:
# .. .. ..$ date : chr [1:16] "2020-03-31" "2020-04-01" "2020-04-02" "2020-04-03" ...
# .. .. ..$ cases : int [1:16] 2 6 6 8 8 8 8 8 11 11 ...
# .. .. ..$ deaths: int [1:16] 0 1 1 1 1 1 1 2 2 2 ...
# <truncated>
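The structure above also explains the `lengths()` output earlier in this answer: applied to a list of data frames, `lengths()` counts columns, not rows, so the empty `AS` frame shows 0 and every other frame shows 3. A minimal offline sketch of the difference, using a hypothetical two-element list (`frames`) in place of `b$data$data`:

```r
# Hypothetical stand-in for b$data$data: one 0 x 0 frame and one 2 x 3 frame
frames <- list(
  data.frame(),
  data.frame(
    date   = c("2020-03-16", "2020-03-17"),
    cases  = c(3L, 3L),
    deaths = c(0L, 0L)
  )
)

n_cols <- lengths(frames)                   # number of columns per frame
n_rows <- vapply(frames, nrow, integer(1))  # number of rows per frame

# Either test flags the empty frame here; the row count is the stricter one,
# since a frame could in principle have columns but no rows
is_empty <- n_rows == 0L
which(is_empty)
```

If a feed ever returned a nested frame with named columns but zero rows, the `n_rows` check would still catch it while the `lengths()` check would not.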
Another answer
I was able to read this JSON file into a data frame, but you also need to unnest the nested list column. Here is what worked for me:
library(jsonlite)  # jsonlite's fromJSON() reads URLs and returns data frames; rjson's would parse the URL string as JSON
library(tidyr)
data <- fromJSON("https://ix.cnn.io/data/novel-coronavirus-2019-ncov/us/historical.min.json")
data <- as_tibble(data$data)
# Unnest the list column; rows whose nested frame is empty (AS) are dropped
df <- data %>% unnest(cols = data)
head(df)
# # A tibble: 6 x 6
# usps name fips date cases deaths
# <chr> <chr> <chr> <chr> <int> <int>
# 1 GU Guam 66 2020-03-16 3 0
# 2 GU Guam 66 2020-03-17 3 0
# 3 GU Guam 66 2020-03-18 5 0
# 4 GU Guam 66 2020-03-19 12 0
# 5 GU Guam 66 2020-03-20 14 0
# 6 GU Guam 66 2020-03-21 15 0
Another answer
Similar to ORStudent's answer above, but with extra steps, and it does not drop rows whose nested data cannot be unnested:
library(jsonlite)
library(dplyr)
url <- "https://ix.cnn.io/data/novel-coronavirus-2019-ncov/us/historical.min.json"
# Read the JSON into R as a list
url_data <- jsonlite::fromJSON(url)
# Extract the actual data from the JSON
data <- url_data$data
# Create a dummy id for the data (id is the row number)
data$id <- seq_len(nrow(data))
# Tag each non-empty nested data frame with the id of its parent row
nested <- lapply(data$id, function(i) {
  temp <- as.data.frame(data$data[[i]], stringsAsFactors = FALSE) # Convert the list element to a data frame
  # Empty frames yield NULL, which bind_rows() skips
  if (nrow(temp) > 0) {
    temp$id <- i
    temp
  }
})
# Stack the tagged frames into one data frame holding all the list data
list_data <- bind_rows(nested)
# Merge the list data into the top-level data frame on id; all.x = TRUE keeps rows that have no nested data
all_data <- merge(data, list_data, all.x = TRUE, by = "id")
head(all_data)
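
The reason this approach keeps `AS` is the `all.x = TRUE` argument to `merge()`, which turns the merge into a left join. The following sketch illustrates that step offline with two small hand-made data frames (`parents` and `children` are hypothetical names and values, not part of the feed):

```r
# Hypothetical top-level rows: AS has no child rows, GU has two
parents <- data.frame(
  id   = 1:2,
  usps = c("AS", "GU"),
  stringsAsFactors = FALSE
)
children <- data.frame(
  id    = c(2L, 2L),
  date  = c("2020-03-16", "2020-03-17"),
  cases = c(3L, 3L),
  stringsAsFactors = FALSE
)

# Default merge is an inner join: AS is dropped because it has no match
inner <- merge(parents, children, by = "id")

# all.x = TRUE keeps every row of `parents`, filling unmatched columns with NA
left <- merge(parents, children, by = "id", all.x = TRUE)

nrow(inner)
nrow(left)
```

The same effect could be had with `dplyr::left_join(parents, children, by = "id")`; base `merge()` is used here only to mirror the answer above.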