Converting elements in a nested list to a data frame


【Title】: Converting elements in a nested list to dataframe [duplicate] 【Posted】: 2015-03-11 22:11:51 【Question】:

I have a nested list like the following:

 dput( list(structure(c("123.60", " on"))))

I would like to convert the elements of this nested list into a data frame. For example, the output should look like this:

      code      description      
      123.60    not stated as uncontrolled, with neurological manifestations
      123.50    not stated as uncontrolled, with ophthalmic manifestations
      .
      .
      .
      123.52    uncontrolled, with ophthalmic manifestations 

I need help converting these elements into a data frame.

【Comments】:

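Try using dput to make your question reproducible.

@DavidArenburg, thank you David, I have done that now.

@Science11, did you mean to remove the dput output when you edited the question?

【Answer 1】: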

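This is not exactly a nested list, but rather a list of named character vectors. You can apply as.data.frame.list to each element and then combine the results with rbind. So if x is your list, then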

df <- do.call(rbind, lapply(x, as.data.frame.list, stringsAsFactors = FALSE))
## below is optional - converts character columns to appropriate type
## but will also convert some columns back to factors again
df[] <- lapply(df, type.convert) 
df
#      code                                                   description codeSystem codeSystemVersion
# 1  123.60  not stated as uncontrolled, with neurological manifestations     XAZ9CM       XAZ9CM-2012
# 2  123.50    not stated as uncontrolled, with ophthalmic manifestations     XAZ9CM       XAZ9CM-2012
# 3  123.61  not stated as uncontrolled, with neurological manifestations     XAZ9CM       XAZ9CM-2012
# 4   123.7                              peripheral circulatory disorders     XAZ9CM       XAZ9CM-2012
# 5  123.40         not stated as uncontrolled, with renal manifestations     XAZ9CM       XAZ9CM-2012
# 6  123.41         not stated as uncontrolled, with renal manifestations     XAZ9CM       XAZ9CM-2012
# 7   123.5                                     ophthalmic manifestations     XAZ9CM       XAZ9CM-2012
# 8  123.53                  uncontrolled, with ophthalmic manifestations     XAZ9CM       XAZ9CM-2012
# 9  123.52                  uncontrolled, with ophthalmic manifestations     XAZ9CM       XAZ9CM-2012
# 10  123.4                                          renal manifestations     XAZ9CM       XAZ9CM-2012
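On the factor issue mentioned in the code comments above: type.convert takes an as.is argument, and passing as.is = TRUE should keep non-numeric columns as character rather than turning them into factors. A minimal sketch, assuming a two-element stand-in for x with shortened, illustrative values (not the original data):

```r
# Two-element stand-in for the list x (shortened, illustrative values)
x <- list(
  c(code = "123.60", description = "neurological manifestations"),
  c(code = "123.50", description = "ophthalmic manifestations")
)

df <- do.call(rbind, lapply(x, as.data.frame.list, stringsAsFactors = FALSE))

# as.is = TRUE: numeric-looking columns become numeric,
# all other columns stay character instead of becoming factors
df[] <- lapply(df, type.convert, as.is = TRUE)
str(df)
```

Using df[] on the left-hand side keeps df a data frame while replacing every column in place.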

Update: you can also do

data.frame(do.call(rbind, x), stringsAsFactors=FALSE)
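To see why this one-liner works, it may help to look at the intermediate result: rbind on named character vectors builds a character matrix whose column names come from the vectors' names, and data.frame then turns each matrix column into a data frame column. A small sketch with a shortened, illustrative stand-in for x, assuming every element has the same length and the same name order (rbind matches by position, not by name):

```r
# Two-element stand-in for the list x (shortened, illustrative values)
x <- list(
  c(code = "123.60", description = "neurological manifestations"),
  c(code = "123.50", description = "ophthalmic manifestations")
)

m <- do.call(rbind, x)   # character matrix, one row per list element
colnames(m)              # column names taken from the vectors' names

df <- data.frame(m, stringsAsFactors = FALSE)
str(df)                  # every column is still character at this point
```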

Other, possibly more efficient, options include

library(data.table)
rbindlist(lapply(x, as.list))

library(dplyr)
bind_rows(lapply(x, as.data.frame.list, stringsAsFactors=FALSE))

and (thanks to Ananda Mahto)

library(stringi)
data.frame(stri_list2matrix(x, byrow=TRUE), stringsAsFactors=FALSE)

All of these still require a type conversion on the first column if you want it to be numeric.
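A minimal sketch of that last conversion step, assuming the result is stored in a data frame called df with character columns (the two rows here are an illustrative subset). It also trims the leading spaces that some of the description strings carry, which is an optional extra and not part of the approaches above:

```r
# Illustrative result of any of the approaches above: all columns are character
df <- data.frame(
  code = c("123.60", "123.5"),
  description = c(" not stated as uncontrolled, with neurological manifestations",
                  "ophthalmic manifestations"),
  stringsAsFactors = FALSE
)

df$code <- as.numeric(df$code)            # convert the first column to numeric
df$description <- trimws(df$description)  # optional: drop leading/trailing spaces
str(df)
```

Note that the numeric conversion drops trailing zeros, so "123.60" becomes 123.6. If the codes are identifiers in which 123.60 and 123.6 are distinct, it is probably safer to leave the column as character.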

Also, the data for this question seems to have disappeared, so here it is, copied from the edit history.

 x <- list(structure(c("123.60", " not stated as uncontrolled, with neurological manifestations",                 
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.50", " not stated as uncontrolled, with ophthalmic manifestations",  
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.61", "not stated as uncontrolled, with neurological manifestations", 
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.7", "peripheral circulatory disorders",                              
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.40", " not stated as uncontrolled, with renal manifestations",       
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.41", " not stated as uncontrolled, with renal manifestations",       
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.5", "ophthalmic manifestations",                                     
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.53", "uncontrolled, with ophthalmic manifestations",                 
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.52", " uncontrolled, with ophthalmic manifestations",                
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")), structure(c("123.4", "renal manifestations",                                          
     "XAZ9CM", "XAZ9CM-2012"), .Names = c("code", "description", "codeSystem",                                    
     "codeSystemVersion")))   

【Comments】:

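@Science11 - Good! I added type.convert to make sure the numeric columns end up numeric rather than character.

I had never heard of type.convert before...

@DavidArenburg - It's pretty neat. It tries to convert a character vector to the appropriate type. Actually, now that I think about it, it may not be appropriate here, because it just converts the strings back to factors again. Ha.

Yes, thanks, I know how to read the documentation :) Where did you find this gem?

@DavidArenburg - Actually, I first saw it in Ananda Mahto's code.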
