dplyr transmute returning fewer rows than the original data frame
【Title】: dplyr transmute returning fewer rows than the original data frame
【Posted】: 2015-04-23 09:59:32
【Question】: I need to get a 4-row summary of a grouped dataset (basically a square around the set of data points in a subset of the data frame).
A function:
myfun <- function(F1, F2) {
  # Returns a fixed 4-row data frame of corner points; F1 and F2 are not used in this example.
  out <- structure(list(f2 = c(1097.81431421448, 2331.43870452636, 2154.84583430979,
  1210.68973077198), f1 = c(411.462078942253, 334.070858898298,
  834.761924536241, 782.569047430496)), .Names = c("f2", "f1"), row.names = c(NA,
  4L), class = "data.frame")
  return(out)
}
A dataset:
pb2 <-
structure(list(Type = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("c",
"m", "w"), class = "factor"), Sex = structure(c(2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("f", "m"), class = "factor"), Speaker = c("1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1"), Vowel = structure(c(8L, 8L, 7L,
7L, 5L, 5L, 2L, 2L, 3L, 3L, 1L, 1L, 4L, 4L, 9L, 9L, 10L, 10L,
6L, 6L), .Label = c("aa", "ae", "ah", "ao", "eh", "er", "ih",
"iy", "uh", "uw"), class = "factor"), IPA = structure(c(9L, 9L,
7L, 7L, 4L, 4L, 1L, 1L, 8L, 8L, 2L, 2L, 3L, 3L, 6L, 6L, 10L,
10L, 5L, 5L), .Label = c("\\ae", "\\as", "\\ct", "\\ef", "\\er\\hr",
"\\hs", "\\ic", "\\vt", "i", "u"), class = "factor"), F0 = c(160L,
186L, 203L, 192L, 161L, 155L, 140L, 180L, 144L, 148L, 148L, 170L,
161L, 158L, 163L, 190L, 160L, 157L, 177L, 164L), F1 = c(240L,
280L, 390L, 310L, 490L, 570L, 560L, 630L, 590L, 620L, 740L, 800L,
600L, 660L, 440L, 400L, 240L, 270L, 370L, 460L), F2 = c(2280L,
2400L, 2030L, 1980L, 1870L, 1700L, 1820L, 1700L, 1250L, 1300L,
1070L, 1060L, 970L, 980L, 1120L, 1070L, 1040L, 930L, 1520L, 1330L
), F3 = c(2850L, 2790L, 2640L, 2550L, 2420L, 2600L, 2660L, 2550L,
2620L, 2530L, 2490L, 2640L, 2280L, 2220L, 2210L, 2280L, 2150L,
2280L, 1670L, 1590L)), .Names = c("Type", "Sex", "Speaker", "Vowel",
"IPA", "F0", "F1", "F2", "F3"), row.names = c(NA, 20L), class = "data.frame")
Summarising with dplyr:
library(dplyr)
> pb %>% group_by(Type,Sex) %>% transmute(F1=myfun(F1,F2)["f1"])
Source: local data frame [1,520 x 3]
Groups: Type, Sex
Type Sex F1
1 m m <dbl[4]>
2 m m <dbl[4]>
3 m m <dbl[4]>
4 m m <dbl[4]>
5 m m <dbl[4]>
6 m m <dbl[4]>
7 m m <dbl[4]>
8 m m <dbl[4]>
9 m m <dbl[4]>
10 m m <dbl[4]>
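The function returns a data frame column, but the results are not stacked together the way I expected: each row holds a 4-element list entry instead. How do I get these values stacked on top of each other?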
【Answer 1】: You are almost there. Just unnest what you have:
library(tidyr)
pb2 %>%
group_by(Type,Sex) %>%
transmute(F1=myfun(F1,F2)["f1"]) %>%
unnest(F1)
Output:
# Source: local data frame [80 x 3]
#
# Type Sex F1
# 1 m m 411.4621
# 2 m m 334.0709
# 3 m m 834.7619
# 4 m m 782.5690
# 5 m m 411.4621
# 6 m m 334.0709
# 7 m m 834.7619
# 8 m m 782.5690
# 9 m m 411.4621
# 10 m m 334.0709
# .. ... ... ...
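With a more recent dplyr, a per-group function that returns several rows might be stacked directly, without the list-column and unnest step. The sketch below is not from the original answer. It assumes dplyr 1.1.0 or later (where `reframe()` is available), and both `bbox_corners()` and the `toy` data frame are hypothetical stand-ins for `myfun()` and `pb2`.

```r
library(dplyr)

# Hypothetical stand-in for myfun(): the four corners of the bounding box
# around the (F1, F2) points it receives.
bbox_corners <- function(F1, F2) {
  data.frame(
    f1 = c(min(F1), min(F1), max(F1), max(F1)),
    f2 = c(min(F2), max(F2), max(F2), min(F2))
  )
}

# Small made-up data set with two groups (not the poster's data).
toy <- data.frame(
  Type = c("m", "m", "m", "w", "w", "w"),
  F1   = c(240, 390, 490, 310, 560, 630),
  F2   = c(2280, 2030, 1870, 1980, 1820, 1700)
)

# reframe() accepts any number of rows per group; an unnamed data frame
# result is unpacked into columns, so the four rows per group are stacked.
corners <- toy %>%
  group_by(Type) %>%
  reframe(bbox_corners(F1, F2))

print(corners)
```

Unlike `transmute()`, which expects one value per input row, `reframe()` places no constraint on the number of rows returned per group, and it returns an ungrouped result.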