R: Pivoting using the 'spread' function
Posted: 2015-07-09 06:10:45
Question:
Continuing from my previous post, I now have one additional column of ID values, and I need to pivot rows into columns using these values.
NUM <- c(1,2,3,1,2,3,1,2,3,1)
ID <- c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48")
Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
Points <- c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)
df1 <- data.frame(ID,NUM,Type,Points)
df1:
+------+-----+------+--------+
| ID | Num | Type | Points |
+------+-----+------+--------+
| DJ45 | 1 | A | 9.2 |
| DJ45 | 2 | F | 60.8 |
| DJ45 | 3 | C | 22.9 |
| DJ46 | 1 | B | 1012.7 |
| DJ46 | 2 | D | 18.7 |
| DJ46 | 3 | A | 11.1 |
| DJ47 | 1 | E | 67.2 |
| DJ47 | 2 | C | 63.1 |
| DJ47 | 3 | F | 16.7 |
| DJ48 | 1 | D | 58.4 |
+------+-----+------+--------+
My desired output is:
+------+-----+------+--------+------+------+------+------+
| ID | Num | A | B | C | D | E | F |
+------+-----+------+--------+------+------+------+------+
| DJ45 | 1 | 9.2 | N/A | N/A | N/A | N/A | N/A |
| DJ45 | 2 | N/A | N/A | N/A | N/A | N/A | 60.8 |
| DJ45 | 3 | N/A | N/A | 22.9 | N/A | N/A | N/A |
| DJ46 | 1 | N/A | 1012.7 | N/A | N/A | N/A | N/A |
| DJ46 | 2 | N/A | N/A | N/A | 18.7 | N/A | N/A |
| DJ46 | 3 | 11.1 | N/A | N/A | N/A | N/A | N/A |
| DJ47 | 1 | N/A | N/A | N/A | N/A | 67.2 | N/A |
| DJ47 | 2 | N/A | N/A | 63.1 | N/A | N/A | N/A |
| DJ47 | 3 | N/A | N/A | N/A | N/A | N/A | 16.7 |
| DJ48 | 1 | N/A | N/A | N/A | 58.4 | N/A | N/A |
+------+-----+------+--------+------+------+------+------+
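I am using the spread function in R, but I get an error about duplicate identifiers. This is because I now have 2 identifier columns (ID and NUM) instead of the single column (NUM) I had before. Please let me know how I should handle this.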
Answer 1:
Without knowing what you tried, I would suggest:
library(tidyr)
spread(df1, Type, Points)
# ID NUM A B C D E F
# 1 DJ45 1 9.2 NA NA NA NA NA
# 2 DJ45 2 NA NA NA NA NA 60.8
# 3 DJ45 3 NA NA 22.9 NA NA NA
# 4 DJ46 1 NA 1012.7 NA NA NA NA
# 5 DJ46 2 NA NA NA 18.7 NA NA
# 6 DJ46 3 11.1 NA NA NA NA NA
# 7 DJ47 1 NA NA NA NA 67.2 NA
# 8 DJ47 2 NA NA 63.1 NA NA NA
# 9 DJ47 3 NA NA NA NA NA 16.7
# 10 DJ48 1 NA NA NA 58.4 NA NA
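In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). As a minimal sketch under that assumption (the name `wide` is only illustrative, and `names_sort = TRUE` assumes tidyr 1.1.0 or newer), the same reshaping can be written as:

```r
library(tidyr)

# Same example data as in the question
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
)

# One column per Type, filled with Points; missing combinations become NA.
# names_sort = TRUE orders the new columns alphabetically, as spread() does.
wide <- pivot_wider(df1,
                    names_from  = Type,
                    values_from = Points,
                    names_sort  = TRUE)
wide
```

Without `names_sort = TRUE`, pivot_wider() orders the new columns by first appearance in `Type` rather than alphabetically.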
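If you are getting an error about duplicate identifiers, it is because your actual data has one or more duplicated entries for some combination of "ID" and "NUM" (your sample data has none).
If that is the case, you need to add another column to make the rows unique.
Adding dplyr to the chain, it might look something like: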
df1 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%
  spread(Type, Points)
A demonstration of the assumed error:
df2 <- rbind(df1, df1[1:3, ]) ## Duplicate the first three rows
spread(df2, Type, Points)
# Error: Duplicate identifiers for rows (1, 11), (3, 13), (2, 12)
library(dplyr)
df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%
  spread(Type, Points)
# Source: local data frame [13 x 9]
#
# ID NUM id2 A B C D E F
# 1 DJ45 1 1 9.2 NA NA NA NA NA
# 2 DJ45 1 2 9.2 NA NA NA NA NA
# 3 DJ45 2 1 NA NA NA NA NA 60.8
# 4 DJ45 2 2 NA NA NA NA NA 60.8
# 5 DJ45 3 1 NA NA 22.9 NA NA NA
# 6 DJ45 3 2 NA NA 22.9 NA NA NA
# 7 DJ46 1 1 NA 1012.7 NA NA NA NA
# 8 DJ46 2 1 NA NA NA 18.7 NA NA
# 9 DJ46 3 1 11.1 NA NA NA NA NA
# 10 DJ47 1 1 NA NA NA NA 67.2 NA
# 11 DJ47 2 1 NA NA 63.1 NA NA NA
# 12 DJ47 3 1 NA NA NA NA NA 16.7
# 13 DJ48 1 1 NA NA NA 58.4 NA NA
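Before adding a helper column, it can be useful to see which rows actually collide. The sketch below rebuilds the `df2` from the demonstration and counts each identifier combination; it assumes that spread() treats every column other than the key and value columns as identifiers, so the check covers ID, NUM, and Type together. The name `dups` is only illustrative.

```r
library(dplyr)

# Same example data as in the question
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
)

# Duplicate the first three rows, as in the demonstration above
df2 <- rbind(df1, df1[1:3, ])

# Count rows per identifier combination and keep those that occur more than once
dups <- df2 %>%
  count(ID, NUM, Type) %>%
  filter(n > 1)
dups
```

If `dups` comes back empty, the duplicate-identifier error would have to come from somewhere else in the real data.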
Comments:
It seems you don't like row_number() as the dplyr alternative to sequence(n())..?
@docendodiscimus, yet another function I have to remember :-) Feel free to edit when I make these mistakes!
@AnandaMahto Works perfectly. Just as I expected :)
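The row_number() suggestion from the comments can be sketched as follows. This is an illustrative variant rather than the answer's own code: it rebuilds the duplicated data frame `df2`, and the added ungroup() call and the name `wide2` are assumptions made here for clarity.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df1 <- data.frame(
  ID     = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
)

# Duplicate the first three rows to provoke the duplicate-identifier case
df2 <- rbind(df1, df1[1:3, ])

wide2 <- df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = row_number()) %>%  # running index within each ID/NUM group
  ungroup() %>%                   # drop grouping before reshaping
  spread(Type, Points)
wide2
```

Within each group, row_number() yields the same 1, 2, ... index that sequence(n()) produces, so the reshaped result has the same shape as the output shown in the answer.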