R: Sort the Minimums of My Data Frame in This Order
【Posted】: 2021-10-09 02:21:30
【Question】: I have the following data frame:
library(future.apply)
lb <- 2:9
NBB_AR0.8 <- c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369)
NBB_AR0.9 <- c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441)
NBB_AR0.95 <- c(0.7456, 1.249, 0.8531, 1.573, 1.425, 1.181, 0.8645, 0.5171)
MBB1_AR0.8 <- c(1.806, 1.611, 1.199, 1.46, 1.253, 1.483, 1.418,1.615)
MBB1_AR0.9 <- c(0.7936, 0.7351, 0.9151, 0.9417, 0.9827, 0.9767, 0.8699, 0.9629)
MBB1_AR0.95 <- c(1.646, 1.621, 0.9941, 0.9044, 1.054, 1.247, 1.376, 1.281)
MBB2_AR0.8 <- c(1.806, 1.611, 1.199, 1.46, 1.619, 1.483, 1.498, 1.301)
MBB2_AR0.9 <- c(0.7936, 0.7351, 0.9151, 0.9417, 0.9653, 0.9767, 1.051, 0.9979)
MBB2_AR0.95 <- c(1.646, 1.621, 0.9941, 0.9044, 1.531, 1.247, 1.03, 0.9696)
MBB3_AR0.8 <- c(1.806, 1.611, 1.199, 1.46, 1.363, 1.483, 1.742, 1.161)
MBB3_AR0.9 <- c(0.7936, 0.7351, 0.9151, 0.9417, 1.025, 0.9767, 0.9018, 0.6612)
MBB3_AR0.95 <- c(1.646, 1.621, 0.9941, 0.9044, 0.861, 1.247, 1.184, 0.8825)
CBB_AR0.8 <- c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172)
CBB_AR0.9 <- c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
CBB_AR0.95 <- c(0.1039, 0.08983, 0.09176, 0.1, 0.09203, 0.08383, 0.08386, 0.08956)
df <- data.frame(lb, NBB_AR0.8, NBB_AR0.9, NBB_AR0.95, NBB_AR0.95, MBB1_AR0.8, MBB1_AR0.9, MBB1_AR0.95, MBB2_AR0.8, MBB2_AR0.9, MBB2_AR0.95, MBB3_AR0.8, MBB3_AR0.9, MBB3_AR0.95, CBB_AR0.8, CBB_AR0.9, CBB_AR0.95)
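The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369
The minimum of vector NBB_AR0.9 is min(NBB_AR0.9) = 0.4441
The minimum of vector NBB_AR0.95 is min(NBB_AR0.95) = 0.5171
All three of these contain NBB, so they belong in the NBB row.
The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369
The minimum of vector MBB1_AR0.8 is min(MBB1_AR0.8) = 1.199
The minimum of vector MBB2_AR0.8 is min(MBB2_AR0.8) = 1.199
The minimum of vector MBB3_AR0.8 is min(MBB3_AR0.8) = 1.161
The minimum of vector CBB_AR0.8 is min(CBB_AR0.8) = 0.9616
All five of these contain AR0.8, so they belong in the AR0.8 column.
The remaining vectors follow the same arrangement.
I would like to use R to arrange the minimums as follows: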
| | AR0.8 | AR0.9 | AR0.95 |
|---|---|---|---|
| NBB | 0.9369 | 0.4441 | 0.5171 |
| MBB1 | 1.199 | 0.7351 | 0.9044 |
| MBB2 | 1.199 | 0.7351 | 0.9044 |
| MBB3 | 1.161 | 0.6612 | 0.861 |
| CBB | 0.9616 | 0.1516 | 0.08383 |
I tried the following, but the result is not arranged the way I want:
future.apply::future_apply(df[-1], 2, min)
> NBB_AR0.8 NBB_AR0.9 NBB_AR0.95 NBB_AR0.95.1 MBB1_AR0.8 MBB1_AR0.9 MBB1_AR0.95 MBB2_AR0.8 MBB2_AR0.9 MBB2_AR0.95 MBB3_AR0.8
0.93690 0.44410 0.51710 0.51710 1.19900 0.73510 0.90440 1.19900 0.73510 0.90440 1.16100
MBB3_AR0.9 MBB3_AR0.95 CBB_AR0.8 CBB_AR0.9 CBB_AR0.95
0.66120 0.86100 0.96160 0.15160 0.08383
The values are correct, but I also care about the arrangement.
I am also interested in this call:
future.apply::future_apply(df[-1], 2, which.min)
which gives me this (column names as in my full data set):
  NBB_N10_AR0.8_RMSE   NBB_N10_AR0.9_RMSE  NBB_N10_AR0.95_RMSE NBB_N10_AR0.95_RMSE.1  MBB1_N10_AR0.8_RMSE  MBB1_N10_AR0.9_RMSE
                   8                    8                    8                     8                    3                    2
MBB1_N10_AR0.95_RMSE  MBB2_N10_AR0.8_RMSE  MBB2_N10_AR0.9_RMSE  MBB2_N10_AR0.95_RMSE  MBB3_N10_AR0.8_RMSE  MBB3_N10_AR0.9_RMSE
                   4                    3                    2                     4                    8                    8
MBB3_N10_AR0.95_RMSE   CBB_N10_AR0.8_RMSE   CBB_N10_AR0.9_RMSE   CBB_N10_AR0.95_RMSE
                   5                    2                    7                     6
I would like it arranged as the table below, showing the lb value at which each minimum occurs:
| | AR0.8 | AR0.9 | AR0.95 |
|---|---|---|---|
| NBB | 9 | 9 | 9 |
| MBB1 | 4 | 3 | 5 |
| MBB2 | 4 | 3 | 5 |
| MBB3 | 9 | 9 | 6 |
| CBB | 3 | 8 | 7 |
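The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369, at lb = 9
The minimum of vector NBB_AR0.9 is min(NBB_AR0.9) = 0.4441, at lb = 9
The minimum of vector NBB_AR0.95 is min(NBB_AR0.95) = 0.5171, at lb = 9
All three of these contain NBB, so they belong in the NBB row.
The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369, at lb = 9
The minimum of vector MBB1_AR0.8 is min(MBB1_AR0.8) = 1.199, at lb = 4
The minimum of vector MBB2_AR0.8 is min(MBB2_AR0.8) = 1.199, at lb = 4
The minimum of vector MBB3_AR0.8 is min(MBB3_AR0.8) = 1.161, at lb = 9
The minimum of vector CBB_AR0.8 is min(CBB_AR0.8) = 0.9616, at lb = 3
All five of these contain AR0.8, so they belong in the AR0.8 column.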
【Comments】:
Please help me solve this for future.apply::future_apply(df[-1], 2, which.min)
【Answer 1】:
We could use
lst1 <- split(setNames(out, sub(".*_", "", names(out))), sub("_.*", "", names(out)))
do.call(rbind, lapply(lst1, function(x) x[!duplicated(x)]))
Output:
AR0.8 AR0.9 AR0.95
CBB 0.9616 0.1516 0.08383
MBB1 1.1990 0.7351 0.90440
MBB2 1.1990 0.7351 0.90440
MBB3 1.1610 0.6612 0.86100
NBB 0.9369 0.4441 0.51710
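Note that split() orders the groups alphabetically (CBB first, NBB last), whereas the question asks for NBB first and CBB last. Below is a minimal base-R sketch of one way to restore the order in which the methods appear in the data frame. It uses a two-method subset of the question's data, sapply() stands in for future_apply(), and the names method, ar, tab and tab_ordered are illustrative rather than part of the original answer.

```r
# Subset of the question's data (two methods, two AR levels) to keep the sketch short
df <- data.frame(
  lb        = 2:9,
  NBB_AR0.8 = c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369),
  NBB_AR0.9 = c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441),
  CBB_AR0.8 = c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172),
  CBB_AR0.9 = c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
)

# Column minima as a named vector (sapply used here instead of future_apply)
out <- sapply(df[-1], min)

# Split each column name into its method part and its AR part
method <- sub("_.*", "", names(out))  # text before the first underscore
ar     <- sub(".*_", "", names(out))  # text after the last underscore

# Cross-tabulate into a method x AR matrix; min() also collapses duplicated columns
tab <- tapply(out, list(method, ar), min)

# tapply sorts rows alphabetically, so reindex by order of first appearance
tab_ordered <- tab[unique(method), unique(ar), drop = FALSE]
print(tab_ordered)
```

With the full data frame, unique(method) would yield NBB, MBB1, MBB2, MBB3, CBB, which is the row order requested in the question.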
lst2 <- split(setNames(out2, sub(".*_", "", names(out2))), sub("_.*", "", names(out2)))
do.call(rbind, lapply(lst2, `[`, 1:3))
AR0.8 AR0.9 AR0.95
CBB 2 7 6
MBB1 3 2 4
MBB2 3 2 4
MBB3 8 8 5
NBB 8 8 8
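The positions above are the row indices returned by which.min() (1 to 8), while the table requested in the question holds the corresponding lb values (2 to 9). A hedged sketch of one way to map positions to lb, again on a two-method subset of the question's data and with sapply() in place of future_apply(); the names pos, lb_at_min and lb_tab are assumptions made for this illustration.

```r
# Subset of the question's data (two methods, two AR levels)
df <- data.frame(
  lb        = 2:9,
  NBB_AR0.8 = c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369),
  NBB_AR0.9 = c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441),
  CBB_AR0.8 = c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172),
  CBB_AR0.9 = c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
)

# Row position of each column's minimum (1-based index into the data frame)
pos <- sapply(df[-1], which.min)

# Look up the lb value stored at that row position
lb_at_min <- setNames(df$lb[pos], names(pos))

# Arrange into a method x AR table, keeping the order of first appearance
method <- sub("_.*", "", names(lb_at_min))
ar     <- sub(".*_", "", names(lb_at_min))
lb_tab <- tapply(lb_at_min, list(method, ar), function(x) x[1])
lb_tab <- lb_tab[unique(method), unique(ar), drop = FALSE]
print(lb_tab)
```

Because lb starts at 2, each entry here is the which.min() position plus one; indexing df$lb directly avoids hard-coding that offset.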
Data
out <- future.apply::future_apply(df[-1], 2, min)
out2 <- future.apply::future_apply(df[-1], 2, which.min)
【Comments】:
@DanielJames Sorry, I meant that out is the output of future_apply
@DanielJames Do you have any other patterns in the names?
Try strsplit(str1, "_(?=[A-Z0-9]+\\.)", perl = TRUE) with str1 <- c("NBB_AR0.8", "NBB_N10_AR0.8_RMSE", "NBB_N10_AR0.9_RMSE", "NBB_N10_AR0.95_RMSE", "MBB1_N10_AR0.8_RMSE", "MBB1_N10_AR0.9_RMSE", "MBB1_N10_AR0.95_RMSE", "MBB2_N10_AR0.8_RMSE", "MBB2_N10_AR0.9_RMSE", "MBB2_N10_AR0.95_RMSE", "MBB3_N10_AR0.8_RMSE", "MBB3_N10_AR0.9_RMSE", "MBB3_N10_AR0.95_RMSE", "CBB_N10_AR0.8_RMSE", "CBB_N10_AR0.9_RMSE", "CBB_N10_AR0.95_RMSE" )
The output of strsplit is a list. You can do m1 <- do.call(rbind, strsplit(str1, "_(?=[A-Z0-9]+\\.)", perl = TRUE));m1[,1];m1[,2] to get the two components
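The strsplit() suggestion in the comments above can be sketched as follows. The pattern splits at the underscore that is immediately followed by letters or digits and then a dot, which is the underscore in front of the AR part. Three of the names from the comment are used; the final sub() calls that trim the pieces down to bare labels are an added assumption, not something stated in the thread.

```r
# A few of the column names from the comment thread
str1 <- c("NBB_AR0.8", "NBB_N10_AR0.9_RMSE", "CBB_N10_AR0.95_RMSE")

# Split only at the underscore that precedes the "AR0.x" part (lookahead needs perl = TRUE)
parts <- strsplit(str1, "_(?=[A-Z0-9]+\\.)", perl = TRUE)

# Each element has exactly two pieces here, so rbind gives a two-column character matrix
m1 <- do.call(rbind, parts)
m1[, 1]  # "NBB"   "NBB_N10"    "CBB_N10"
m1[, 2]  # "AR0.8" "AR0.9_RMSE" "AR0.95_RMSE"

# Assumption: drop any trailing "_N10" / "_RMSE" to get labels usable as row and column names
method <- sub("_.*", "", m1[, 1])
ar     <- sub("_.*", "", m1[, 2])
```

The resulting method vector can then serve as the grouping variable in split(), and ar as the replacement names.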
You can split by m1[,1] or m1[,2], then change the names using m1[,2]
【Answer 2】:
A tidyverse solution could be
library(tidyr)
library(dplyr)
df %>%
pivot_longer(-c(lb), names_to = c("name", "name2"), names_pattern = "(.*)_(.*)") %>%
select(-lb) %>%
group_by(name, name2) %>%
slice_min(value) %>%
pivot_wider(names_from = name2) %>%
ungroup()
which returns
# A tibble: 5 x 4
name AR0.8 AR0.9 AR0.95
<chr> <dbl> <dbl> <dbl>
1 CBB 0.962 0.152 0.0838
2 MBB1 1.20 0.735 0.904
3 MBB2 1.20 0.735 0.904
4 MBB3 1.16 0.661 0.861
5 NBB 0.937 0.444 0.517
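The pipeline above reports the minimum values. For the second table requested in the question (the lb at which each minimum occurs), a possible variant keeps lb instead of dropping it. This is a sketch on a two-method subset of the question's data, not part of the original answer; with_ties = FALSE is an added assumption so that each group yields exactly one row, and lb_tab is an illustrative name.

```r
library(tidyr)
library(dplyr)

# Subset of the question's data (two methods, two AR levels)
df <- data.frame(
  lb        = 2:9,
  NBB_AR0.8 = c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369),
  NBB_AR0.9 = c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441),
  CBB_AR0.8 = c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172),
  CBB_AR0.9 = c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
)

lb_tab <- df %>%
  # Long format: one row per (lb, method, AR level)
  pivot_longer(-lb, names_to = c("name", "name2"), names_pattern = "(.*)_(.*)") %>%
  group_by(name, name2) %>%
  # Keep the single row holding the smallest value in each group
  slice_min(value, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  # Report lb rather than the minimum itself
  select(name, name2, lb) %>%
  pivot_wider(names_from = name2, values_from = lb)

print(lb_tab)
```

Swapping values_from = lb for values_from = value (and keeping value in select()) would give the table of minima instead.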
【Comments】: