R: Sort the Minimums of My Data Frame in This Order

【Title】R: Sort the Minimums of My Data Frame in This Order 【Posted】2021-10-09 02:21:30 【Question】:

I have the following data frame:

library(future.apply)
lb <- 2:9
NBB_AR0.8 <- c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369)
NBB_AR0.9 <- c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441)
NBB_AR0.95 <- c(0.7456, 1.249, 0.8531, 1.573, 1.425, 1.181, 0.8645, 0.5171)
MBB1_AR0.8 <- c(1.806, 1.611, 1.199, 1.46, 1.253, 1.483, 1.418,1.615)
MBB1_AR0.9 <- c(0.7936, 0.7351, 0.9151, 0.9417, 0.9827, 0.9767, 0.8699, 0.9629)
MBB1_AR0.95 <- c(1.646, 1.621, 0.9941, 0.9044, 1.054, 1.247, 1.376, 1.281)
MBB2_AR0.8 <- c(1.806, 1.611, 1.199, 1.46, 1.619, 1.483, 1.498, 1.301)
MBB2_AR0.9 <- c(0.7936, 0.7351, 0.9151, 0.9417, 0.9653, 0.9767, 1.051, 0.9979)
MBB2_AR0.95 <- c(1.646, 1.621, 0.9941, 0.9044, 1.531, 1.247, 1.03, 0.9696)
MBB3_AR0.8 <- c(1.806, 1.611, 1.199, 1.46, 1.363, 1.483, 1.742, 1.161)
MBB3_AR0.9 <- c(0.7936, 0.7351, 0.9151, 0.9417, 1.025, 0.9767, 0.9018, 0.6612)
MBB3_AR0.95 <- c(1.646, 1.621, 0.9941, 0.9044, 0.861, 1.247, 1.184, 0.8825)
CBB_AR0.8 <- c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172)
CBB_AR0.9 <- c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
CBB_AR0.95 <- c(0.1039, 0.08983, 0.09176, 0.1, 0.09203, 0.08383, 0.08386, 0.08956) 
# Note: NBB_AR0.95 is listed twice, so data.frame() names the second copy NBB_AR0.95.1
df <- data.frame(lb, NBB_AR0.8, NBB_AR0.9, NBB_AR0.95, NBB_AR0.95, MBB1_AR0.8, MBB1_AR0.9, MBB1_AR0.95, MBB2_AR0.8, MBB2_AR0.9, MBB2_AR0.95, MBB3_AR0.8, MBB3_AR0.9, MBB3_AR0.95, CBB_AR0.8, CBB_AR0.9, CBB_AR0.95)
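
The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369
The minimum of vector NBB_AR0.9 is min(NBB_AR0.9) = 0.4441
The minimum of vector NBB_AR0.95 is min(NBB_AR0.95) = 0.5171

All three (3) of the above contain NBB, so they should go in the NBB row.

The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369
The minimum of vector MBB1_AR0.8 is min(MBB1_AR0.8) = 1.199
The minimum of vector MBB2_AR0.8 is min(MBB2_AR0.8) = 1.199
The minimum of vector MBB3_AR0.8 is min(MBB3_AR0.8) = 1.161
The minimum of vector CBB_AR0.8 is min(CBB_AR0.8) = 0.9616

All five (5) of the above contain AR0.8, so they should go in the AR0.8 column. The others follow the same arrangement.

I want to use R to arrange the minimums as follows: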

      AR0.8   AR0.9   AR0.95
NBB   0.9369  0.4441  0.5171
MBB1  1.199   0.7351  0.9044
MBB2  1.199   0.7351  0.9044
MBB3  1.161   0.6612  0.861
CBB   0.9616  0.1516  0.08383

I tried this, but the result is not arranged the way I want:

    future.apply::future_apply(df[-1], 2, min)

> NBB_AR0.8    NBB_AR0.9   NBB_AR0.95 NBB_AR0.95.1   MBB1_AR0.8   MBB1_AR0.9  MBB1_AR0.95   MBB2_AR0.8   MBB2_AR0.9  MBB2_AR0.95   MBB3_AR0.8 
     0.93690      0.44410      0.51710      0.51710      1.19900      0.73510      0.90440      1.19900      0.73510      0.90440      1.16100 
  MBB3_AR0.9  MBB3_AR0.95    CBB_AR0.8    CBB_AR0.9   CBB_AR0.95 
     0.66120      0.86100      0.96160      0.15160      0.08383 


The values are correct, but I also care about how they are arranged.

I am also interested in this approach:

future.apply::future_apply(df[-1], 2, which.min)

which gives me this:

   NBB_N10_AR0.8_RMSE    NBB_N10_AR0.9_RMSE   NBB_N10_AR0.95_RMSE NBB_N10_AR0.95_RMSE.1   MBB1_N10_AR0.8_RMSE   MBB1_N10_AR0.9_RMSE
                    8                     8                     8                     8                     3                     2
 MBB1_N10_AR0.95_RMSE   MBB2_N10_AR0.8_RMSE   MBB2_N10_AR0.9_RMSE  MBB2_N10_AR0.95_RMSE   MBB3_N10_AR0.8_RMSE   MBB3_N10_AR0.9_RMSE
                    4                     3                     2                     4                     8                     8
 MBB3_N10_AR0.95_RMSE    CBB_N10_AR0.8_RMSE    CBB_N10_AR0.9_RMSE   CBB_N10_AR0.95_RMSE
                    5                     2                     7                     6

I want it arranged as this table, showing the lb value at which each minimum occurs:

      AR0.8  AR0.9  AR0.95
NBB   9      9      9
MBB1  4      3      5
MBB2  4      3      5
MBB3  9      9      6
CBB   3      8      7
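
The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369, at lb = 9
The minimum of vector NBB_AR0.9 is min(NBB_AR0.9) = 0.4441, at lb = 9
The minimum of vector NBB_AR0.95 is min(NBB_AR0.95) = 0.5171, at lb = 9

All three (3) of the above contain NBB, so they should go in the NBB row.

The minimum of vector NBB_AR0.8 is min(NBB_AR0.8) = 0.9369, at lb = 9
The minimum of vector MBB1_AR0.8 is min(MBB1_AR0.8) = 1.199, at lb = 4
The minimum of vector MBB2_AR0.8 is min(MBB2_AR0.8) = 1.199, at lb = 4
The minimum of vector MBB3_AR0.8 is min(MBB3_AR0.8) = 1.161, at lb = 9
The minimum of vector CBB_AR0.8 is min(CBB_AR0.8) = 0.9616, at lb = 3

All five (5) of the above contain AR0.8, so they should go in the AR0.8 column.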

【Comments on the question】:

Please help me with this one as well: future.apply::future_apply(df[-1], 2, which.min)

【Answer 1】:

We could use

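# `out` holds the column minima (see "data" below); rename by AR part, group by method prefix
lst1 <- split(setNames(out, sub(".*_", "", names(out))),  sub("_.*", "", names(out)))
# drop the repeated NBB_AR0.95 value, then stack the groups into a matrix
do.call(rbind, lapply(lst1, function(x) x[!duplicated(x)]))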

-output

    AR0.8  AR0.9  AR0.95
CBB  0.9616 0.1516 0.08383
MBB1 1.1990 0.7351 0.90440
MBB2 1.1990 0.7351 0.90440
MBB3 1.1610 0.6612 0.86100
NBB  0.9369 0.4441 0.51710

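# same grouping for the which.min positions in `out2`
lst2 <- split(setNames(out2, sub(".*_", "", names(out2))),  sub("_.*", "", names(out2)))
# keep the first three entries per method, which drops the duplicate NBB column
do.call(rbind, lapply(lst2, `[`, 1:3))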
     AR0.8 AR0.9 AR0.95
CBB      2     7      6
MBB1     3     2      4
MBB2     3     2      4
MBB3     8     8      5
NBB      8     8      8
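
The positions above are row indices (1 to 8), whereas lb runs from 2 to 9, and split() orders the methods alphabetically rather than in the order NBB, MBB1, ..., CBB that the question asks for. A minimal base-R sketch of one way to address both points, assuming a small subset of the question's columns and using sapply in place of future_apply; `arrange_by_name` is a hypothetical helper, not part of the answer above:

```r
# Subset of the question's data: two methods, two AR levels.
lb <- 2:9
df <- data.frame(
  lb,
  NBB_AR0.8 = c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369),
  NBB_AR0.9 = c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441),
  CBB_AR0.8 = c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172),
  CBB_AR0.9 = c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
)

vals   <- df[-1]
method <- sub("_.*", "", names(vals))  # part before the underscore, e.g. "NBB"
ar     <- sub(".*_", "", names(vals))  # part after the underscore, e.g. "AR0.8"

# Hypothetical helper: place one value per column of `vals` into a
# method-by-AR matrix. unique() keeps first-appearance order, so the rows
# follow the column order of the data frame (NBB before CBB).
arrange_by_name <- function(x) {
  m <- matrix(NA_real_,
              nrow = length(unique(method)), ncol = length(unique(ar)),
              dimnames = list(unique(method), unique(ar)))
  m[cbind(method, ar)] <- x  # index by (row name, column name) pairs
  m
}

min_tab <- arrange_by_name(sapply(vals, min))
# Translate each which.min position into the matching lb value.
lb_tab  <- arrange_by_name(sapply(vals, function(col) df$lb[which.min(col)]))

print(min_tab)
print(lb_tab)
```

Indexing `df$lb` with the position is what turns 8 into 9 for NBB; it should also work when lb is not a simple offset of the row number.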

data

out <- future.apply::future_apply(df[-1], 2, min)
out2 <- future.apply::future_apply(df[-1], 2, which.min)
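
The two sub() patterns above assume names of the form METHOD_AR. With longer names such as NBB_N10_AR0.8_RMSE, which appear in the which.min output in the question, `sub(".*_", "", ...)` would return "RMSE" instead of the AR part. A hedged sketch of one workaround, splitting at the underscore that directly precedes the AR token; the four names in `str1` are sample inputs chosen for illustration:

```r
# Sample column names in both the short and the long format.
str1 <- c("NBB_AR0.8", "NBB_N10_AR0.8_RMSE",
          "MBB1_N10_AR0.95_RMSE", "CBB_N10_AR0.9_RMSE")

# Split only at an underscore followed by letters/digits and then a dot,
# i.e. the underscore right before "AR0.8", "AR0.9" or "AR0.95".
parts <- strsplit(str1, "_(?=[A-Z0-9]+\\.)", perl = TRUE)
m1 <- do.call(rbind, parts)  # two columns: text before and after the split

method <- sub("_.*", "", m1[, 1])  # drop any trailing "_N10"
ar     <- sub("_.*", "", m1[, 2])  # drop any trailing "_RMSE"

print(cbind(method, ar))
```

The resulting `method` and `ar` vectors can then take the place of the two sub() calls when grouping and renaming.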

【Comments】:

@DanielJames Sorry, I meant that out is the output of future_apply.

@DanielJames Do you have any other patterns? Try strsplit(str1, "_(?=[A-Z0-9]+\\.)", perl = TRUE) with str1 <- c("NBB_AR0.8", "NBB_N10_AR0.8_RMSE", "NBB_N10_AR0.9_RMSE", "NBB_N10_AR0.95_RMSE", "MBB1_N10_AR0.8_RMSE", "MBB1_N10_AR0.9_RMSE", "MBB1_N10_AR0.95_RMSE", "MBB2_N10_AR0.8_RMSE", "MBB2_N10_AR0.9_RMSE", "MBB2_N10_AR0.95_RMSE", "MBB3_N10_AR0.8_RMSE", "MBB3_N10_AR0.9_RMSE", "MBB3_N10_AR0.95_RMSE", "CBB_N10_AR0.8_RMSE", "CBB_N10_AR0.9_RMSE", "CBB_N10_AR0.95_RMSE" )

The output of strsplit is a list. You can do m1 <- do.call(rbind, strsplit(str1, "_(?=[A-Z0-9]+\\.)", perl = TRUE)); m1[,1]; m1[,2] to get the two components.

You can split by m1[,1] and then change the names with m1[,2].

【Answer 2】:

A tidyverse solution could be

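library(tidyr)
library(dplyr)

df %>% 
  # drop the accidental second copy of NBB_AR0.95 created by data.frame()
  select(-NBB_AR0.95.1) %>% 
  # split names such as "NBB_AR0.8" into name = "NBB" and name2 = "AR0.8"
  pivot_longer(-c(lb), names_to = c("name", "name2"), names_pattern = "(.*)_(.*)") %>% 
  select(-lb) %>% 
  group_by(name, name2) %>% 
  # keep the smallest value for each method/AR pair
  slice_min(value) %>% 
  pivot_wider(names_from = name2) %>% 
  ungroup()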

which returns

# A tibble: 5 x 4
  name  AR0.8 AR0.9 AR0.95
  <chr> <dbl> <dbl>  <dbl>
1 CBB   0.962 0.152 0.0838
2 MBB1  1.20  0.735 0.904 
3 MBB2  1.20  0.735 0.904 
4 MBB3  1.16  0.661 0.861 
5 NBB   0.937 0.444 0.517 
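
The same pipeline can be adapted to report the lb value at each minimum and to keep the methods in their original order. This is a sketch under assumptions: it uses a small subset of the question's columns, turns `name` into a factor so that group_by() does not sort it alphabetically, and breaks ties by taking the first minimum.

```r
library(tidyr)
library(dplyr)

# Subset of the question's data: two methods, two AR levels.
df <- data.frame(
  lb = 2:9,
  NBB_AR0.8 = c(1.879, 1.065, 1.385, 1.568, 1.493, 1.732, 1.263, 0.9369),
  NBB_AR0.9 = c(0.8051, 0.7598, 1.113, 1.056, 0.9819, 0.8842, 0.679, 0.4441),
  CBB_AR0.8 = c(1.642, 0.9616, 1.42, 1.728, 1.326, 1.324, 1.542, 1.172),
  CBB_AR0.9 = c(0.2077, 0.2158, 0.1791, 0.1933, 0.168, 0.2211, 0.1516, 0.2133)
)

res <- df %>%
  pivot_longer(-lb, names_to = c("name", "name2"), names_pattern = "(.*)_(.*)") %>%
  # factor levels in first-appearance order keep NBB ahead of CBB
  mutate(name = factor(name, levels = unique(name))) %>%
  group_by(name, name2) %>%
  # one row per method/AR pair: the row holding the minimum value
  slice_min(value, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  # keep lb (not the value) and spread the AR levels into columns
  select(name, name2, lb) %>%
  pivot_wider(names_from = name2, values_from = lb)

print(res)
```

Replacing `values_from = lb` with `values_from = value` (and selecting `value` instead of `lb`) gives the table of minima in the same row order.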
