How to output a data frame as text (strings) mixing values and column names in R
[Posted]: 2022-01-08 19:13:55
[Question]: I have a data frame that is a collection of performance metrics for some ML models:
> df
# A tibble: 10 x 6
   Method                AUC    CA    F1 Precision Recall
   <chr>               <dbl> <dbl> <dbl>     <dbl>  <dbl>
 1 Logistic Regression 0.732 0.684 0.413     0.681  0.296
 2 Naive Bayes         0.729 0.694 0.463     0.679  0.352
 3 Tree                0.678 0.694 0.429     0.717  0.306
 4 Neural Network      0.674 0.684 0.413     0.681  0.296
 5 AdaBoost            0.654 0.681 0.418     0.66   0.306
 6 CN2 rule inducer    0.651 0.681 0.403     0.674  0.287
 7 kNN                 0.649 0.66  0.372     0.604  0.269
 8 SVM                 0.64  0.691 0.44      0.686  0.324
 9 SGD                 0.591 0.667 0.4       0.615  0.296
10 Constant            0.5   0.625 0         0      0
Input:
structure(list(Method = c("Logistic Regression", "Naive Bayes",
"Tree", "Neural Network", "AdaBoost", "CN2 rule inducer", "kNN",
"SVM", "SGD", "Constant"), AUC = c(0.732, 0.729, 0.678, 0.674,
0.654, 0.651, 0.649, 0.64, 0.591, 0.5), CA = c(0.684, 0.694,
0.694, 0.684, 0.681, 0.681, 0.66, 0.691, 0.667, 0.625), F1 = c(0.413,
0.463, 0.429, 0.413, 0.418, 0.403, 0.372, 0.44, 0.4, 0), Precision = c(0.681,
0.679, 0.717, 0.681, 0.66, 0.674, 0.604, 0.686, 0.615, 0), Recall = c(0.296,
0.352, 0.306, 0.296, 0.306, 0.287, 0.269, 0.324, 0.296, 0)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
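I need to combine each row into a single cell in Excel, but copying every column name for every row is tedious. So I would like to get everything as a string (or a list of strings) of the form [Model name]: Col1_name Col1_value, Col2_name Col2_value, ..., and so on. Something like this: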
`Logistic Regression: AUC 0.732, CA 0.684, F1 0.413, Precision 0.681, Recall 0.296
Naive Bayes: AUC 0.729, CA 0.694, F1 0.463, Precision 0.679, Recall 0.352
Tree ... (and so on).`
A single line would also be fine:
Logistic Regression: AUC 0.732, CA 0.684, F1 0.413, Precision 0.681, Recall 0.296 Naive Bayes: AUC 0.729, CA 0.694, F1 0.463, Precision 0.679, Recall 0.352 Tree ... (and so on)
But I don't know how to put each column name in front of its value. Any help would be appreciated!
[Answer 1]: Maybe this will work for you.
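# apply() passes each row as a named character vector, so names(x) are the column names
Output <- apply(df, 1, function(x)
  # pair each name with its value, join the pairs, then put a colon before " AUC"
  gsub(' AUC', ': AUC', paste(paste(names(x), x), collapse = ' '))
)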
Here I assume that AUC will always be the second column in the data set. If it is not, you can change the pattern accordingly.
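The positional assumption can be avoided by treating the first column as the label and pasting the remaining columns separately. The following is only a minimal sketch in base R: the small `df` is the first two rows of the question's data (as a plain `data.frame`), `fmt_row` is a hypothetical helper name, and `label_col` is assumed to be a numeric column position. Reading each value from its own column, rather than going through `apply()`, also sidesteps the character-matrix coercion, which may pad numbers (for example 0.64 shown as "0.640").

```r
# Toy data: first two rows of the question's data frame (plain data.frame here).
df <- data.frame(
  Method = c("Logistic Regression", "Naive Bayes"),
  AUC = c(0.732, 0.729),
  CA = c(0.684, 0.694),
  stringsAsFactors = FALSE
)

# Hypothetical helper: format row i as "<label>: name value, name value, ..."
# label_col is assumed to be the numeric position of the label column.
fmt_row <- function(i, data, label_col = 1) {
  value_cols <- names(data)[-label_col]
  # Pull each value out of its own column so numbers are not padded.
  values <- vapply(value_cols,
                   function(nm) as.character(data[[nm]][i]),
                   character(1))
  paste0(data[[label_col]][i], ": ",
         paste(value_cols, values, collapse = ", "))
}

# One string per row.
Output <- vapply(seq_len(nrow(df)), fmt_row, character(1), data = df)
Output
```

Because the column names are taken from `names(data)`, adding or reordering metric columns does not require changing the code, as long as the label column stays in the position given by `label_col`.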
[Comments]:
That's perfect! I couldn't figure out a way to do it with apply.
[Answer 2]: Is this close to what you are looking for?
my_df <- structure(list(Method = c("Logistic Regression", "Naive Bayes",
"Tree", "Neural Network", "AdaBoost", "CN2 rule inducer", "kNN",
"SVM", "SGD", "Constant"),
AUC = c(0.732, 0.729, 0.678, 0.674, 0.654, 0.651, 0.649, 0.64, 0.591, 0.5),
CA = c(0.684, 0.694, 0.694, 0.684, 0.681, 0.681, 0.66, 0.691, 0.667, 0.625),
F1 = c(0.413, 0.463, 0.429, 0.413, 0.418, 0.403, 0.372, 0.44, 0.4, 0),
Precision = c(0.681, 0.679, 0.717, 0.681, 0.66, 0.674, 0.604, 0.686, 0.615, 0),
Recall = c(0.296, 0.352, 0.306, 0.296, 0.306, 0.287, 0.269, 0.324, 0.296, 0)),
row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
my_df$Solution <- paste(my_df$Method, ":",
colnames(my_df)[2], my_df$AUC, ",",
colnames(my_df)[3], my_df$CA, ",",
colnames(my_df)[4], my_df$F1, ",",
colnames(my_df)[5], my_df$Precision, ",",
colnames(my_df)[6], my_df$Recall, ",")
[Comments]:
This works. But I was looking for something more automated, in case I have the same problem with a wider data frame.
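For a wider data frame, the same `paste` idea can be applied to every metric column without naming each one. This is a sketch under stated assumptions, not the answerer's code: the toy `my_df` is the first three rows and three metrics of the question's data, the label column is assumed to be called `Method`, and `metric_cols`, `pairs`, `metrics` and `one_line` are illustrative names.

```r
# Toy data with the same layout as the question's (three rows, three metrics).
my_df <- data.frame(
  Method = c("Logistic Regression", "Naive Bayes", "Tree"),
  AUC = c(0.732, 0.729, 0.678),
  CA = c(0.684, 0.694, 0.694),
  F1 = c(0.413, 0.463, 0.429),
  stringsAsFactors = FALSE
)

# Every column except the label column (assumed to be named "Method").
metric_cols <- setdiff(names(my_df), "Method")

# One character vector per metric column, e.g. "AUC 0.732", "AUC 0.729", ...
pairs <- lapply(metric_cols, function(nm) paste(nm, my_df[[nm]]))

# Paste the per-column vectors together element-wise, i.e. row by row.
metrics <- do.call(paste, c(pairs, sep = ", "))

# Prefix each row with its label.
Solution <- paste0(my_df$Method, ": ", metrics)

# Optional: collapse all rows into a single line.
one_line <- paste(Solution, collapse = " ")
```

Here `sep` joins the pieces within a row, while `collapse` joins the finished row strings into one line, so the two outputs requested in the question come from the same intermediate result.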