Getting precision and recall with sklearn

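With the code below I get the accuracy. Now I would like to:

1) find the precision and recall for each fold (10 folds in total)

2) get the mean precision

3) get the mean recall

The output could look similar to what these two lines print for accuracy:

print(scores)
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))

Any ideas?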

import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import StratifiedKFold, cross_val_score

iris = datasets.load_iris()
skf = StratifiedKFold(n_splits=10)
clf = svm.SVC(kernel='linear', C=1)
scores = cross_val_score(clf, iris.data, iris.target, cv=10)
print(scores)  #[ 1. 0.93333333   1.  1. 0.86666667  1.  0.93333333   1.  1.  1.]
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2)) # Accuracy: 0.97 (+/- 0.09)
Answer

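This is a little different, because the plain 'precision' and 'recall' scorers of cross_val_score only handle binary classification. For the three-class iris data you can instead use precision_score and recall_score and run the cross-validation by hand. The parameter average='micro' computes precision and recall globally, by counting true positives, false positives, and false negatives across all classes.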
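To make the average='micro' setting concrete, here is a minimal sketch on hypothetical labels (the y_true and y_pred values are made up for illustration and do not come from the question). It compares micro averaging with average='macro', which instead takes the unweighted mean of the per-class scores.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical three-class labels, made up purely for illustration.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2]

# Micro: pool the counts over all classes, then compute one score.
micro_precision = precision_score(y_true, y_pred, average='micro')
micro_recall = recall_score(y_true, y_pred, average='micro')

# Macro: compute one score per class, then average them with equal weight.
macro_precision = precision_score(y_true, y_pred, average='macro')
macro_recall = recall_score(y_true, y_pred, average='macro')

accuracy = accuracy_score(y_true, y_pred)

print("micro precision:", micro_precision)  # 0.75 (6 of 8 predictions correct)
print("micro recall:   ", micro_recall)     # 0.75
print("macro precision:", macro_precision)
print("macro recall:   ", macro_recall)
print("accuracy:       ", accuracy)         # 0.75
```

For single-label multiclass data like this, micro-averaged precision and recall both equal plain accuracy, which is worth keeping in mind when reading the per-fold numbers produced by the loop below.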

import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import precision_score, recall_score

iris = datasets.load_iris()
skf = StratifiedKFold(n_splits=10)
clf = svm.SVC(kernel='linear', C=1)

X = iris.data
y = iris.target
precision_scores = []
recall_scores = []
# Manual cross-validation: fit on each training fold, score on the held-out fold
for train_index, test_index in skf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

    y_pred = clf.fit(X_train, y_train).predict(X_test)
    # average='micro' pools the counts over all three classes
    precision_scores.append(precision_score(y_test, y_pred, average='micro'))
    recall_scores.append(recall_score(y_test, y_pred, average='micro'))

print(precision_scores)
print("Precision: %0.2f (+/- %0.2f)" % (np.mean(precision_scores), np.std(precision_scores) * 2))
print(recall_scores)
print("Recall: %0.2f (+/- %0.2f)" % (np.mean(recall_scores), np.std(recall_scores) * 2))
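
As an alternative to the manual loop, scikit-learn's model_selection module also offers cross_validate, which accepts several scorer names at once. The sketch below is not part of the original answer. It assumes the built-in scorer names 'precision_micro' and 'recall_micro' and reuses the same classifier and 10-fold stratified split.

```python
from sklearn import datasets, svm
from sklearn.model_selection import StratifiedKFold, cross_validate

iris = datasets.load_iris()
skf = StratifiedKFold(n_splits=10)
clf = svm.SVC(kernel='linear', C=1)

# Each scorer name yields one array of per-fold scores under the key 'test_<name>'.
results = cross_validate(clf, iris.data, iris.target, cv=skf,
                         scoring=['precision_micro', 'recall_micro'])

precision_per_fold = results['test_precision_micro']
recall_per_fold = results['test_recall_micro']

print(precision_per_fold)
print("Precision: %0.2f (+/- %0.2f)" % (precision_per_fold.mean(), precision_per_fold.std() * 2))
print(recall_per_fold)
print("Recall: %0.2f (+/- %0.2f)" % (recall_per_fold.mean(), recall_per_fold.std() * 2))
```

Swapping in 'precision_macro' and 'recall_macro' gives the unweighted mean over the three classes instead, which can differ from accuracy.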
