plot_confusion_matrix without estimator

Posted 2023-03-12


[Asked]: 2020-07-01 17:50:28 [Question]:

I am trying to use plot_confusion_matrix,

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 1]
y_pred = [1, 1, 0, 0]

confusion_matrix(y_true, y_pred)

Output:

array([[1, 0],
       [1, 2]])

Now, when I use the following, with or without `classes`:

from sklearn.metrics import plot_confusion_matrix

plot_confusion_matrix(y_true, y_pred, classes=[0,1], title='Confusion matrix, without normalization')

or

plot_confusion_matrix(y_true, y_pred, title='Confusion matrix, without normalization')

I expect to get output similar to this, apart from the numbers inside.

It is a simple plot, so it should not need an estimator.

Using mlxtend.plotting,

from mlxtend.plotting import plot_confusion_matrix
import matplotlib.pyplot as plt
import numpy as np

binary1 = np.array([[4, 1],
                   [1, 2]])

fig, ax = plot_confusion_matrix(conf_mat=binary1)
plt.show()

It gives the same output.

Based on this,

it requires a classifier:

disp = plot_confusion_matrix(classifier, X_test, y_test,
                                 display_labels=class_names,
                                 cmap=plt.cm.Blues,
                                 normalize=normalize)

Can I plot it without a classifier?

[Comments]:

[Answer 1]:

plot_confusion_matrix expects a trained classifier. If you look at the source code, what it does is run the prediction to generate y_pred for you:

y_pred = estimator.predict(X)
cm = confusion_matrix(y_true, y_pred, sample_weight=sample_weight,
                      labels=labels, normalize=normalize)

So, to plot the confusion matrix without specifying a classifier, you have to use another tool or do it yourself. A simple option is seaborn:

import seaborn as sns
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)
# annot=True writes the count in each cell
f = sns.heatmap(cm, annot=True)
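For the do-it-yourself route, a minimal sketch with plain matplotlib might look like the following. The helper name `plot_cm` and its `title` parameter are made up for illustration and are not part of scikit-learn; the sketch assumes the labels are hashable values that `numpy.unique` can sort.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix


def plot_cm(y_true, y_pred, title="Confusion matrix"):
    """Draw a confusion matrix from labels alone (hypothetical helper)."""
    # Collect every class that appears in either list, in sorted order.
    labels = np.unique(np.concatenate([y_true, y_pred]))
    cm = confusion_matrix(y_true, y_pred, labels=labels)

    fig, ax = plt.subplots()
    im = ax.imshow(cm, cmap="Blues")
    fig.colorbar(im, ax=ax)

    # Rows are true labels, columns are predicted labels.
    ax.set_xticks(range(len(labels)))
    ax.set_yticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_yticklabels(labels)
    ax.set_xlabel("Predicted label")
    ax.set_ylabel("True label")
    ax.set_title(title)

    # Write the count inside each cell.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, str(cm[i, j]), ha="center", va="center")
    return fig, ax, cm
```

Calling `plot_cm(y_true, y_pred)` with the lists from the question and then `plt.show()` displays the figure; no estimator is involved at any point.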

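[Comments]:

Thanks. I had to use pandas and seaborn to get it working. Can I do it with scikit-learn's tools? I see here that it is done without a [classifier] (datascience.stackexchange.com/questions/40067/…)

No, @RakibulHassan. The link uses a custom function. Why not define one yourself?

Thanks. I think it looks similar. Can we not use ConfusionMatrixDisplay?

[Answer 2]: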

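I am a bit late here, but I think others might benefit from my answer.

As others have mentioned, plot_confusion_matrix is not an option without a classifier, but it is still possible to use sklearn to get a similar-looking confusion matrix without one. The function below does exactly that.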

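import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

def confusion_ma(y_true, y_pred, class_names):
    # Row-normalize so that each true class sums to 1
    cm = confusion_matrix(y_true, y_pred, normalize='true')
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
    disp.plot(cmap=plt.cm.Blues)
    return plt.show()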

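The confusion_matrix function returns a plain ndarray. Passing it, together with the display labels, to the ConfusionMatrixDisplay class produces a similar-looking plot. In the definition, I added class_names to be shown instead of 0 and 1, chose to normalize the output, and specified a colormap; change these according to your needs.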

[Comments]:

Actually, this should be the accepted answer!

[Answer 3]:

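Since plot_confusion_matrix requires the argument 'estimator' not to be None, the answer is: no, you cannot. But you can plot your confusion matrix in other ways; see, for example, this answer: How can I plot a confusion matrix?

[Comments]:

[Answer 4]: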

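I tested the following "identity classifier" in a Jupyter notebook running the conda_python3 kernel in Amazon SageMaker. The reason is that SageMaker's transform jobs are asynchronous, so a classifier cannot be passed as an argument to plot_confusion_matrix; y_pred has to be computed before calling the function.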

IC = type('IdentityClassifier', (), {"predict": lambda i: i, "_estimator_type": "classifier"})
plot_confusion_matrix(IC, y_pred, y_test, normalize='true', values_format='.2%');

So, while plot_confusion_matrix does require an estimator, you do not necessarily have to use another tool, IMO, if this solution fits your use case.
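To make the mechanism explicit, here is a small sketch of the same idea written as an ordinary class. It does not call plot_confusion_matrix; it only reproduces the prediction step that the function performs before drawing. The sample labels are taken from the question, and treating `_estimator_type` as the attribute scikit-learn inspects is an assumption that holds for the releases that shipped plot_confusion_matrix.

```python
from sklearn.metrics import confusion_matrix


class IdentityClassifier:
    """Stand-in estimator whose 'predictions' are whatever it is given."""

    # Assumption: older scikit-learn releases check this attribute
    # to decide whether an object is a classifier.
    _estimator_type = "classifier"

    def predict(self, X):
        return X


y_test = [1, 1, 0, 1]
y_pred = [1, 1, 0, 0]  # computed elsewhere, e.g. by an asynchronous job

ic = IdentityClassifier()
# The prediction step inside plot_confusion_matrix(ic, y_pred, y_test):
cm = confusion_matrix(y_test, ic.predict(y_pred))
print(cm)
```

Because `predict` returns its input unchanged, passing y_pred in the position of X yields the same matrix as calling `confusion_matrix(y_test, y_pred)` directly.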

simplified POC from the notebook

[Comments]:

[Answer 5]:

I solved this by using a custom classifier; you can build any custom classifier and pass an instance of it to plot_confusion_matrix:

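class MyModelPredict(object):
    def __init__(self):
        # Marks the object as a classifier so plot_confusion_matrix accepts it
        self._estimator_type = 'classifier'

    def predict(self, X):
        # Placeholder: return your own predictions for X here
        return your_custom_prediction

model = MyModelPredict()
plot_confusion_matrix(model, X, y_true)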

[Comments]:
