How to write a confusion matrix in Python?

Posted 2023-02-23

【Title】: How to write a confusion matrix in Python? 【Posted】: 2011-01-10 01:12:57 【Question】:

I wrote a confusion matrix calculation in Python:

def conf_mat(prob_arr, input_arr):
        # confusion matrix
        conf_arr = [[0, 0], [0, 0]]

        for i in range(len(prob_arr)):
                if int(input_arr[i]) == 1:
                        if float(prob_arr[i]) < 0.5:
                                conf_arr[0][1] = conf_arr[0][1] + 1
                        else:
                                conf_arr[0][0] = conf_arr[0][0] + 1
                elif int(input_arr[i]) == 2:
                        if float(prob_arr[i]) >= 0.5:
                                conf_arr[1][0] = conf_arr[1][0] +1
                        else:
                                conf_arr[1][1] = conf_arr[1][1] +1

        accuracy = float(conf_arr[0][0] + conf_arr[1][1])/(len(input_arr))

prob_arr is an array returned by my classification code. A sample array looks like this:

 [1.0, 1.0, 1.0, 0.41592955657342651, 1.0, 0.0053405015805891975, 4.5321494433440449e-299, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.70943426182688163, 1.0, 1.0, 1.0, 1.0]

input_arr holds the original class labels of the dataset, like this:

[2, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1]

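What my code tries to do is this: I take prob_arr and input_arr, and for each class (1 and 2) I check whether the instances were misclassified.

But my code only works for two classes. If I run it on data with more than two classes, it does not work. How can I make it work for multiple classes?

For example, for a dataset with three classes, it should return: [[21,7,3],[3,38,6],[5,4,19]]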

【Comments】:

【Answer 1】:

You should map each class to a row of the confusion matrix.

Here the mapping is trivial:

def row_of_class(classe):
    return {1: 0, 2: 1}[classe]

In your loop, compute expected_row and correct_row, and increment conf_arr[expected_row][correct_row]. You will even end up with less code than you started with.
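A minimal sketch of this idea, generalized to any number of classes. The names build_row_map and conf_mat_from_labels are hypothetical, and the sketch assumes the predicted class labels have already been derived from the probabilities.

```python
def build_row_map(classes):
    """Map each class label to a row/column index, e.g. {1: 0, 2: 1}."""
    return {label: index for index, label in enumerate(sorted(classes))}


def conf_mat_from_labels(expected, predicted):
    """Count (expected, predicted) pairs; rows are the expected classes."""
    row_of_class = build_row_map(set(expected) | set(predicted))
    n = len(row_of_class)
    conf_arr = [[0] * n for _ in range(n)]
    for exp, pred in zip(expected, predicted):
        conf_arr[row_of_class[exp]][row_of_class[pred]] += 1
    return conf_arr


# Made-up labels for three classes.
expected = [2, 1, 1, 1, 2, 3]
predicted = [2, 1, 2, 1, 1, 3]
print(conf_mat_from_labels(expected, predicted))
```

Building the map from the union of both label lists means a class that only shows up in the predictions still gets its own row and column.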

【Comments】:

【Answer 2】:

This function creates a confusion matrix for any number of classes.

def create_conf_matrix(expected, predicted, n_classes):
    m = [[0] * n_classes for i in range(n_classes)]
    for pred, exp in zip(predicted, expected):
        m[pred][exp] += 1
    return m

def calc_accuracy(conf_matrix):
    t = sum(sum(l) for l in conf_matrix)
    return sum(conf_matrix[i][i] for i in range(len(conf_matrix))) / t

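Unlike your function above, you have to extract the predicted classes from your classification results before calling the function, i.e. something like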

[1 if p >= .5 else 2 for p in classifications]
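As a rough usage sketch with a few of the sample values from the question: the function above indexes the matrix directly with the class values, so the labels 1 and 2 are shifted to 0 and 1 first. The threshold direction (a probability of at least 0.5 means class 1) follows the question's code, and the variable names are illustrative.

```python
def create_conf_matrix(expected, predicted, n_classes):
    # Same function as above, repeated so the sketch runs on its own.
    m = [[0] * n_classes for _ in range(n_classes)]
    for pred, exp in zip(predicted, expected):
        m[pred][exp] += 1  # rows: predicted class, columns: expected class
    return m


prob_arr = [1.0, 1.0, 1.0, 0.41592955657342651, 1.0, 0.0053405015805891975]
input_arr = [2, 1, 1, 1, 1, 1]

# Turn probabilities into predicted labels (1 or 2), as in the question.
predicted = [1 if p >= 0.5 else 2 for p in prob_arr]

# Shift labels 1..2 down to indices 0..1 before building the matrix.
matrix = create_conf_matrix([c - 1 for c in input_arr],
                            [c - 1 for c in predicted], 2)
accuracy = sum(matrix[i][i] for i in range(2)) / len(input_arr)
print(matrix, accuracy)
```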

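【Comments】:

This gives a syntax error, but I am not good enough at Python to fix it :) m = [[0] * n_classes] for i in range(n_classes)] ^ SyntaxError: invalid syntax. I think you need one more [: m = [[[0] * ...

Actually one fewer :) --- fixed.

You may have created the transposed confusion matrix.

I would appreciate it if you could take a look at this. Thanks for your help. ***.com/questions/44215561/…

【Answer 3】: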

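In general, you need to change your probability array. Instead of one number per instance that you classify by whether it exceeds 0.5, you need a list of scores (one per class), and you pick the class with the largest score (a.k.a. argmax).

You could use a dictionary to hold the probability of each classification:

prob_arr = [{classification_id: probability}, ...]

Choosing a classification would then look something like: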

for instance_scores in prob_arr:
    predicted_classes = [cls for (cls, score) in instance_scores.items() if score == max(instance_scores.values())]

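This handles the case where two classes have the same score. You can get a single class by taking the first one in that list, but how you handle ties depends on what you are classifying.

Once you have the list of predicted classes and the list of expected classes, you can use code like Torsten Marek's to create the confusion array and calculate the accuracy.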
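The argmax step can be sketched as follows. This is only an illustration: the function name predict_classes and the sample scores are made up, and breaking ties by taking the first class that reaches the maximum is just one possible policy.

```python
def predict_classes(prob_arr):
    """Pick the highest-scoring class for each instance (argmax)."""
    predictions = []
    for instance_scores in prob_arr:
        best = max(instance_scores.values())
        # All classes that reach the best score; more than one on a tie.
        tied = [cls for cls, score in instance_scores.items() if score == best]
        predictions.append(tied[0])
    return predictions


# Hypothetical scores for three instances and three classes.
prob_arr = [
    {1: 0.7, 2: 0.2, 3: 0.1},
    {1: 0.1, 2: 0.3, 3: 0.6},
    {1: 0.4, 2: 0.4, 3: 0.2},  # tie between classes 1 and 2
]
print(predict_classes(prob_arr))  # [1, 3, 1]
```

The resulting list can then be passed, together with the expected labels, to any of the confusion matrix functions in this thread.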

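【Comments】:

【Answer 4】:

You can use numpy to make your code more concise and (sometimes) faster. For example, in the two-class case your function can be rewritten as (see mply.acc()):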

def accuracy(actual, predicted):
    """accuracy = (tp + tn) / ts

    , where:    

        ts - Total Samples
        tp - True Positives
        tn - True Negatives
    """
    return (actual == predicted).sum() / float(len(actual))

where:

actual    = (numpy.array(input_arr) == 2)
predicted = (numpy.array(prob_arr) < 0.5)

【Comments】:

Use .mean() instead of sum and avoid the division (note that the answer is 11 years old; NumPy may not have had a mean function back then).

【Answer 5】:

Scikit-learn (which I recommend using anyway) includes this in its metrics module:

>>> from sklearn.metrics import confusion_matrix
>>> y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 0, 0, 0, 1, 1, 0, 2, 2]
>>> confusion_matrix(y_true, y_pred)
array([[3, 0, 0],
       [1, 1, 1],
       [1, 1, 1]])

【Comments】:

【Answer 6】:

Scikit-Learn provides a confusion_matrix function

from sklearn.metrics import confusion_matrix
y_actu = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]
confusion_matrix(y_actu, y_pred)

which outputs a NumPy array

array([[3, 0, 0],
       [0, 1, 2],
       [2, 1, 3]])

But you can also create a confusion matrix with Pandas:

import pandas as pd
y_actu = pd.Series([2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2], name='Actual')
y_pred = pd.Series([0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2], name='Predicted')
df_confusion = pd.crosstab(y_actu, y_pred)

You will get a (nicely labeled) Pandas DataFrame:

Predicted  0  1  2
Actual
0          3  0  0
1          0  1  2
2          2  1  3

If you add margins=True, like

df_confusion = pd.crosstab(y_actu, y_pred, rownames=['Actual'], colnames=['Predicted'], margins=True)

you will also get the sum of each row and each column:

Predicted  0  1  2  All
Actual
0          3  0  0    3
1          0  1  2    3
2          2  1  3    6
All        5  2  5   12

You can also get a normalized confusion matrix with:

df_conf_norm = df_confusion / df_confusion.sum(axis=1)

Predicted         0         1         2
Actual
0          1.000000  0.000000  0.000000
1          0.000000  0.333333  0.333333
2          0.666667  0.333333  0.500000
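Note that the table above is not row-normalized: dividing a DataFrame by a Series aligns the Series index with the column labels, so each column, not each row, is divided by a row total. A hedged sketch of one way to get rows that sum to 1 uses DataFrame.div with axis=0 (the variable name df_conf_row_norm is illustrative):

```python
import pandas as pd

y_actu = pd.Series([2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2], name='Actual')
y_pred = pd.Series([0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2], name='Predicted')
df_confusion = pd.crosstab(y_actu, y_pred)

# Divide each row by its own total; axis=0 aligns the sums with the row index.
df_conf_row_norm = df_confusion.div(df_confusion.sum(axis=1), axis=0)
print(df_conf_row_norm)
```

With this version each row gives the share of an actual class that was assigned to each predicted class.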

You can plot this confusion matrix with

import numpy as np
import matplotlib.pyplot as plt
def plot_confusion_matrix(df_confusion, title='Confusion matrix', cmap=plt.cm.gray_r):
    plt.matshow(df_confusion, cmap=cmap) # imshow
    #plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(df_confusion.columns))
    plt.xticks(tick_marks, df_confusion.columns, rotation=45)
    plt.yticks(tick_marks, df_confusion.index)
    #plt.tight_layout()
    plt.ylabel(df_confusion.index.name)
    plt.xlabel(df_confusion.columns.name)

plot_confusion_matrix(df_confusion)

or plot the normalized confusion matrix with:

plot_confusion_matrix(df_conf_norm)

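You may also be interested in this project, https://github.com/pandas-ml/pandas-ml, and its pip package, https://pypi.python.org/pypi/pandas_ml

With this package the confusion matrix can be pretty-printed. You can binarize the confusion matrix and get class statistics such as TP, TN, FP, FN, ACC, TPR, FPR, FNR, TNR (SPC), LR+, LR-, DOR, PPV, FDR, FOR, NPV, as well as some overall statistics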

In [1]: from pandas_ml import ConfusionMatrix
In [2]: y_actu = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
In [3]: y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]
In [4]: cm = ConfusionMatrix(y_actu, y_pred)
In [5]: cm.print_stats()
Confusion Matrix:

Predicted  0  1  2  __all__
Actual
0          3  0  0        3
1          0  1  2        3
2          2  1  3        6
__all__    5  2  5       12


Overall Statistics:

Accuracy: 0.583333333333
95% CI: (0.27666968568210581, 0.84834777019156982)
No Information Rate: ToDo
P-Value [Acc > NIR]: 0.189264302376
Kappa: 0.354838709677
Mcnemar's Test P-Value: ToDo


Class Statistics:

Classes                                        0          1          2
Population                                    12         12         12
P: Condition positive                          3          3          6
N: Condition negative                          9          9          6
Test outcome positive                          5          2          5
Test outcome negative                          7         10          7
TP: True Positive                              3          1          3
TN: True Negative                              7          8          4
FP: False Positive                             2          1          2
FN: False Negative                             0          2          3
TPR: (Sensitivity, hit rate, recall)           1  0.3333333        0.5
TNR=SPC: (Specificity)                 0.7777778  0.8888889  0.6666667
PPV: Pos Pred Value (Precision)              0.6        0.5        0.6
NPV: Neg Pred Value                            1        0.8  0.5714286
FPR: False-out                         0.2222222  0.1111111  0.3333333
FDR: False Discovery Rate                    0.4        0.5        0.4
FNR: Miss Rate                                 0  0.6666667        0.5
ACC: Accuracy                          0.8333333       0.75  0.5833333
F1 score                                    0.75        0.4  0.5454545
MCC: Matthews correlation coefficient  0.6831301  0.2581989  0.1690309
Informedness                           0.7777778  0.2222222  0.1666667
Markedness                                   0.6        0.3  0.1714286
Prevalence                                  0.25       0.25        0.5
LR+: Positive likelihood ratio               4.5          3        1.5
LR-: Negative likelihood ratio                 0       0.75       0.75
DOR: Diagnostic odds ratio                   inf          4          2
FOR: False omission rate                       0        0.2  0.4285714

I noticed that a new Python library for confusion matrices named PyCM has been released: maybe you can have a look.

【Comments】:

df_conf_norm = df_confusion / df_confusion.sum(axis=1) does not create a normalized confusion matrix: the rows should sum to 1. You actually need: df_confusion.values / df_confusion.sum(axis=1)[:,None], although this creates a numpy array, because pandas complains without .values. See: ***.com/questions/19602187/…

To plot the confusion matrix you can use a seaborn heatmap: sns.heatmap(df_conf_norm, annot=True)

I also agree with the earlier comment about the normalization issue. This is how I normalize: df_conf_norm = df_confusion.div(df_confusion.sum(axis=1), axis=0)

【Answer 7】:

If you don't want scikit-learn to do the work for you...

    import numpy
    actual = numpy.array(actual)
    predicted = numpy.array(predicted)

    # calculate the confusion matrix; labels is numpy array of classification labels
    cm = numpy.zeros((len(labels), len(labels)))
    for a, p in zip(actual, predicted):
        cm[a][p] += 1

    # also get the accuracy easily with numpy
    accuracy = (actual == predicted).sum() / float(len(actual))

Or take a look at the more complete implementation in NLTK.

【Comments】:

【Answer 8】:

I wrote a simple class to build a confusion matrix without depending on a machine learning library.

The class can be used like this:

labels = ["cat", "dog", "velociraptor", "kraken", "pony"]
confusionMatrix = ConfusionMatrix(labels)

confusionMatrix.update("cat", "cat")
confusionMatrix.update("cat", "dog")
...
confusionMatrix.update("kraken", "velociraptor")
confusionMatrix.update("velociraptor", "velociraptor")

confusionMatrix.plot()

The ConfusionMatrix class:

import pylab
import collections
import numpy as np


class ConfusionMatrix:
    def __init__(self, labels):
        self.labels = labels
        self.confusion_dictionary = self.build_confusion_dictionary(labels)

    def update(self, predicted_label, expected_label):
        self.confusion_dictionary[expected_label][predicted_label] += 1

    def build_confusion_dictionary(self, label_set):
        expected_labels = collections.OrderedDict()

        for expected_label in label_set:
            expected_labels[expected_label] = collections.OrderedDict()

            for predicted_label in label_set:
                expected_labels[expected_label][predicted_label] = 0.0

        return expected_labels

    def convert_to_matrix(self, dictionary):
        length = len(dictionary)
        confusion_dictionary = np.zeros((length, length))

        i = 0
        for row in dictionary:
            j = 0
            for column in dictionary:
                confusion_dictionary[i][j] = dictionary[row][column]
                j += 1
            i += 1

        return confusion_dictionary

    def get_confusion_matrix(self):
        matrix = self.convert_to_matrix(self.confusion_dictionary)
        return self.normalize(matrix)

    def normalize(self, matrix):
        amin = np.amin(matrix)
        amax = np.amax(matrix)

        return [[(((y - amin) * (1 - 0)) / (amax - amin)) for y in x] for x in matrix]

    def plot(self):
        matrix = self.get_confusion_matrix()

        pylab.figure()
        pylab.imshow(matrix, interpolation='nearest', cmap=pylab.cm.jet)
        pylab.title("Confusion Matrix")

        for i, vi in enumerate(matrix):
            for j, vj in enumerate(vi):
                pylab.text(j, i+.1, "%.1f" % vj, fontsize=12)

        pylab.colorbar()

        classes = np.arange(len(self.labels))
        pylab.xticks(classes, self.labels)
        pylab.yticks(classes, self.labels)

        pylab.ylabel('Expected label')
        pylab.xlabel('Predicted label')
        pylab.show()

【Comments】:

【Answer 9】:

Using only numpy, and with efficiency in mind, we can do it like this:

import logging

import numpy as np

def confusion_matrix(pred, label, nc=None):
    assert pred.size == label.size
    if nc is None:
        nc = len(np.unique(label))
        logging.debug("Number of classes assumed to be {}".format(nc))

    confusion = np.zeros([nc, nc])
    # shift predictions by one so that `0` can mark "not in the current class"
    tran_pred = pred + 1
    for i in range(nc):    # current class
        mask = (label == i)
        masked_pred = mask * tran_pred
        cls, counts = np.unique(masked_pred, return_counts=True)
        # discard the `0` entry (samples outside the current class)
        keep = cls > 0
        for cl, count in zip(cls[keep] - 1, counts[keep]):
            confusion[i, cl] = count
    return confusion

For other features such as plotting and mean-IoU, see my repositories.

【Comments】:

【Answer 10】:

A multi-class confusion matrix with no dependencies

# A Simple Confusion Matrix Implementation
def confusionmatrix(actual, predicted, normalize = False):
    """
    Generate a confusion matrix for multiple classification
    @params:
        actual      - a list of integers or strings for known classes
        predicted   - a list of integers or strings for predicted classes
        normalize   - optional boolean for matrix normalization
    @return:
        matrix      - a 2-dimensional list of pairwise counts
    """
    unique = sorted(set(actual))
    matrix = [[0 for _ in unique] for _ in unique]
    imap   = {key: i for i, key in enumerate(unique)}
    # Generate Confusion Matrix
    for p, a in zip(predicted, actual):
        matrix[imap[p]][imap[a]] += 1
    # Matrix Normalization
    if normalize:
        sigma = sum([sum(matrix[imap[i]]) for i in unique])
        matrix = [row for row in map(lambda i: list(map(lambda j: j / sigma, i)), matrix)]
    return matrix

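The approach here is to pair up the unique classes found in the actual vector into a 2-dimensional list. From there, we simply iterate over the zipped actual and predicted vectors and fill in the counts, using the index map to find the matrix positions.

Usage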

cm = confusionmatrix(
    [1, 1, 2, 0, 1, 1, 2, 0, 0, 1], # actual
    [0, 1, 1, 0, 2, 1, 2, 2, 0, 2]  # predicted
)

# And The Output
print(cm)
[[2, 1, 0], [0, 2, 1], [1, 2, 1]]

Note: the actual classes are along the columns and the predicted classes are along the rows.

# Actual
# 0  1  2
  #  #  #   
[[2, 1, 0], # 0
 [0, 2, 1], # 1  Predicted
 [1, 2, 1]] # 2

Class names can be strings or integers

cm = confusionmatrix(
    ["B", "B", "C", "A", "B", "B", "C", "A", "A", "B"], # actual
    ["A", "B", "B", "A", "C", "B", "C", "C", "A", "C"]  # predicted
)

# And The Output
print(cm)
[[2, 1, 0], [0, 2, 1], [1, 2, 1]]

You can also return a matrix of proportions (normalized)

cm = confusionmatrix(
    ["B", "B", "C", "A", "B", "B", "C", "A", "A", "B"], # actual
    ["A", "B", "B", "A", "C", "B", "C", "C", "A", "C"], # predicted
    normalize = True
)

# And The Output
print(cm)
[[0.2, 0.1, 0.0], [0.0, 0.2, 0.1], [0.1, 0.2, 0.1]]

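A more robust solution

Since writing this post, I have updated my library implementation into a class that uses a confusion matrix representation internally to compute statistics, and that can also pretty-print the confusion matrix itself. See this Gist.

Example usage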

# Actual & Predicted Classes
actual      = ["A", "B", "C", "C", "B", "C", "C", "B", "A", "A", "B", "A", "B", "C", "A", "B", "C"]
predicted   = ["A", "B", "B", "C", "A", "C", "A", "B", "C", "A", "B", "B", "B", "C", "A", "A", "C"]

# Initialize Performance Class
performance = Performance(actual, predicted)

# Print Confusion Matrix
performance.tabulate()

Output:

===================================
        Aᴬ      Bᴬ      Cᴬ

Aᴾ      3       2       1

Bᴾ      1       4       1

Cᴾ      1       0       4

Note: classᴾ = Predicted, classᴬ = Actual
===================================

For a normalized matrix:

# Print Normalized Confusion Matrix
performance.tabulate(normalized = True)

With the normalized output:

===================================
        Aᴬ      Bᴬ      Cᴬ

Aᴾ      17.65%  11.76%  5.88%

Bᴾ      5.88%   23.53%  5.88%

Cᴾ      5.88%   0.00%   23.53%

Note: classᴾ = Predicted, classᴬ = Actual
===================================

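【Comments】:

【Answer 11】:

Nearly a decade has passed, yet the solutions in this post (the ones without sklearn) are convoluted and unnecessarily long. Computing a confusion matrix can be done in a few lines of Python. For example: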

import numpy as np

def compute_confusion_matrix(true, pred):
  '''Computes a confusion matrix using numpy for two np.arrays
  true and pred.

  Results are identical (and similar in computation time) to: 
    "from sklearn.metrics import confusion_matrix"

  However, this function avoids the dependency on sklearn.'''

  K = len(np.unique(true)) # Number of classes 
  result = np.zeros((K, K))

  for i in range(len(true)):
    result[true[i]][pred[i]] += 1

  return result

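【Comments】:

You can make this more than 10x faster with @numba.jit: numpy: 83 ms per loop, numba: 2.4 ms per loop (except for the first call)

Doesn't this only work when the labels are non-negative integers? What about strings and floats?

@Ali Gröch - just map the labels with a helper function. The helper can be as simple as return dict(zip(range(len(np.unique(labels))), np.unique(labels)))

【Answer 12】:

A numpy-only solution that works for any number of classes and needs no loop: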

import numpy as np

classes = 3
true = np.random.randint(0, classes, 50)
pred = np.random.randint(0, classes, 50)

np.bincount(true * classes + pred).reshape((classes, classes))
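The reshape above only succeeds when np.bincount happens to return exactly classes * classes counts. One possible way to make it robust when some classes never appear is to pass the number of classes explicitly and pad the counts with the minlength argument of np.bincount. The function name bincount_confusion is hypothetical, and labels are assumed to be integers from 0 to classes - 1.

```python
import numpy as np


def bincount_confusion(true, pred, classes):
    """Confusion matrix via np.bincount; rows are true, columns are predicted."""
    # Encode each (true, pred) pair as one flat index: true * classes + pred.
    flat = np.asarray(true) * classes + np.asarray(pred)
    # minlength pads the counts so the reshape works even when the
    # highest-numbered cells never occur.
    counts = np.bincount(flat, minlength=classes * classes)
    return counts.reshape(classes, classes)


# Class 2 never appears in either array, yet the result is still 3 x 3.
true = np.array([0, 0, 1, 1, 0])
pred = np.array([0, 1, 1, 1, 0])
print(bincount_confusion(true, pred, classes=3))
```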

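【Comments】:

A small improvement: classes = np.unique(pred).size

Isn't true * classes an element-wise multiplication?

This is a great and very fast solution! Unfortunately, it does not work if not all classes are represented in both arrays: len(np.unique(true)) != classes or len(np.unique(pred)) != classes.

【Answer 13】:

Here is a simple implementation that handles an unequal number of classes in the predicted and actual labels (see examples 3 and 4). I hope this helps!

For anyone just learning this, here is a quick recap. The column labels indicate the predicted class, and the row labels indicate the correct class. In example 1, the top row is [3 1]. Rows indicate the truth, so this row covers the 4 examples whose ground-truth label is "0". Columns indicate predictions, so 3/4 of those samples were correctly labeled "0", while 1/4 was incorrectly labeled "1".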

import numpy as np

def confusion_matrix(actual, predicted):
    classes       = np.unique(np.concatenate((actual,predicted)))
    confusion_mtx = np.empty((len(classes),len(classes)),dtype=int)
    for i,a in enumerate(classes):
        for j,p in enumerate(classes):
            confusion_mtx[i,j] = np.where((actual==a)*(predicted==p))[0].shape[0]
    return confusion_mtx

Example 1:

actual    = np.array([1,1,1,1,0,0,0,0])
predicted = np.array([1,1,1,1,0,0,0,1])
confusion_matrix(actual,predicted)

   0  1
0  3  1
1  0  4

Example 2:

actual    = np.array(["a","a","a","a","b","b","b","b"])
predicted = np.array(["a","a","a","a","b","b","b","a"])
confusion_matrix(actual,predicted)

   0  1
0  4  0
1  1  3

Example 3:

actual    = np.array(["a","a","a","a","b","b","b","b"])
predicted = np.array(["a","a","a","a","b","b","b","z"]) # <-- notice the 3rd class, "z"
confusion_matrix(actual,predicted)

   0  1  2
0  4  0  0
1  0  3  1
2  0  0  0

Example 4:

actual    = np.array(["a","a","a","x","x","b","b","b"]) # <-- notice the 4th class, "x"
predicted = np.array(["a","a","a","a","b","b","b","z"])
confusion_matrix(actual,predicted)

   0  1  2  3
0  3  0  0  0
1  0  2  0  1
2  1  1  0  0
3  0  0  0  0

【Comments】:

【Answer 14】:

A small change to cgnorthcutt's solution to account for string-type variables

def get_confusion_matrix(l1, l2):

    assert len(l1)==len(l2), "Two lists have different size."

    K = len(np.unique(l1))

    # create label-index value
    label_index = dict(zip(np.unique(l1), np.arange(K)))

    result = np.zeros((K, K))
    for i in range(len(l1)):
        result[label_index[l1[i]]][label_index[l2[i]]] += 1

    return result

【Comments】:

【Answer 15】:

It can be computed simply as follows:

def confusionMatrix(actual, pred):

   TP = (actual==pred)[actual].sum()
   TN = (actual==pred)[~actual].sum()
   FP = (actual!=pred)[~actual].sum()
   FN = (actual!=pred)[actual].sum()

   return [[TP, TN], [FP, FN]]
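This works through boolean masking, so it assumes that actual and pred are boolean NumPy arrays of equal length in which True marks the positive class; that assumption is not stated in the answer. A small usage sketch with made-up data:

```python
import numpy as np


def confusionMatrix(actual, pred):
    # Same function as above, repeated so the sketch runs on its own.
    TP = (actual == pred)[actual].sum()   # correct, actually positive
    TN = (actual == pred)[~actual].sum()  # correct, actually negative
    FP = (actual != pred)[~actual].sum()  # wrong, actually negative
    FN = (actual != pred)[actual].sum()   # wrong, actually positive
    return [[TP, TN], [FP, FN]]


# Boolean arrays: True marks the positive class.
actual = np.array([True, True, True, False, False, False])
pred = np.array([True, False, True, False, True, False])

(TP, TN), (FP, FN) = confusionMatrix(actual, pred)
print(TP, TN, FP, FN)  # 2 2 1 1
```

Note that the returned layout, [[TP, TN], [FP, FN]], differs from the row/column layout used by sklearn's confusion_matrix.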

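【Comments】:

Welcome to ***. Please explain your answer and why it might be the answer the asker is looking for.

【Answer 16】:

While the sklearn solution is really clean, it is really slow compared with numpy-only solutions. Let me give you an example and a better/faster solution.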

import time
import numpy as np
from sklearn.metrics import confusion_matrix

num_classes = 3

true = np.random.randint(0, num_classes, 10000000)
pred = np.random.randint(0, num_classes, 10000000)

First, the sklearn solution for reference

start = time.time()
confusion = confusion_matrix(true, pred)

print('time: ' + str(time.time() - start)) # time: 9.31

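Now a much faster solution using only numpy. In this case, instead of iterating over all the samples, we iterate over the cells of the confusion matrix and compute the value of each cell. This makes the process really fast.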

start = time.time()

confusion = np.zeros((num_classes, num_classes)).astype(np.int64)

for i in range(num_classes):
    for j in range(num_classes):
        confusion[i][j] = np.sum(np.logical_and(true == i, pred == j))

print('time: ' + str(time.time() - start)) # time: 0.34

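【Comments】:

【Answer 17】:

I actually got tired of always having to code my confusion matrix in my experiments, so I built my own simple PyPI package for it.

Just install it with pip install easycm

Then import the function with from easycm import plot_confusion_matrix

Finally, plot your data with plot_confusion_matrix(y_true, y_pred)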

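【Comments】:

【Answer 18】:

I actually got tired of always having to code my confusion matrix in my experiments, so I built my own simple PyPI package for it.

Just install it

pip install easycm

Then import the function and use it.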

from easycm import plot_confusion_matrix

...

plot_confusion_matrix(y_true, y_pred)

【Comments】: