Error "FutureWarning: elementwise comparison failed" while preparing and visualizing classification report in scikit learn

Posted 2023-03-12

【Asked】2021-03-19 23:51:17 【Question】:

I tried to use the code given in this answer to visualize the classification report.

It works if I do not pass labels to classification_report():

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report 

y_test_1 = np.loadtxt('ytest1.txt')
y_pred_1 = np.loadtxt('ypred1.txt')
_classification_report = classification_report(y_test_1, y_pred_1)

_classification_report = "\n".join(list(_classification_report.split("\n")[i] for i in [0,1,2,3,4,5,6,7,8,9,10,11,12,15]))

plot_classification_report(_classification_report)  # helper taken from the linked answer

This can be seen in this colab notebook.

But if I include the labels:

_categories=['blues', 'classical', 'country', 'disco', 'hiphop', 'jazz', 'metal', 'pop', 'reggae', 'rock']

y_test_1 = np.loadtxt('ytest1.txt')
y_pred_1 = np.loadtxt('ypred1.txt')
_classification_report = classification_report(y_test_1, y_pred_1, labels = _categories)

_classification_report = "\n".join(list(_classification_report.split("\n")[i] for i in [0,1,2,3,4,5,6,7,8,9,10,11,12,15]))

plot_classification_report(_classification_report)

it starts giving the following warnings:

F:\ProgramFiles\python\lib\site-packages\numpy\lib\arraysetops.py:565: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  mask &= (ar1 != a)
F:\ProgramFiles\python\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
F:\ProgramFiles\python\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
F:\ProgramFiles\python\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
F:\ProgramFiles\python\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

I don't know the exact cause here. It could be either of the following two:

blues, classical

y_test, y_pred, str

You can find ytest1.txt and ypred1.txt here.
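The UndefinedMetricWarning lines in the output above point at the zero_division parameter. As a minimal sketch on made-up data (the arrays below are assumptions, not the contents of ytest1.txt or ypred1.txt), passing zero_division=0 keeps the same 0.0 scores without emitting the warning. It only silences the message and does not explain why the labels have no samples in the first place.

```python
import numpy as np
from sklearn.metrics import classification_report

# Toy data (hypothetical): label 2 occurs in y_true but is never predicted.
y_true = np.array([0, 1, 2, 2])
y_pred = np.array([0, 1, 1, 0])

# zero_division=0 sets the ill-defined precision to 0.0 without a warning.
report = classification_report(
    y_true, y_pred, labels=[0, 1, 2], zero_division=0, output_dict=True
)
print(report["2"]["precision"])
```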

【Answer 1】:

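I had wrongly assumed that the labels parameter of classification_report takes the labels to display on the plot, because I did not realize that it filters the data. I thought it simply put the labels in the output image (I find the documentation somewhat ambiguous about what labels does). Considering that I might be wrong, I tried converting the encoded labels back to their original form.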
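To make the filtering concrete: np.loadtxt returns floating-point values by default, so the arrays in the question hold numeric codes while labels holds strings, and none of the requested labels occurs in the data. A minimal sketch with made-up values (the arrays, and the assumption that code i corresponds to the i-th category, are illustrative only):

```python
import numpy as np

_categories = ['blues', 'classical', 'country']

# Hypothetical stand-in for np.loadtxt('ytest1.txt'): float-encoded classes.
y_test = np.array([0.0, 1.0, 2.0, 1.0])

# Which of the requested string labels actually occur in the data? None.
values_in_data = set(np.unique(y_test).tolist())
matching = values_in_data & set(_categories)
print(matching)

# Assumption: code i stands for _categories[i]; decode by indexing.
decoded = [_categories[int(v)] for v in y_test]
print(decoded)
```

With no overlap between the data and the requested labels, every requested label ends up with zero support, which matches the "no true samples" and "no predicted samples" warnings.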

I do the encoding as follows:

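from sklearn.preprocessing import LabelEncoder

y = ...  # load data set...
encoder = LabelEncoder()  # must be created before fit_transform is called
y = encoder.fit_transform(y)  # string labels -> integer codes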

Using the same encoder object, I get the original labels back with its inverse_transform() method, as follows:

import numpy as np
# Decode integer-encoded labels back to their original strings before building the report
if np.issubdtype(y_test.dtype, np.integer):
    _classification_report = classification_report(encoder.inverse_transform(y_test), encoder.inverse_transform(y_pred), labels=_categories)
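As a small illustration of the round trip, assuming a few made-up genre strings rather than the real data set: LabelEncoder assigns integer codes in sorted order of the class names, and inverse_transform maps the codes back to the strings.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels, used only to show the encode/decode round trip.
genres = np.array(['jazz', 'blues', 'rock', 'blues'])

encoder = LabelEncoder()
codes = encoder.fit_transform(genres)       # classes are sorted alphabetically
decoded = encoder.inverse_transform(codes)  # back to the original strings

print(list(encoder.classes_))
print(codes.tolist())
print(decoded.tolist())
```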

If there is a more standard way, please let me know.
