Naive Bayes and SVM classification - how to plot accuracy on x y axis?

Posted 2023-03-12

【Title】: Naive Bayes and SVM classification - how to plot accuracy on x y axis? 【Posted】: 2020-08-25 12:54:00 【Question】:

I am trying to generate some line plots with x and y axes to show the accuracy of two different classification algorithms: Naive Bayes and SVM.

I train/test the data like this:

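import matplotlib.pyplot as plt
import pandas as pd
from sklearn import linear_model, metrics, model_selection, naive_bayes, preprocessing
from sklearn.feature_extraction.text import TfidfVectorizer

# split the dataset into training and validation datasets
train_x, valid_x, train_y, valid_y = model_selection.train_test_split(result['post'], result['type'], test_size=0.30, random_state=1)

# label encode the target variable
encoder = preprocessing.LabelEncoder()
train_y = encoder.fit_transform(train_y)
# reuse the mapping learned on the training labels; refitting on the
# validation labels could assign different integer codes
valid_y = encoder.transform(valid_y)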

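def tokenizersplit(text):
    # whitespace tokenizer; the parameter is named text to avoid shadowing the built-in str
    return text.split()

tfidf_vect = TfidfVectorizer(tokenizer=tokenizersplit, encoding='utf-8', min_df=2, ngram_range=(1, 2), max_features=25000)

# learn the vocabulary and IDF weights (note: this uses the full corpus, validation posts included)
tfidf_vect.fit(result['post'])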

xtrain_tfidf = tfidf_vect.transform(train_x)
xvalid_tfidf = tfidf_vect.transform(valid_x)

def train_model(classifier, trains, t_labels, valids, v_labels):
    # fit the training dataset on the classifier
    classifier.fit(trains, t_labels)

    # predict the labels on validation dataset
    predictions = classifier.predict(valids)

    # accuracy_score expects (y_true, y_pred)
    return metrics.accuracy_score(v_labels, predictions)

# Naive Bayes
accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_tfidf, train_y, xvalid_tfidf, valid_y)
print("NB accuracy:", accuracy)

However, for an assignment I need to plot something on x/y axes using matplotlib. I tried this:

m=linear_model.LogisticRegression()
m.fit(xtrain_tfidf, train_y)
y_pred = m.predict(xvalid_tfidf)
print(metrics.classification_report(valid_y, y_pred))
plt.plot(valid_y, y_pred)
plt.show()

But this gives me a plot that does not help (image not preserved).

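I need something that makes it easier to compare the accuracy of Naive Bayes vs. SVM vs. other algorithms. How can I do that?

Plotting the classification report: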

plt.plot(metrics.classification_report(valid_y, y_pred))
plt.show()

My classification report output:

              precision    recall  f1-score   support

           0       1.00      0.18      0.31        11
           1       0.00      0.00      0.00        14
           2       0.00      0.00      0.00        19
           3       0.50      0.77      0.61        66
           4       0.39      0.64      0.49        47
           5       0.00      0.00      0.00        23

    accuracy                           0.46       180
   macro avg       0.32      0.27      0.23       180
weighted avg       0.35      0.46      0.37       180

Edit: the error comes from this line:

df = pd.DataFrame(metrics.classification_report(valid_y, y_pred)).transpose()

which raises

ValueError: DataFrame constructor not properly called!
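
A small check of why the constructor call above fails may help here. This is only a sketch with made-up labels (`y_true` and `y_pred` below are hypothetical, not the asker's data): by default `classification_report` returns a single formatted string, and pandas cannot build a DataFrame from a bare string.

```python
import pandas as pd
from sklearn import metrics

# Hypothetical labels, used only to inspect what classification_report returns.
y_true = [0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 2, 2, 2, 1]

# With default arguments the report is one formatted string, not a table object.
report = metrics.classification_report(y_true, y_pred)
print(type(report))

# Passing that string to the DataFrame constructor raises the ValueError.
error_message = None
try:
    pd.DataFrame(report)
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```

The same reasoning applies to `plt.plot(report)`: there are no numeric columns to draw until the report is converted into structured data.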

【Comments】:

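The classification report is a table; it is not meant to be plotted. First try simply running classification_report(valid_y, y_pred) and see what it returns.

@desertnaut Right. I don't necessarily need to use the classification report here, but if I try to just plot it, I get nothing (see the image in my update).

Can you post the output of metrics.classification_report(valid_y, y_pred)? If it is a table, you can scatter-plot two of its axes by passing them to plt.scatter(x=.., y=.., ..).

@HirakSarkar Yes, see my edit. It is a table.

Can you also print metrics.classification_report(valid_y, y_pred).shape? The table seems to be truncated; there seem to be more than 4 columns. Maybe put it in a variable: `df = metrics.classification_report(valid_y, y_pred)`, then `print(df.shape)` and `print(df.columns)`.

【Answer 1】: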

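metrics.classification_report summarizes the prediction results, so it is meant for printing a "report", not for plotting. If you want the table in a visual format, you can follow https://***.com/a/34304414/4005668.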

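Otherwise you can capture the report in a DataFrame. Note that classification_report returns a plain string by default, so request the dictionary form with output_dict=True:

import pandas as pd
# put it in a dataframe (output_dict=True returns a dict instead of a string)
df = pd.DataFrame(metrics.classification_report(.., output_dict=True)).transpose()
# plot the dataframe
df.plot()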
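
To make the DataFrame route concrete, here is a minimal sketch under stated assumptions: the labels are hypothetical stand-ins for `valid_y` and `y_pred`, and the bar-chart layout is one possible choice rather than the answerer's method. It relies on the `output_dict=True` option of `classification_report`, which returns a nested dict instead of a string.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import metrics

# Hypothetical labels standing in for valid_y and y_pred.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0, 2, 0]

# output_dict=True yields a nested dict keyed by class label and summary row.
report = metrics.classification_report(y_true, y_pred, output_dict=True)

# "accuracy" is a single float rather than a per-metric dict, so set it aside.
accuracy = report.pop("accuracy")

# One row per class or summary, one column per metric.
df = pd.DataFrame(report).transpose()

# Keep only the per-class rows; drop the two averaged summary rows.
per_class = df.drop(index=["macro avg", "weighted avg"])

# Grouped bars: classes on the x axis, metric scores on the y axis.
ax = per_class[["precision", "recall", "f1-score"]].plot.bar(rot=0)
ax.set_xlabel("class")
ax.set_ylabel("score")
ax.set_ylim(0, 1)
ax.set_title(f"Per-class metrics (accuracy = {accuracy:.2f})")
plt.show()
```

The `support` column is left out of the plot because it holds sample counts, which sit on a different scale from the 0 to 1 scores.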

【Comments】:

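Yes, but that answer is outdated, which leads to my second issue ***.com/questions/61705257/…

I get an error when inserting my own classification report into a DataFrame. See my edit.

Can you upload your Jupyter notebook or something similar so that I can reproduce the error? With my toy example it does not give an error.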
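
For the original goal of comparing Naive Bayes, SVM and other algorithms on one set of axes, one possible approach is to collect one accuracy per classifier and draw them as bars. The sketch below is illustrative only: the tiny corpus, the classifier choices (`LinearSVC` as the SVM) and the `accuracies` dict are assumptions, not taken from the question's data. A bar chart is used instead of a line plot because the x axis is categorical.

```python
import matplotlib.pyplot as plt
from sklearn import linear_model, metrics, model_selection, naive_bayes, svm
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for result['post'] and result['type'].
sports = ["the team won the match", "a great goal in the game",
          "the player scored twice", "coach praised the team",
          "fans cheered the final goal", "the match ended in a draw"]
tech = ["the new phone has a fast chip", "update the software on your laptop",
        "the app crashed after the update", "a faster chip for the laptop",
        "install the app on the phone", "software bug fixed in the update"]
posts = sports + tech
labels = ["sports"] * len(sports) + ["tech"] * len(tech)

train_x, valid_x, train_y, valid_y = model_selection.train_test_split(
    posts, labels, test_size=0.30, random_state=1, stratify=labels)

# Fit the vectorizer on the training posts only, then transform both splits.
tfidf_vect = TfidfVectorizer()
xtrain_tfidf = tfidf_vect.fit_transform(train_x)
xvalid_tfidf = tfidf_vect.transform(valid_x)

classifiers = {
    "Naive Bayes": naive_bayes.MultinomialNB(),
    "SVM (linear)": svm.LinearSVC(),
    "Logistic Regression": linear_model.LogisticRegression(),
}

# Train each classifier and record its accuracy on the validation split.
accuracies = {}
for name, classifier in classifiers.items():
    classifier.fit(xtrain_tfidf, train_y)
    predictions = classifier.predict(xvalid_tfidf)
    accuracies[name] = metrics.accuracy_score(valid_y, predictions)

# x axis: classifier name, y axis: validation accuracy.
fig, ax = plt.subplots()
ax.bar(list(accuracies.keys()), list(accuracies.values()))
ax.set_xlabel("classifier")
ax.set_ylabel("validation accuracy")
ax.set_ylim(0, 1)
ax.set_title("Accuracy by classifier")
plt.show()
```

With the question's own data, the same loop could reuse `xtrain_tfidf`, `train_y`, `xvalid_tfidf` and `valid_y` in place of the toy corpus.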
