TextBlob in Practice: Naive Bayes Text Classification

Posted by AI小白入门



  • A short walkthrough of training and using a Naive Bayes classifier with TextBlob.

  • Reference: https://textblob.readthedocs.io/en/dev/classifiers.html#classifiers


1. Prepare the data: a training set and a test set

train = [
    ('I love this sandwich.', 'pos'),
    ('this is an amazing place!', 'pos'),
    ('I feel very good about these beers.', 'pos'),
    ('this is my best work.', 'pos'),
    ("what an awesome view", 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with this", 'neg'),
    ('he is my sworn enemy!', 'neg'),
    ('my boss is horrible.', 'neg')
]

test = [
    ('the beer was good.', 'pos'),
    ('I do not enjoy my job', 'neg'),
    ("I ain't feeling dandy today.", 'neg'),
    ("I feel amazing!", 'pos'),
    ('Gary is a friend of mine.', 'pos'),
    ("I can't believe I'm doing this.", 'neg')
]
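Before training, it can be worth checking how many examples each label has, since a heavily skewed set would bias the class priors. The snippet below is a small sketch using only the standard library; the helper name `label_counts` is my own and is not part of TextBlob.

```python
from collections import Counter

# The same labelled sentences as in step 1, as (text, label) pairs.
train = [
    ('I love this sandwich.', 'pos'),
    ('this is an amazing place!', 'pos'),
    ('I feel very good about these beers.', 'pos'),
    ('this is my best work.', 'pos'),
    ("what an awesome view", 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with this", 'neg'),
    ('he is my sworn enemy!', 'neg'),
    ('my boss is horrible.', 'neg'),
]
test = [
    ('the beer was good.', 'pos'),
    ('I do not enjoy my job', 'neg'),
    ("I ain't feeling dandy today.", 'neg'),
    ("I feel amazing!", 'pos'),
    ('Gary is a friend of mine.', 'pos'),
    ("I can't believe I'm doing this.", 'neg'),
]


def label_counts(dataset):
    """Count how many examples carry each label (hypothetical helper)."""
    return Counter(label for _text, label in dataset)


train_counts = label_counts(train)
test_counts = label_counts(test)
print(train_counts)
print(test_counts)
```

Both sets turn out to be balanced: five examples per class for training and three per class for testing.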


2. Import the Naive Bayes classifier

from textblob.classifiers import NaiveBayesClassifier


3. Train the classifier on the training set

nb_model = NaiveBayesClassifier(train)
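To get a feel for what happens during training, here is a from-scratch sketch of a Naive Bayes text classifier. It is a simplification under my own assumptions (word counts with add-one smoothing, a regex tokenizer, unseen words ignored) and is not TextBlob's implementation, so its numbers will not match TextBlob's. All function names here are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

train = [
    ('I love this sandwich.', 'pos'),
    ('this is an amazing place!', 'pos'),
    ('I feel very good about these beers.', 'pos'),
    ('this is my best work.', 'pos'),
    ("what an awesome view", 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with this", 'neg'),
    ('he is my sworn enemy!', 'neg'),
    ('my boss is horrible.', 'neg'),
]


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(dataset):
    """Collect per-label document counts, per-label word counts, and the vocabulary."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in dataset:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def log_scores(text, model):
    """Return log P(label) + sum of log P(word | label) for each label."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # assumption: ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return scores


def classify(text, model):
    """Pick the label with the highest log score."""
    scores = log_scores(text, model)
    return max(scores, key=scores.get)


model = train_naive_bayes(train)
print(classify("This is an amazing library!", model))
```

"Training" here is nothing more than counting: how often each label occurs and how often each word occurs under each label. Classification then multiplies those estimated probabilities together (summed in log space to avoid underflow).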


4. Classify a new sample

dev_sen = "This is an amazing library!"
print(nb_model.classify(dev_sen))

pos

You can also get the probability that the sample belongs to a given class:

dev_sen_prob = nb_model.prob_classify(dev_sen)
print(dev_sen_prob.prob("pos"))

0.980117820324005
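The number above is a posterior probability. As a generic statement of the naive Bayes rule (the exact probability estimates and smoothing TextBlob uses are not covered here), for a sentence d with extracted features f_1, ..., f_n and the two labels used in this post:

```latex
P(c \mid d) =
  \frac{P(c)\,\prod_{i=1}^{n} P(f_i \mid c)}
       {\sum_{c' \in \{\mathrm{pos},\,\mathrm{neg}\}} P(c')\,\prod_{i=1}^{n} P(f_i \mid c')}
```

The "naive" part is the assumption that the features are independent given the label, which is what lets the likelihood factor into a product. Because the denominator normalizes over both labels, the two class probabilities sum to 1, so the "neg" probability for this sentence is about 0.02.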


5. Compute the model's accuracy on the test set

print(nb_model.accuracy(test))

0.8333333333333334
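Accuracy is simply the fraction of test sentences whose predicted label matches the gold label, so 0.8333... on six sentences means five were classified correctly. The sketch below reproduces that arithmetic by hand. Note that the `predicted` list is hypothetical: the output above does not tell us which sentence was misclassified, so I picked one arbitrarily for illustration.

```python
# Gold labels of the six test sentences, in the order given in step 1.
gold = ['pos', 'neg', 'neg', 'pos', 'pos', 'neg']

# Hypothetical predictions for illustration only: exactly one is wrong,
# but which one is an arbitrary choice, not something the post reports.
predicted = ['pos', 'neg', 'neg', 'pos', 'neg', 'neg']


def accuracy(gold_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(g == p for g, p in zip(gold_labels, predicted_labels))
    return correct / len(gold_labels)


score = accuracy(gold, predicted)
print(score)
```

With only six test sentences, a single mistake moves the score by almost 17 percentage points, so this figure is best read as a sanity check rather than a reliable estimate.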





The code is also available on GitHub: https://github.com/yuquanle/StudyForNLP/blob/master/NLPtools/TextBlob2TextClassifier.ipynb





For more of my notes, follow:

Zhihu column: https://www.zhihu.com/people/yuquanle/columns

