Facing AttributeError: 'list' object has no attribute 'lower'
【Posted】: 2019-02-14 13:15:54
【Question】: I have posted my sample training data and test data along with my code. I am trying to train a model with the Naive Bayes algorithm.
However, the reviews end up as a list of lists, so I think that is why my code fails with the following error:
return lambda x: strip_accents(x.lower())
AttributeError: 'list' object has no attribute 'lower'
Can anyone help me resolve this? I am new to Python.
train.txt:
review,label
Colors & clarity is superb,positive
Sadly the picture is not nearly as clear or bright as my 40 inch Samsung,negative
test.txt:
review,label
The picture is clear and beautiful,positive
Picture is not clear,negative
My code:
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import confusion_matrix
from sklearn.feature_extraction.text import CountVectorizer

def load_data(filename):
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0].split())
    return reviews, labels

X_train, y_train = load_data('/Users/7000015504/Desktop/Sep_10/sample_train.csv')
X_test, y_test = load_data('/Users/7000015504/Desktop/Sep_10/sample_test.csv')
clf = CountVectorizer()
X_train_one_hot = clf.fit(X_train)
X_test_one_hot = clf.transform(X_test)
bnbc = BernoulliNB(binarize=None)
bnbc.fit(X_train_one_hot, y_train)
score = bnbc.score(X_test_one_hot, y_test)
print("score of Naive Bayes algo is :" , score)
【Comments】:
lower() is not an attribute of list. Try converting it to a numpy array; then .lower() should work.
Where in your code is return lambda x: strip_accents(x.lower())?
This is a simple piece of code. You can manage it yourself.
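The line from the traceback does not appear in the posted script because it runs inside CountVectorizer, which lowercases each document before tokenizing it. The sketch below reproduces the error with hypothetical toy data shaped like the output of load_data() in the question, then shows that joining each token list back into one string avoids it. The helper name try_fit is an assumption made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data: each review is already split into a list of words,
# which is the shape load_data() returns in the question.
tokenized_reviews = [
    ["Colors", "&", "clarity", "is", "superb"],
    ["Sadly", "the", "picture", "is", "not", "clear"],
]


def try_fit(documents):
    """Return None if fitting succeeds, otherwise the AttributeError message."""
    try:
        CountVectorizer().fit(documents)
    except AttributeError as exc:
        return str(exc)
    return None


# A list of lists fails because each "document" is a list, not a string.
error_message = try_fit(tokenized_reviews)

# Joining the tokens gives one string per review, which is the expected input.
joined_reviews = [" ".join(tokens) for tokens in tokenized_reviews]
joined_error = try_fit(joined_reviews)

print(error_message)
print(joined_error)
```

Skipping the .split() call when loading the file has the same effect as joining the tokens afterwards.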
【Answer 1】:
I made a few changes to your code. The version posted below works; I added comments explaining how to debug the code you posted above.
# These three are not used, so do not import them
# from sklearn.preprocessing import MultiLabelBinarizer
# from sklearn.model_selection import train_test_split
# from sklearn.metrics import confusion_matrix

# This performs the classification task that you want with your input data in the format provided
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer

def load_data(filename):
    """This function works once the second-to-last line is changed from
    reviews.append(line[0].split()) to reviews.append(line[0]).
    CountVectorizer performs the splitting by itself as it sees fit, trust it :)"""
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()  # skip the header line
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0])
    return reviews, labels

X_train, y_train = load_data('train.txt')
X_test, y_test = load_data('test.txt')

vec = CountVectorizer()
# Notice: clf means classifier, not vectorizer.
# While it is syntactically correct, it is bad practice to give misleading names to your objects.
# Replace "clf" with "vec" or something similar.

# Important! You called only the fit method, but did not transform the data
# afterwards. The fit method does not return the transformed data by itself. You
# either have to call .fit() and then .transform() on your training data, or just fit_transform() once.
X_train_transformed = vec.fit_transform(X_train)
X_test_transformed = vec.transform(X_test)

clf = MultinomialNB()
clf.fit(X_train_transformed, y_train)
score = clf.score(X_test_transformed, y_test)
print("score of Naive Bayes algo is :" , score)
The output of this code is:
score of Naive Bayes algo is : 0.5
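The difference between fit() and fit_transform() is easy to check on a tiny input. The following is a minimal sketch with two hypothetical documents, not the data from the question: fit() hands back the vectorizer object itself, while transform() and fit_transform() return the document-term matrix that a classifier can be trained on.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy documents, one plain string per review.
docs = ["Colors & clarity is superb", "Picture is not clear"]

vec = CountVectorizer()
fitted = vec.fit(docs)        # learns the vocabulary and returns the vectorizer itself
matrix = vec.transform(docs)  # sparse matrix: one row per document, one column per word

# fit_transform() does both steps in a single call.
same_matrix = CountVectorizer().fit_transform(docs)

print(fitted is vec)                 # the return value of fit() is not the data
print(matrix.shape)                  # (number of documents, vocabulary size)
print(sorted(vec.vocabulary_))       # words kept by the default tokenizer
```

Passing the return value of fit() to a classifier, as in the question, hands it a CountVectorizer instead of a feature matrix.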
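【Comments】:
Hi @Daniel R. How do we calculate precision and recall?
Hmm, data in this format seems to allow computing the confusion matrix from sklearn.metrics, but not precision_score or recall_score. I will look into it, but for now you can print the confusion matrix and compute them by hand from there, by adding:
y_pred = clf.predict(X_test_transformed)
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_pred))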
Edit: I got it working. You have to pass the argument pos_label='positive' to the precision_score function, like this:
y_pred = clf.predict(X_test_transformed)
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred, pos_label='positive'))
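To make the link between the confusion matrix and the two scores concrete, here is a minimal sketch that derives precision and recall by hand and compares them with sklearn's functions. The label lists are hypothetical and are not results from the question's data. With string labels the default pos_label=1 does not match anything, which is why pos_label has to be given; if the labels contain more than two distinct values, sklearn instead asks for a non-binary average such as average='micro'.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical true and predicted labels, for illustration only.
y_true = ["positive", "negative", "positive", "negative", "positive", "positive", "negative"]
y_pred = ["positive", "positive", "positive", "negative", "negative", "negative", "negative"]

# Rows are true labels, columns are predicted labels, in the order given here.
cm = confusion_matrix(y_true, y_pred, labels=["negative", "positive"])
tn, fp, fn, tp = cm.ravel()

manual_precision = tp / (tp + fp)  # share of predicted positives that are correct
manual_recall = tp / (tp + fn)     # share of actual positives that were found

sk_precision = precision_score(y_true, y_pred, pos_label="positive")
sk_recall = recall_score(y_true, y_pred, pos_label="positive")

# Micro averaging pools the counts of every class; for single-label data
# this equals the plain accuracy.
micro_precision = precision_score(y_true, y_pred, average="micro")

print(manual_precision, sk_precision)
print(manual_recall, sk_recall)
print(micro_precision)
```

Micro-averaged scores answer a different question than the per-class scores for 'positive', so the two should not be compared directly.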
@Daniel R, I tried your edited code above, but I got this error:
"choose another average setting." % y_type)
ValueError: Target is multiclass but average='binary'. Please choose another average setting.
So I changed precision_score and recall_score by replacing the argument pos_label='positive' with average='micro':
print("Precision Score : ", precision_score(y_test, y_pred, average='micro'))
print("Recall Score :", recall_score(y_test, y_pred, average='micro'))
【Answer 2】:
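You need to iterate over each element of the list.
for i, item in enumerate(items):  # items stands for your own list of strings
    items[i] = item.lower()
Note: this only works when you iterate over a list of strings (dtype = str).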
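Because the reviews in the question are a list of lists, the loop needs one more level of nesting. A minimal sketch, assuming every inner element is a string; the function name lowercase_nested and the sample data are hypothetical.

```python
def lowercase_nested(token_lists):
    """Return a new list of lists with every string lowercased."""
    return [[token.lower() for token in tokens] for tokens in token_lists]


# Hypothetical tokenized reviews.
reviews = [["Colors", "&", "clarity"], ["Picture", "is", "NOT", "clear"]]
lowered = lowercase_nested(reviews)

print(lowered)
print(reviews[0][0])  # the original list is left unchanged
```

This only normalizes the case. CountVectorizer with its default settings still expects each document to be a single string, so the token lists would have to be joined before vectorizing.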
【Comments】:
Can you edit my code here? I am new to Python. The reviews contain a list of lists.