Pandas-Profiling in text

Posted 2023-03-06


【Title】: Pandas-Profiling in text 【Posted】: 2021-07-12 18:16:44 【Question】:

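I want to run profiling on a text column, but first I have to clean the text and identify the most frequent words. When I apply nltk, though, I get back a list, and I can't build the profile from it. Is there a way to do this?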

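import re
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import CountVectorizer

# Build the stop-word set once instead of on every iteration
stop_words = set(stopwords.words('portuguese'))

corpus = []
for i in range(17000):
  # Replace everything except ASCII letters (accented letters too) with a space
  review = re.sub('[^a-zA-Z]', ' ', dataset['Descrição Reparo'][i])
  review = review.lower().split()
  review = [word for word in review if word not in stop_words]
  review = ' '.join(review)
  corpus.append(review)

cv = CountVectorizer()
X = cv.fit_transform(corpus).toarray()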

【Comments】:

【Answer 1】:

Hope this works:

feature = cv.get_feature_names_out()  # get_feature_names() was removed in scikit-learn 1.2

Now you can use pandas profiling.

【Discussion】:
