Basic natural language processing tasks, illustrated with spaCy function calls
import spacy

# The original post loaded the pinned model 'en_core_web_md-1.2.1';
# current spaCy releases load an installed model by its package name.
nlp = spacy.load("en_core_web_md")

text = (
    "The ways to process documents are so varied and application- and "
    "language-dependent that I decided to not constrain them by any interface. "
    "Instead, a document is represented by the features extracted from it, "
    'not by its "surface" string form: how you get to the features is up to you. '
    "Below I describe one common, general-purpose approach (called bag-of-words), "
    "but keep in mind that different application domains call for different "
    "features, and, as always, it's garbage in, garbage out..."
)
docx = nlp(text)

# Feature tests

# 1. Tokenization
print("#################tokenization")
for token in docx:
    print(token)

# 2. Part-of-speech tagging (string label, then its integer ID)
print("#################part of speech tagging")
for token in docx:
    print(token, token.pos_, token.pos)

# 3. Named entity recognition (string label, then its integer ID)
print("################# Named Entity Recognition")
for ent in docx.ents:
    print(ent, ent.label_, ent.label)

# 4. Lemmatization (lemma string, then its integer ID)
print("#################Lemmatize")
for token in docx:
    print(token, token.lemma_, token.lemma)

# 5. Noun phrase extraction
print("#################Noun Phrase Extraction")
for chunk in docx.noun_chunks:
    print(chunk)

# 6. Sentence segmentation
print("#################Sentence segmentation")
for sent in docx.sents:
    print(sent)
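The script above only prints each annotation. When the results feed a later step, it can be handier to gather them into plain Python structures. The following is a minimal sketch of that idea: the function name `annotate` and the dictionary keys are illustrative choices, not part of spaCy, and it assumes the object passed in exposes the same attributes used above (`pos_`, `lemma_`, `ents`, `noun_chunks`, `sents`, plus the `text` attribute of tokens and spans).

```python
def annotate(doc):
    """Collect the annotations from a parsed document into one dictionary.

    `doc` is assumed to be a spaCy Doc, or any object exposing the same
    attributes. The function name and result keys are hypothetical.
    """
    return {
        # Surface form of every token
        "tokens": [token.text for token in doc],
        # (token, coarse part-of-speech label) pairs
        "pos": [(token.text, token.pos_) for token in doc],
        # (entity text, entity label) pairs
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
        # (token, lemma) pairs
        "lemmas": [(token.text, token.lemma_) for token in doc],
        # Text of each noun phrase
        "noun_chunks": [chunk.text for chunk in doc.noun_chunks],
        # Text of each sentence
        "sentences": [sent.text for sent in doc.sents],
    }
```

With a loaded pipeline, it would be called as `result = annotate(nlp(text))`. Note that `noun_chunks` and `sents` depend on the pipeline providing a parser or sentence boundaries, so a model such as the one loaded above is assumed.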