[Javascript Natural] Break up language strings into parts using Natural

Posted Answer1215


One part of Natural Language Processing (NLP) is “tokenizing” language strings: breaking a string of text into parts by word, sentence, and so on. In this lesson, we will use the natural library to tokenize a string. First, we will break the string into words using WordTokenizer, WordPunctTokenizer, and TreebankWordTokenizer. Then we will use RegexpTokenizer, which splits on a pattern we supply and so can also be used to break a string into sentences.

 

// WordTokenizer splits on non-word characters and drops punctuation.
var natural = require('natural'),
  tokenizer = new natural.WordTokenizer();
console.log(tokenizer.tokenize("your dog has fleas."));
// [ 'your', 'dog', 'has', 'fleas' ]

 

// TreebankWordTokenizer keeps punctuation and splits contractions.
tokenizer = new natural.TreebankWordTokenizer();
console.log(tokenizer.tokenize("my dog hasn't any fleas."));
// [ 'my', 'dog', 'has', 'n\'t', 'any', 'fleas', '.' ]
 
// RegexpTokenizer splits on whatever pattern is passed in, here a hyphen.
tokenizer = new natural.RegexpTokenizer({pattern: /\-/});
console.log(tokenizer.tokenize("flea-dog"));
// [ 'flea', 'dog' ]
 
// WordPunctTokenizer emits punctuation marks as separate tokens.
tokenizer = new natural.WordPunctTokenizer();
console.log(tokenizer.tokenize("my dog hasn't any fleas."));
// [ 'my', 'dog', 'hasn', '\'', 't', 'any', 'fleas', '.' ]
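
The RegexpTokenizer example above splits on a hyphen, but the same idea extends to sentences. The following is a minimal sketch in plain JavaScript rather than natural itself: `splitOnGaps` is a hypothetical helper, and the sentence-boundary pattern is an assumption that ignores abbreviations, decimals, and similar edge cases. A comparable pattern could presumably be passed as the `pattern` option of RegexpTokenizer, though the exact output (whitespace handling, for instance) may differ.

```javascript
// Hypothetical helper, not part of natural: split text on a "gap" pattern,
// trim each piece, and drop empty pieces.
function splitOnGaps(text, pattern) {
  return text
    .split(pattern)
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part.length > 0; });
}

// Assumed sentence boundary: one or more of . ! ? followed by
// whitespace or the end of the string.
var sentenceGap = /[.!?]+(?:\s+|$)/;

var sentences = splitOnGaps("My dog has fleas. Does yours? I hope not!", sentenceGap);
console.log(sentences);
// [ 'My dog has fleas', 'Does yours', 'I hope not' ]
```

Because the punctuation is part of the gap, it is discarded along with the whitespace. If the punctuation needs to be preserved, a matching approach (collecting the sentences rather than splitting on the gaps) would be a better fit.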

 
