PHP Word Frequency Count

Preface: This article was compiled by the editors of 小常识网 (cha138.com) and covers word frequency counting in PHP.

<?php

$filename = "largefile.txt";

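/* Read the whole file into memory; stop if it cannot be read */
$content = file_get_contents($filename);
if ($content === false) {
    exit("Could not read $filename");
}

/* Lowercase the text so that counting is case-insensitive */
$content = strtolower($content);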

/* Split $content into words: every non-letter character is a separator, and empty pieces are discarded */
$wordArray = preg_split('/[^a-z]/', $content, -1, PREG_SPLIT_NO_EMPTY);

/* Drop stop words and single-letter tokens (the leading "." in the pattern matches any one character) */
$filteredArray = array_filter($wordArray, function ($x) {
    return !preg_match('/^(.|a|an|and|the|this|at|in|or|of|is|for|to)$/', $x);
});

/* Build an associative array that maps each remaining word to the number of times it occurs */
$wordFrequencyArray = array_count_values($filteredArray);

/* Sort by count, highest first, keeping the word keys */
arsort($wordFrequencyArray);

/* Take the 10 most frequent words (the array is already sorted) */
$top10words = array_slice($wordFrequencyArray,0,10);

/* Display each word with its count */
foreach ($top10words as $topWord => $frequency) {
    echo "$topWord --  $frequency<br/>";
}

?>
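
The script above loads the entire file with `file_get_contents`, which may be expensive for something like `largefile.txt`. As a minimal sketch, assuming the same tokenization and stop-word rules, the counting could instead be done one line at a time. The function name `countWordsStreaming` and its parameters are hypothetical and not part of the original snippet.

```php
<?php

/**
 * Hypothetical helper (not from the original snippet): counts words
 * line by line so the whole file never has to sit in memory at once.
 */
function countWordsStreaming(string $filename, array $stopWords, int $limit = 10): array
{
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $filename");
    }

    // Flip the list so stop-word lookups become isset() checks on keys.
    $stop = array_flip($stopWords);
    $counts = [];

    while (($line = fgets($handle)) !== false) {
        // Same tokenization as the original: lowercase, split on non-letters.
        $words = preg_split('/[^a-z]+/', strtolower($line), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            // Skip single-letter tokens and stop words, as the original regex does.
            if (strlen($word) < 2 || isset($stop[$word])) {
                continue;
            }
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    fclose($handle);

    // Highest counts first, then keep only the top $limit entries.
    arsort($counts);
    return array_slice($counts, 0, $limit, true);
}
```

Calling `countWordsStreaming("largefile.txt", ['a', 'an', 'and', 'the', 'this', 'at', 'in', 'or', 'of', 'is', 'for', 'to'])` should return the same kind of word-to-count array as `$top10words`. Memory use then grows with the number of distinct words rather than with the file size.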
