Week 2 - Word Frequency Counter Update
New features in the word frequency counter:
HTTPS: https://git.coding.net/li_yuhuan/WordFrequency.git
SSH: [email protected]:li_yuhuan/WordFrequency.git
Code:
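static void Main(string[] args)
{
    string str = "";
    int length = args.Length;
    switch (length)
    {
        case 0:
        {
            // No arguments: count the words in one line typed at the console.
            string line = Console.ReadLine();
            if (line != null)
            {
                Frequency(line);
                // Sort and print, as described for the no-argument case.
                DictionarySort(m_wordList);
            }
            break;
        }
        case 1:
        {
            // One argument: a .txt file name (without extension) in the working directory.
            str = m_workPath + args[0] + ".txt";
            if (File.Exists(str))
            {
                LoadFile(str);
                DictionarySort(m_wordList);
            }
            break;
        }
        case 2:
        {
            if ("-s" == args[0])
            {
                // -s <file>: count the words in the given file.
                str = args[1];
                if (File.Exists(str))
                {
                    LoadFile(str);
                    DictionarySort(m_wordList);
                }
            }
            else if ("dir" == args[0])
            {
                // dir <path>: report the top 10 words of every .txt file in the directory.
                if (Directory.Exists(args[1]))
                {
                    m_top = 10;
                    m_pathList.AddRange(Directory.GetFiles(args[1], "*.txt"));
                    int index;
                    foreach (string path in m_pathList)
                    {
                        // Print the file name (the part after the last backslash).
                        index = path.LastIndexOf("\\");
                        if (index > 0)
                        {
                            string name = path.Substring(index + 1, path.Length - index - 1);
                            Console.WriteLine(name);
                        }
                        LoadFile(path);
                        DictionarySort(m_wordList);
                    }
                }
            }
            break;
        }
        default:
        {
            break;
        }
    }
}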
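The program branches on the number of arguments passed to Main from the console:
1. With no arguments, it counts the words in a line of text typed at the console, along with the frequency of each word, and sorts the result.
2. With one argument, it checks whether a txt file with that name exists in the current working directory; if so, it counts the words in the file and their frequencies and sorts the result.
3. With two arguments:
1) If the arguments are -s plus a file, it checks that the file exists, then counts the words and their frequencies and sorts the result.
2) If the arguments are dir plus a path, it checks that the path exists, then counts the words and their frequencies separately for every txt file in that directory and sorts each result.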
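In the dir case, the code above finds each file name by searching for the last backslash, which assumes Windows-style paths. As a minimal sketch of an alternative, the helper below uses Path.GetFileName, which follows the platform's own separator rules. The class name DirListingSketch and the method TxtFileNames are hypothetical and not part of the original project.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DirListingSketch
{
    // Hypothetical helper: returns the bare names of all .txt files directly
    // inside `directory`, or an empty list if the directory does not exist.
    public static List<string> TxtFileNames(string directory)
    {
        var names = new List<string>();
        if (!Directory.Exists(directory))
        {
            return names;
        }
        foreach (string path in Directory.GetFiles(directory, "*.txt"))
        {
            // Path.GetFileName strips the directory part using the
            // separator characters of the current platform.
            names.Add(Path.GetFileName(path));
        }
        return names;
    }

    static void Main(string[] args)
    {
        // Lists the .txt files of the given directory (default: current directory).
        string dir = args.Length > 0 ? args[0] : ".";
        foreach (string name in TxtFileNames(dir))
        {
            Console.WriteLine(name);
        }
    }
}
```

Returning the names instead of printing them inside the loop keeps the listing step separate from the counting step, which makes it easier to check on its own.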
--------------------------------------------------------------------------------------------------------------------------
static private void LoadFile(string filepath)
{
    string line = string.Empty;
    using (StreamReader reader = new StreamReader(filepath))
    {
        // Read one line at a time and feed each line to the counter.
        line = reader.ReadLine();
        while (line != null)
        {
            Frequency(line);
            line = reader.ReadLine();
        }
    }
}
Reads the file line by line and counts the words in each line.
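The same line-by-line pattern can also be written with File.ReadLines, which enumerates the lines lazily instead of loading the whole file. This is only a sketch of that alternative: the class LineReaderSketch, the method ForEachLine, and the callback parameter are hypothetical names introduced for illustration, with the callback standing in for the project's Frequency method.

```csharp
using System;
using System.IO;

static class LineReaderSketch
{
    // Hypothetical variant of LoadFile: calls `handleLine` once per line
    // and returns the number of lines that were read.
    public static int ForEachLine(string filePath, Action<string> handleLine)
    {
        int count = 0;
        // File.ReadLines yields lines one at a time as the file is read.
        foreach (string line in File.ReadLines(filePath))
        {
            handleLine(line);
            count++;
        }
        return count;
    }

    static void Main(string[] args)
    {
        // Echoes every line of the file named by the first argument.
        if (args.Length == 1 && File.Exists(args[0]))
        {
            int total = ForEachLine(args[0], line => Console.WriteLine(line));
            Console.WriteLine("lines read: " + total);
        }
    }
}
```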
--------------------------------------------------------------------------------------------------------------------------
static private void Frequency(string line)
{
    List<string> words = new List<string>();
    string word = string.Empty;
    // Separator characters, including full-width (Chinese) punctuation.
    char[] split = { ' ', ',', '?', '?', '.', '。', '-', '—', '"', ':', ':', '\r', '\n', '(', ')', '“', '”' };
    words.AddRange(line.Split(split));
    foreach (string str in words)
    {
        // Skip the empty strings produced by consecutive separators.
        if (!string.IsNullOrEmpty(str))
        {
            word = str.ToLower();
            if (m_wordList.ContainsKey(word))
            {
                m_wordList[word] += 1;
            }
            else
            {
                m_wordList.Add(word, 1);
            }
        }
    }
}
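Splits the incoming line on the separator characters and stores the pieces in a list, then walks the list, lowercases each word, and accumulates its count in a Dictionary.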
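The split-and-count step can be sketched in a self-contained form as follows. This is an illustrative variant, not the project's code: the class FrequencySketch, the method CountWords, and the reduced ASCII-only separator set are assumptions. It passes StringSplitOptions.RemoveEmptyEntries so that empty pieces never reach the loop, and uses TryGetValue to do one dictionary lookup per word.

```csharp
using System;
using System.Collections.Generic;

static class FrequencySketch
{
    // Assumed separator set for illustration (ASCII punctuation only).
    static readonly char[] Separators =
        { ' ', ',', '.', '?', ':', '"', '(', ')', '-', '\r', '\n' };

    // Hypothetical helper: adds the word counts of `line` to `counts`.
    public static void CountWords(string line, Dictionary<string, int> counts)
    {
        string[] words = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        foreach (string raw in words)
        {
            string word = raw.ToLowerInvariant();
            // TryGetValue leaves `current` at 0 when the word is not present yet.
            counts.TryGetValue(word, out int current);
            counts[word] = current + 1;
        }
    }

    static void Main()
    {
        var counts = new Dictionary<string, int>();
        CountWords("The cat saw the dog. The dog ran.", counts);
        foreach (KeyValuePair<string, int> kvp in counts)
        {
            Console.WriteLine(kvp.Key + ":" + kvp.Value);
        }
    }
}
```

Passing the dictionary in as a parameter, rather than using a static field such as m_wordList, lets the same helper accumulate counts across several lines or be reset per file by the caller.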
--------------------------------------------------------------------------------------------------------------------------
static private void DictionarySort(Dictionary<string, int> dictionary)
{
    if (dictionary.Count > 0)
    {
        // Copy the key-value pairs into a list and sort by count, highest first.
        List<KeyValuePair<string, int>> lst = new List<KeyValuePair<string, int>>(dictionary);
        lst.Sort(delegate (KeyValuePair<string, int> s1, KeyValuePair<string, int> s2)
        {
            return s2.Value.CompareTo(s1.Value);
        });
        // Clear the dictionary so the next file starts from an empty count.
        dictionary.Clear();
        Console.WriteLine("total " + lst.Count + " words\n");
        int k = 0;
        foreach (KeyValuePair<string, int> kvp in lst)
        {
            // Print at most m_top entries.
            if (k < m_top)
            {
                Console.WriteLine(kvp.Key + ":" + kvp.Value);
                k++;
            }
        }
        Console.WriteLine("----------------------------\n");
    }
}
} // closes the enclosing class, whose declaration is not shown in the post
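Copies the key-value pairs of the dictionary into a list and sorts the list by count in descending order, then prints the number of distinct words and up to m_top entries.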
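The sort-then-truncate step can also be expressed with LINQ. The sketch below is an alternative under stated assumptions, not the author's method: TopWordsSketch and TopN are hypothetical names, and the alphabetical tie-break is an addition, since List.Sort is not a stable sort and the original leaves the order of equal counts unspecified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TopWordsSketch
{
    // Hypothetical helper: returns up to `n` entries, highest count first.
    // Words with equal counts are ordered alphabetically (ordinal comparison).
    public static List<KeyValuePair<string, int>> TopN(Dictionary<string, int> counts, int n)
    {
        return counts
            .OrderByDescending(kvp => kvp.Value)
            .ThenBy(kvp => kvp.Key, StringComparer.Ordinal)
            .Take(n)
            .ToList();
    }

    static void Main()
    {
        var counts = new Dictionary<string, int>
        {
            { "the", 3 }, { "dog", 2 }, { "cat", 1 }, { "ran", 1 }
        };
        foreach (KeyValuePair<string, int> kvp in TopN(counts, 3))
        {
            Console.WriteLine(kvp.Key + ":" + kvp.Value);
        }
    }
}
```

Unlike DictionarySort, this helper does not print or clear anything, so the caller decides what to do with the ranked list.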
--------------------------------------------------------------------------------------------------------------------------
Sample runs:
Feature 1:
--------------------------------------------------------------------------------------------------------------------------
Feature 2:
--------------------------------------------------------------------------------------------------------------------------
Feature 3:
--------------------------------------------------------------------------------------------------------------------------
Feature 4: (not finished)