Data Structures 05-Tree 9: Huffman Codes (30 points)
Posted by keiiha
In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
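The claims in this example can be checked mechanically. The following is a minimal sketch, not part of the problem statement; the helper names `isPrefixFree` and `encodedBits` are made up for illustration. It tests the prefix property by sorting the codes and comparing neighbours, and counts the bits needed to encode a string.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: true if no code is a prefix of (or equal to) another.
// After lexicographic sorting, a code that is a prefix of any other code is
// also a prefix of its immediate successor, so adjacent checks are enough.
bool isPrefixFree(std::vector<std::string> codes) {
    std::sort(codes.begin(), codes.end());
    for (std::size_t i = 0; i + 1 < codes.size(); ++i) {
        const std::string& a = codes[i];
        const std::string& b = codes[i + 1];
        if (b.compare(0, a.size(), a) == 0) return false;  // b starts with a
    }
    return true;
}

// Hypothetical helper: number of bits used when `text` is encoded with `code`.
// Assumes every character of `text` has an entry in `code`.
std::size_t encodedBits(const std::string& text,
                        const std::map<char, std::string>& code) {
    std::size_t bits = 0;
    for (char ch : text) bits += code.at(ch).size();
    return bits;
}
```

With these helpers, the three valid code sets above are reported as prefix-free, the set {'a'=0, 'x'=01, 'u'=011, 'z'=001} is rejected because "0" is a prefix of "01", and "aaaxuaxz" costs 4·1 + 2·2 + 1·3 + 1·3 = 14 bits under the first code set.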
Input Specification:
Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:
c[1] f[1] c[2] f[2] ... c[N] f[N]
where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:
c[i] code[i]
where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 '0's and '1's.
Output Specification:
For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.
Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.
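Because any prefix code of optimal total length is accepted, a checker only needs the optimal cost, not the Huffman tree itself. A minimal sketch of that idea follows, assuming `std::priority_queue` in place of a hand-written heap; the function name `optimalCost` is hypothetical. It relies on the fact that the weighted path length of a Huffman tree equals the sum of all merged weights.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: optimal total code length (weighted path length)
// for the given frequencies, using the greedy Huffman merging order.
int optimalCost(const std::vector<int>& freqs) {
    // Min-heap over the frequencies.
    std::priority_queue<int, std::vector<int>, std::greater<int>>
        pq(freqs.begin(), freqs.end());
    int cost = 0;
    while (pq.size() > 1) {
        int a = pq.top(); pq.pop();
        int b = pq.top(); pq.pop();
        cost += a + b;   // each merge adds one bit to every symbol below it
        pq.push(a + b);
    }
    return cost;
}
```

For the frequencies 4, 2, 1, 1 of "aaaxuaxz" the merges cost 2, 4 and 8, giving 14 bits, which matches the figure in the problem statement.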
Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
Reference article: https://zhuanlan.zhihu.com/p/121684742
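Build a Huffman tree with the help of a min-heap and compute its weighted path length, WPL.
Then build a tree from the Huffman codes each student submits.
While building it, accumulate (frequency of the character × length of its code) to get the weighted path length of the student's tree, wplOfTest.
If the position where a code should be stored is a non-leaf node, or the path to it runs into a node that already holds another character, mark the submission as an invalid Huffman code and skip the processing of its remaining codes.
If the student's codes do form a valid tree, compare wplOfTest with WPL; if they differ, the submission is also invalid.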
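The steps above can be sketched compactly as follows. This is an illustration under stated assumptions, not the author's program: `TrieNode`, `insertCode` and `isCorrect` are hypothetical names, nodes are stored in a `std::vector` instead of being allocated with `new`, and the optimal WPL is passed in as a parameter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node of a binary trie: child indices (-1 = absent) and a leaf flag.
struct TrieNode {
    int child[2] = {-1, -1};
    bool isLeaf = false;
};

// Inserts one code made of '0'/'1'; returns false on a prefix conflict.
bool insertCode(std::vector<TrieNode>& trie, const std::string& code) {
    int cur = 0;  // index 0 is the root
    for (char bit : code) {
        int b = bit - '0';
        if (trie[cur].child[b] == -1) {
            trie.push_back(TrieNode{});
            trie[cur].child[b] = static_cast<int>(trie.size()) - 1;
        }
        cur = trie[cur].child[b];
        if (trie[cur].isLeaf) return false;  // an earlier code is a prefix of this one
    }
    // Ending on a node with children means this code is a prefix of an earlier one.
    if (trie[cur].child[0] != -1 || trie[cur].child[1] != -1) return false;
    trie[cur].isLeaf = true;
    return true;
}

// True if the codes are prefix-free and their weighted length equals optimalWpl.
// Assumes codes[i] belongs to the character with frequency freqs[i].
bool isCorrect(const std::vector<int>& freqs,
               const std::vector<std::string>& codes, int optimalWpl) {
    std::vector<TrieNode> trie(1);
    int wpl = 0;
    for (std::size_t i = 0; i < codes.size(); ++i) {
        if (!insertCode(trie, codes[i])) return false;
        wpl += freqs[i] * static_cast<int>(codes[i].size());
    }
    return wpl == optimalWpl;
}
```

For the sample frequencies 1, 1, 1, 3, 3, 6, 6 the optimal WPL works out by hand to 53. The first two submissions pass both checks, the third is prefix-free but costs 63, and the fourth fails when "00" lands on an internal node.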
// Min-heap construction and lookup
#include <iostream>
#include <string>
#include <vector>
using namespace std;

class tnode {
public:
    string c{""};            // character; empty for internal nodes
    int f{0};                // frequency (sum of children for internal nodes)
    tnode* left{nullptr};
    tnode* right{nullptr};
    tnode() = default;
    tnode(string c_, int f_) : c{c_}, f{f_} {}
};

class minHeap { // min-heap ordered by frequency
public:
    vector<tnode*> heap;
    minHeap() {}
    int getSize() {
        return heap.size();
    }
    void build(const vector<tnode*>& list, int n) {
        for (int i = 0; i < n; i++) {
            insertNode(list[i]);
        }
    }
    void insertNode(tnode* newnode) {
        heap.push_back(newnode);
        adjustFromBack();
    }
    void insertNode(string c, int f) {
        tnode* newnode = new tnode{c, f};
        heap.push_back(newnode);
        adjustFromBack();
    }
    tnode* popMinNode() {
        tnode* temp{nullptr};
        if (heap.size()) {
            temp = heap.front();
            swap(heap.front(), heap.back());
            heap.pop_back();
            adjustFromFront();
        }
        return temp;
    }
    // Restores heap order after an insertion with one back-to-front pass (O(n)).
    void adjustFromBack() {
        for (int i = getSize() - 1; i > 0; i--) {
            if (heap[i]->f < heap[(i - 1) / 2]->f) {
                swap(heap[i], heap[(i - 1) / 2]);
            }
        }
    }
    // Restores heap order after a removal with one front-to-back pass (O(n)).
    void adjustFromFront() {
        for (int i = 1; i < getSize(); i++) {
            if (heap[i]->f < heap[(i - 1) / 2]->f) {
                swap(heap[i], heap[(i - 1) / 2]);
            }
        }
    }
};

class huffmanTree {
public:
    huffmanTree() = default;
    tnode* root{nullptr};
    // Repeatedly merges the two lightest nodes until one tree remains.
    void create(minHeap& minheap) {
        tnode *n1, *n2, *newnode;
        int size = minheap.getSize();
        for (int i = 0; i < size - 1; i++) {
            n1 = minheap.popMinNode();
            n2 = minheap.popMinNode();
            newnode = new tnode{"", n1->f + n2->f};
            newnode->left = n1;
            newnode->right = n2;
            minheap.insertNode(newnode);
        }
        root = minheap.heap.front();
    }
    // Weighted path length: sum of (leaf frequency * leaf depth).
    int WPL(tnode* p, int depth) {
        if (!p->left && !p->right) return p->f * depth;
        return WPL(p->left, depth + 1) + WPL(p->right, depth + 1);
    }
    int getWPL() {
        return WPL(root, 0);
    }
};

// Reads one student submission and checks it against the optimal WPL.
bool judge(int wpl, const vector<tnode*>& list) {
    bool flag = true;
    int wplOfTest{0};
    string c, code;
    tnode* head = new tnode;  // root of the tree built from the student's codes
    tnode* tail = head;
    for (size_t i = 0; i < list.size(); i++) {
        cin >> c >> code;
        if (!flag) continue;  // already invalid: just consume the remaining lines
        wplOfTest += list[i]->f * static_cast<int>(code.size());
        for (char bit : code) {
            if (bit == '0') {
                if (!tail->left) tail->left = new tnode;
                tail = tail->left;
            } else if (bit == '1') {
                if (!tail->right) tail->right = new tnode;
                tail = tail->right;
            }
            // A non-empty character marks an existing leaf, so an earlier code
            // is a prefix of this one. (The original tested tail->f, which
            // would miss a leaf whose frequency is 0.)
            if (!tail->c.empty()) {
                flag = false;
                break;
            }
        }
        if (!flag) continue;
        if (tail->left || tail->right) {  // this code ends on an internal node
            flag = false;
            continue;
        }
        tail->f = list[i]->f;
        tail->c = list[i]->c;
        tail = head;
    }
    if (wplOfTest != wpl) return false;
    return flag;
}

int main() {
    int n, m, WPL;
    string c;
    int f;
    cin >> n;
    vector<tnode*> alpha;
    for (int i = 0; i < n; i++) {
        cin >> c >> f;
        tnode* newnode = new tnode{c, f};
        alpha.push_back(newnode);
    }
    minHeap minheap;
    minheap.build(alpha, n);
    huffmanTree ht;
    ht.create(minheap);
    WPL = ht.getWPL();
    cin >> m;
    for (int i = 0; i < m; i++) {
        if (judge(WPL, alpha)) {
            cout << "Yes" << endl;
        } else {
            cout << "No" << endl;
        }
    }
    return 0;
}