c_cpp: Remove all duplicated words from a string of words. Not just the extra copies: every instance of a repeated word must be removed.

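Preface: this article was compiled by the editors of 小常识网 (cha138.com). It presents a c_cpp snippet that removes all duplicated words from a string of words, deleting every instance of a repeated word rather than only the extra copies.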

#include <string>
#include <unordered_map>
#include <iostream>
using namespace std;

/*
Remove all duplicated words from a string of words. Every instance of a repeated
word is removed, not just the extra copies.
Idea: in a first pass, count each word's frequency in a hash table. In a second
pass, scan the string again and copy only the words whose count is exactly 1.
Non-letter characters (the separators) are copied through unchanged.
*/

// A word is a maximal run of ASCII letters; every other character is a separator.
bool is_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

string remove_all_duplicated_words(string s) {  // by value: the caller's string is left untouched
    if(s.empty()) return "";
    unordered_map<string, int> hash;
    
    int i = 0, N = s.size();
    
    while(i < N) {
        while(i < N && is_letter(s[i]) == false) i++;
        if(i == N) break;
        
        int word_begin = i, word_end = i;
        while(word_end < N && is_letter(s[word_end])) word_end++;
        string word = s.substr(word_begin, word_end - word_begin);
        
        hash[word]++;
        i = word_end;
    }
    if(hash.empty()) return s;  // no words at all: nothing to remove
    
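    // Second pass: compact s in place. tail never overtakes i, so each word is
    // read before its slot can be overwritten.
    i = 0;
    int tail = 0;
    while(i < N) {
        // Copy separators through unchanged.
        while(i < N && !is_letter(s[i])) s[tail++] = s[i++];

        int word_begin = i, word_end = i;
        while(word_end < N && is_letter(s[word_end])) word_end++;
        string word = s.substr(word_begin, word_end - word_begin);

        // Keep the word only if it occurred exactly once. find() avoids inserting
        // an empty-string entry when the input ends with separators.
        auto it = hash.find(word);
        if(it != hash.end() && it->second == 1) {
            for(char c : word) s[tail++] = c;
        }
        i = word_end;
    }
    return s.substr(0, tail);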
}

int main() {
    string s = "a b ab ab b";
    // Prints "[a    ]": "b" and "ab" are removed entirely, the four spaces remain.
    cout << "[" << remove_all_duplicated_words(s) << "]" << endl;
    return 0;
}
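
The snippet above keeps every separator, so removed words leave runs of spaces behind. If normalized spacing is preferred, the same two-pass counting idea can be sketched with a string stream. This is only an illustrative variant, not part of the original snippet: the function name `remove_repeated_words_normalized` is hypothetical, and it assumes words are whitespace-separated tokens (so punctuation attached to a word counts as part of that word) and that the result should be joined with single spaces.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the original snippet): treats words as
// whitespace-separated tokens and joins the survivors with single spaces.
std::string remove_repeated_words_normalized(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    std::unordered_map<std::string, int> count;

    // Pass 1: split on whitespace and count each word.
    for (std::string w; in >> w; ) {
        ++count[w];
        words.push_back(w);
    }

    // Pass 2: keep only the words that occurred exactly once, in original order.
    std::string out;
    for (const std::string &w : words) {
        if (count[w] != 1) continue;
        if (!out.empty()) out += ' ';
        out += w;
    }
    return out;
}
```

Called on "a b ab ab b" this returns "a", and on "the cat saw the dog" it returns "cat saw dog". The trade-off is that the original spacing is lost and an extra vector of words is held in memory, whereas the in-place version above preserves separators exactly.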
