R: the 50 most frequent words in the data and their counts are already listed; how do I sort them?

Posted 2023-05-09


teens <- read.csv("C:/Users/ASUS/Desktop/snsdata.csv")
str(teens)
interests <- teens[5:40]
sapply(teens[5:40], max)

Please contact me if you need the file. Thank you for your help.

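Suppose your word counts are stored in a named vector called date:
answer <- names(date)[order(date)]
That is all you need.
The result is the words in ascending order of count; pass decreasing = TRUE to order() to put the most frequent words first.

Alternative answer: use table() to count the frequencies, convert the result to a data frame with as.data.frame(), and then sort it with order().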

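Finding the most frequent substring of a string and its number of occurrences

Problem:

Find how many times the most frequent substring of a string occurs.
For example, in the string abcbcbcabc the most frequent substring is bc, which occurs 4 times. (The single characters b and c also occur 4 times; the code below breaks such ties in favor of the longer substring.)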

Approach: use the suffixes of the string (a suffix array):

abcbcbcabc    suffix 0
bcbcbcabc     suffix 1
cbcbcabc      suffix 2
bcbcabc       suffix 3
cbcabc        suffix 4
bcabc         suffix 5
cabc          suffix 6
abc           suffix 7
bc            suffix 8
c             suffix 9

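Procedure: take the first character of suffix 0, a, and compare it with the first character of suffix 1, b; they differ. Keep comparing against the remaining suffixes, then lengthen the prefix to ab, abc, abcb, and so on. When that pass is finished, start again from suffix 1 and take b, bc, bcb, bcbc, and so on.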

#include <iostream>
#include <string>
#include <utility>
#include <vector>

using namespace std;

// Returns {count, substring} for the most frequent substring of str.
// Ties on count are broken in favor of the longer substring.
pair<int, string> fun(const string &str)
{
    vector<string> substrs;
    const int len = static_cast<int>(str.length());

    string substring;
    int maxcount(0);

    // Build the list of suffixes: substrs[i] is str starting at position i.
    cout << "the string is:" << str << endl;
    cout << "the substrings are as follows:" << endl;
    for (int i = 0; i < len; ++i)
    {
        substrs.push_back(str.substr(i));
        cout << substrs[i] << endl;
    }

    cout << "--------------the answer------------" << endl;

    for (int i = 0; i < len; ++i)
    {
        // A prefix of suffix i can be at most len - i characters long.
        for (int sublen = 1; sublen <= len - i; ++sublen)
        {
            int count = 1;  // the occurrence at position i itself
            for (int k = i + 1; k < len; ++k)
            {
                // Later suffixes only get shorter, so none of them can match.
                if (static_cast<int>(substrs[k].length()) < sublen)
                {
                    break;
                }
                // Compare the first sublen characters of suffix i and suffix k.
                if (substrs[i].compare(0, sublen, substrs[k], 0, sublen) == 0)
                {
                    ++count;
                }
            }
            if (count > maxcount ||
                (count == maxcount && sublen > static_cast<int>(substring.length())))
            {
                maxcount = count;
                substring = substrs[i].substr(0, sublen);
            }
        }
    }

    return make_pair(maxcount, substring);
}

int main()
{
    string str = "ababcababcabcab";
    auto res = fun(str);

    cout << "the max count is:" << res.first << endl;
    cout << "the matched substring is:" << res.second << endl;
    return 0;
}

Output:

the string is:ababcababcabcab
the substrings are as follows:
ababcababcabcab
babcababcabcab
abcababcabcab
bcababcabcab
cababcabcab
ababcabcab
babcabcab
abcabcab
bcabcab
cabcab
abcab
bcab
cab
ab
b
--------------the answer------------
the max count is:6
the matched substring is:ab

