Possible string permutations of mixture of multiset and set

Posted 2023-03-31

【Title】: Possible string permutations of mixture of multiset and set 【Posted】: 2019-09-09 21:17:43 【Question】:

I'm trying to get all possible combinations of a char*. The string consists of four values: two digits and two different letters. For example:

char *text = "01ab";

My example string should have 48 different combinations, which seems to be correct (I worked them out by hand):

Combinations for values: 0, 1, a, b:

0 0 a b     1 1 a b     a 0 0 b     b 0 0 a
0 0 b a     1 1 b a     a 0 1 b     b 0 1 a
0 1 a b     1 0 a b     a 0 b 0     b 0 a 0
0 1 b a     1 0 b a     a 0 b 1     b 0 a 1
0 a 0 b     1 a 1 b     a 1 b 0     b 1 0 a
0 a 1 b     1 a 0 b     a 1 b 1     b 1 1 a
0 a b 0     1 a b 1     a 1 0 b     b 1 a 0
0 a b 1     1 a b 0     a 1 1 b     b 1 a 1
0 b 0 a     1 b 1 a     a b 0 0     b a 0 0
0 b 1 a     1 b 0 a     a b 0 1     b a 0 1
0 b a 0     1 b a 1     a b 1 0     b a 1 0
0 b a 1     1 b a 0     a b 1 1     b a 1 1
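
One way to double-check the count by hand, assuming digits may repeat and each of the two letters appears exactly once: choose which 2 of the 4 positions hold letters, order the 2 letters, then fill the 2 remaining positions with either digit. The symbols n, m and d in the general form are introduced here for illustration only (n digit positions, m distinct letters, d available digits).

```latex
\binom{4}{2} \cdot 2! \cdot 2^{2} = 6 \cdot 2 \cdot 4 = 48
\qquad\text{and in general}\qquad
\binom{n+m}{m} \cdot m! \cdot d^{\,n}
```

Under the same assumptions, a length-6 string with four digit positions, four available digits and two letters would give 15 · 2 · 256 = 7680 strings.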

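My approach is the same as what I did by hand: get all combinations that start with the first index of text, then all combinations that start with the second index of text, and so on. So something like this: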

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void printPasswordCombinations(void)
{
    const char *all_values = "01ab";
    int len = strlen(all_values);

    /* len + 1 bytes, zero-filled, so the buffer is always null-terminated */
    char *tmp_pwd = calloc(len + 1, sizeof(char));

    for(int i=0 ; i<len ; i++)
    {
        tmp_pwd[0] = all_values[i];

        /* len-1, since the first index is already set. */
        for(int j=0 ; j<len-1 ; j++)
        {

        }
    }

    printf("%s\n", tmp_pwd);
    free(tmp_pwd);
}

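Now I'm a bit confused about how to continue after the first index of the combination. There are several examples for generating all combinations, but my problem seems a bit different, because the digits in a combination may be the same and only the letters must be different.

How can I print all combinations to my console? I have already implemented a function that calculates the number of possible combinations, so assume that part is done.

It would be great if the algorithm worked for any number of digits and letters, so that, for example, all combinations of a text of length 6 with four different digits and two different letters could also be computed.

The language doesn't matter; any suggestion is welcome.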

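【Question discussion】:

The first problem with your question is in the title. These are not combinations of "char pointers", they are permutations of a string. Once you know the search keyword, you will find a solution with Google.

@AnttiHaapala, this question is not a duplicate of the linked question. The OP here is dealing with permutations of a multiset.

@JosephWood Hmm, right, the question was unclear. Trying to find a duplicate.

@MoritzSchmidt - permutations of a mixture of multiset and set.

@AnttiHaapala, the question title is clear now, and it is not a duplicate of the linked question. Will you reopen the question?

【Answer 1】: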

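Your problem can be solved with a backtracking strategy. It will create all possible combinations.

I understand that you want to remove the duplicate combinations that appear when two digits are the same. To get rid of them, you can store the generated combinations in a hash table: each time a new combination is generated, look it up in the hash table to check whether it has already been produced (if not, add it to the hash table and print it; otherwise skip it). My pseudocode is below (you may find a better way):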

val characters = [/*4 characters*/]
val picked = [false,false,false,false]
val hashtable = empty

function generate(string aCombin):
    if aCombin.size == 4:
         if(hashtable.contains(aCombin)):
               //do nothing
         else:
               print(aCombin)
               hashtable.add(aCombin)
         return //the combination is complete, nothing left to add
    for i in 0 until characters.size:
         if(picked[i]==false):
             picked[i]=true
             aCombin.add(characters[i])
             generate(aCombin)
             picked[i]=false //backtrack
             aCombin.popBack() //remove the last character
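
The pseudocode above could be sketched in JavaScript roughly as follows. This is a minimal sketch, not the answerer's own code: the function name `uniquePermutations` is made up for illustration, and a `Set` stands in for the hash table.

```javascript
'use strict';

// Hypothetical helper: returns every distinct permutation of `chars`,
// using a picked[] array for backtracking and a Set to drop duplicates.
function uniquePermutations(chars) {
    const picked = new Array(chars.length).fill(false);
    const seen = new Set();   // plays the role of the hash table
    const result = [];

    const generate = (current) => {
        if (current.length === chars.length) {
            const s = current.join('');
            if (!seen.has(s)) {   // keep a string only the first time it appears
                seen.add(s);
                result.push(s);
            }
            return;
        }
        for (let i = 0; i < chars.length; i++) {
            if (picked[i]) continue;
            picked[i] = true;
            current.push(chars[i]);
            generate(current);
            current.pop();        // backtrack: remove the last character
            picked[i] = false;
        }
    };

    generate([]);
    return result;
}

console.log(uniquePermutations('00ab').length); // prints 12
```

Note that this only permutes the characters it is given. To cover all 48 strings from the question, it would have to be run once per digit multiset ('00ab', '01ab', '11ab'), which gives 12 + 24 + 12 = 48.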

【Discussion】:

【Answer 2】:

I used JavaScript because it runs in the browser and the language doesn't matter. The method below uses recursion. Try it with '0123ab'.

'use strict';

const input = '01ab';

const reLetters = /[^0-9]/g;
const reDigits = /[0-9]/g;
const nLetters = input.replace(reDigits, '').length;
const nDigits = input.replace(reLetters, '').length;

const findComb = cur => {
    if (cur.length === input.length)
        return console.log(cur);
    for (let l of input) {
        if (l.match(reDigits)) {
            if (cur.replace(reLetters, '').length === nDigits) continue;
        } else {
            if (cur.match(l) || cur.replace(reDigits, '').length === nLetters) continue;
        }
        findComb(cur + l);
    }
}

findComb('');

Here is a version without the "remove letters to count digits" step. It is about 20% more efficient. I measured it with nodejs, using '01234abc' as input.

'use strict';

const input = '01ab';

const reLetters = /[^0-9]/g;
const reDigits = /[0-9]/g;
const maxLetters = input.replace(reDigits, '').length;
const maxDigits = input.replace(reLetters, '').length;

const findComb = (cur = '', nDigits = 0, nLetters = 0) => {
    if (cur.length === input.length)
        return console.log(cur);
    for (let l of input) {
        if (l.match(reDigits)) {
            if (nDigits < maxDigits)
                findComb(cur + l, nDigits + 1, nLetters);
        } else {
            if (cur.match(l)) continue;
            if (nLetters < maxLetters)
                findComb(cur + l, nDigits, nLetters + 1);
        }
    }
}

findComb();

Here is one without recursion. It is the slowest of the three, but it can be improved.

'use strict';

const input = '01ab';

const reLetters = /[^0-9]/g;
const reDigits = /[0-9]/g;
const nLetters = input.replace(reDigits, '').length;
const nDigits = input.replace(reLetters, '').length;

let cur = '', l = undefined;
do {
    l = input[input.indexOf(l) + 1];
    if (l !== undefined) {
        if (l.match(reDigits)) {
            if (cur.replace(reLetters, '').length === nDigits) continue;
        } else {
            if (cur.match(l) ||
                cur.replace(reDigits, '').length === nLetters) continue;
        }
        if (cur.length + 1 === input.length) {
            console.log(cur + l);
        } else {
            cur = cur + l;
            l = undefined;
        }
    } else {
        l = cur[cur.length - 1];
        cur = cur.slice(0, -1);
    }
} while (cur != '' || l != undefined);

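【Discussion】:

I accepted your answer because I could test it easily and because the code is short. Thanks for your effort. By the way, do you know what this kind of combination is called?

@MoritzSchmidt Sorry, I don't know. On another topic, I'll add another, more optimized piece of code. The current "remove letters to count digits" is a silly idea and it bothers me.

If I wanted to solve this with a larger alphabet, my stack would explode.

@MoritzSchmidt You would need a very long string to blow the stack, and every recursion can be written as a loop, it just takes more code... I'll post an interesting one.

【Answer 3】:

The recursive approach is the simplest one here. Suppose you want to generate all strings with m letters, all distinct, taken from a letters[m] array, and n digits, which may repeat, taken from a numbers[N] array (n can be smaller than, equal to, or larger than N; it doesn't matter). You can solve it like this (pseudocode, C-style):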

#include <stdbool.h>
#include <stdio.h>

/* N is the size of the numbers array; it is passed explicitly here. */
void print_them_all(const char *numbers, int N, int nb_numbers_in_result, int n,
                    const char *letters, bool *is_letter_used, int nb_letters_in_result, int m,
                    char *current_string)
{
    if ((nb_numbers_in_result == n) && (nb_letters_in_result == m)) {
        // terminal case -> time to print the current string
        printf("%s\n", current_string);
    } else {
        // string not completely built yet
        // get the index where the next char will be added
        int current_index = nb_letters_in_result + nb_numbers_in_result;
        if (nb_numbers_in_result < n) { // still possible to add a number
            for (int i = 0; i < N; i++) {
                current_string[current_index] = numbers[i];
                print_them_all(numbers, N, nb_numbers_in_result + 1, n,
                               letters, is_letter_used, nb_letters_in_result, m,
                               current_string);
            }
        }
        if (nb_letters_in_result < m) { // still possible to add a letter
            for (int i = 0; i < m; i++) {
                if (is_letter_used[i] == false) { // check the letter has not been added yet
                    // keep track that the letter has been added by 'marking' it
                    is_letter_used[i] = true;
                    // add it at the current position
                    current_string[current_index] = letters[i];
                    // recursive call
                    print_them_all(numbers, N, nb_numbers_in_result, n,
                                   letters, is_letter_used, nb_letters_in_result + 1, m,
                                   current_string);
                    // now 'unmark' the letter
                    is_letter_used[i] = false;
                }
            }
        }
    }
}

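For this kind of problem, a recursive approach is the natural choice. It works as follows: if I already have a string with k digits, k<n, then I can append any digit to it and continue (my string will now contain k+1 digits). If I already have a string with k letters, k<m, then I can append any letter that has not been added yet (the boolean array makes sure of that) and continue. If my string is complete, I print it.

The first call should use a boolean array initialized to false, and 0 for both nb_letters_in_result and nb_numbers_in_result, since you have not added any digit or letter to the result string yet. As for the result string, since you are coding in C, don't forget to allocate memory for it: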

char *current_string = malloc((m+n+1) * sizeof(char));

and null-terminate it:

current_string[m+n] = '\0';
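
For readers who want to try the recursion in a browser or in Node, here is a minimal JavaScript sketch of the same idea. It is an illustration under stated assumptions rather than a port of the answer: the name `generateAll` and its parameters are hypothetical, and the results are collected into an array instead of being printed.

```javascript
'use strict';

// Hypothetical helper mirroring the recursion described above:
// characters of `digits` may repeat, each of the `letters` is used exactly once,
// and `n` is how many digit positions the result contains.
function generateAll(digits, n, letters) {
    const m = letters.length;
    const used = new Array(m).fill(false);
    const current = new Array(n + m);
    const out = [];

    const recurse = (nbDigits, nbLetters) => {
        if (nbDigits === n && nbLetters === m) {
            out.push(current.join(''));      // terminal case: string is complete
            return;
        }
        const index = nbDigits + nbLetters;  // next position to fill
        if (nbDigits < n) {                  // still possible to add a digit
            for (const d of digits) {
                current[index] = d;
                recurse(nbDigits + 1, nbLetters);
            }
        }
        if (nbLetters < m) {                 // still possible to add a letter
            for (let i = 0; i < m; i++) {
                if (used[i]) continue;       // this letter is already in the string
                used[i] = true;
                current[index] = letters[i];
                recurse(nbDigits, nbLetters + 1);
                used[i] = false;             // unmark on the way back
            }
        }
    };

    recurse(0, 0);
    return out;
}

console.log(generateAll('01', 2, 'ab').length);   // prints 48
console.log(generateAll('0123', 4, 'ab').length); // prints 7680
```

The second call corresponds to the length-6 case from the question (four digit positions, two letters).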

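【Discussion】:

【Answer 4】:

I also found an interesting solution to my problem. Take my example string 01ab.

First we create all combinations of the digits 01 and all permutations of the letters ab. There are plenty of examples showing how to do that.

So now we have all combinations of 01 and of ab. I'll call them producer combinations.

Now we want to merge every digit sequence with every letter sequence, subject to one rule:

The order of the digits and the order of the letters must be preserved within each combination.

So if we combine 10 with ab, we get: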

10ab
1a0b
a10b

Now we move b to the left until it is about to swap places with a, which my rule forbids. We do this for each combination:

10ab produces:

10ab

because b is already next to a.

1a0b produces:

1ab0

So we get one more combination.

a10b produces:

a1b0
ab10

So we get 2 more combinations.

Now we have all possible combinations of 10 and ab:

10ab
1a0b
a10b
1ab0
a1b0
ab10

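Since there are 8 producer combinations, we have to perform this step 8 times, once for each of them. Each step always yields 6 combinations, as in my example, which gives 48 combinations in total, as I calculated in my question.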
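
The merging step described above amounts to enumerating every interleaving of a digit sequence and a letter sequence that keeps the internal order of both. A minimal JavaScript sketch of that reading follows. All function names are hypothetical, and it assumes the 8 producer combinations are the digit pairs 00, 01, 10, 11, each paired with ab or ba.

```javascript
'use strict';

// All order-preserving interleavings of two strings
// (the relative order inside `a` and inside `b` never changes).
function interleavings(a, b) {
    if (a.length === 0) return [b];
    if (b.length === 0) return [a];
    const withA = interleavings(a.slice(1), b).map(rest => a[0] + rest);
    const withB = interleavings(a, b.slice(1)).map(rest => b[0] + rest);
    return withA.concat(withB);
}

// Digit sequences of the given length, repetition allowed.
function digitSequences(digits, length) {
    if (length === 0) return [''];
    const shorter = digitSequences(digits, length - 1);
    return [...digits].flatMap(d => shorter.map(rest => d + rest));
}

// Permutations of a string of distinct letters.
function permutations(letters) {
    if (letters.length <= 1) return [letters];
    return [...letters].flatMap((ch, i) =>
        permutations(letters.slice(0, i) + letters.slice(i + 1)).map(rest => ch + rest));
}

// Merge every digit sequence with every letter permutation.
function allCombinations(digits, letters) {
    const result = [];
    for (const d of digitSequences(digits, digits.length)) {
        for (const p of permutations(letters)) {
            result.push(...interleavings(d, p));
        }
    }
    return result;
}

console.log(interleavings('10', 'ab'));           // the 6 strings listed above, in a different order
console.log(allCombinations('01', 'ab').length);  // prints 48
```

Because each result determines its digit subsequence and its letter subsequence, no string is produced twice, so no deduplication step is needed in this sketch.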

【Discussion】:

Does this algorithm have a name?
