Complexity of the "Search words/strings in Matrix of Char" Algorithm

Posted 2023-02-23


【Title】: Complexity of the "Search words/strings in Matrix of Char" Algorithm 【Posted】: 2015-10-07 09:01:00 【Question】:

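My task is to search a grid of letters (20×20 <= MxN <= 1000×1000) for words (5 <= length <= 100) from a list. Any word hidden in the grid always appears as a zigzag of segments, each of length 2 or 3. Zigzag segments can only run from left to right or from bottom to top.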

The required complexity is the product of the number of letters in the grid and the number of letters in the list.

For the grid:

••••••••••••••••••••
••••••••ate•••••x•••
•••••••er•••••••e•••
••••••it••••••••v•••
••••ell••••••a••f•••
•••at••••e••••••rbg•
•••s•••••••ga•••••••

and the word list "forward", "iterate", "phone", "satellite", the output will be

3,6,iterate
6,3,satellite

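I did this in C++: I store all prefixes and words in an unordered_map<string, int>, where the key is the prefix/word and the value is 1 for a prefix and 2 for a word. Now I do something like this (pseudocode):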

for (char c in grid)
    check(c + "");

where:

check(string s)
    if s is a key in unordered_map
        if (value[s] == 2) // it's a word
            print s; // and its position
        if (not up 3 times consecutively) // limit segment length to <= 3
            check(s + next_up_char_from_grid);
        if (not right 3 times consecutively)
            check(s + next_right_char_from_grid);
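The prefix table described above could be built roughly as follows. This is only a sketch: the helper name buildPrefixMap is hypothetical, and the encoding (1 for a proper prefix, 2 for a complete word) is taken from the description in the question.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps every proper prefix to 1 and every full word to 2.
std::unordered_map<std::string, int> buildPrefixMap(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> table;
    for (const std::string& word : words) {
        for (std::size_t len = 1; len < word.size(); ++len) {
            // emplace does not overwrite, so a prefix that is also a full word keeps its 2.
            table.emplace(word.substr(0, len), 1);
        }
        if (!word.empty()) {
            table[word] = 2;  // a full word always wins over a prefix marker
        }
    }
    return table;
}
```

Building the table costs time and memory proportional to the sum of squared word lengths, since each prefix is stored as its own string. A trie would avoid that, but the map matches what the question describes.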

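This implementation works very well for random characters in the grid and dictionary words, but its complexity is C ≃ O(M * N * 2^K) > O(M * N * R). Because segment length is limited, a better approximation is C ≃ O(M * N * 1.6^K).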

M * N = number of chars in grid
K = the maximum length of any word from list (5 <= K <= 100)
R = number of chars in list of words
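One way to see where a base of roughly 1.6 could come from is to count the up/right move sequences the recursion may explore. The sketch below assumes that a segment of 2 or 3 letters means at most 2 consecutive steps in the same direction; the question does not spell this out, so treat that reading and the function name countMoveSequences as assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Counts sequences of `steps` moves (each "up" or "right") in which no direction
// is repeated more than twice in a row. Assumption: this is the segment limit.
// Note: the result overflows 64 bits for roughly steps > 90.
std::uint64_t countMoveSequences(int steps) {
    assert(steps >= 0);
    if (steps == 0) return 1;  // the empty sequence
    std::uint64_t run1 = 2;    // sequences whose last run has length 1
    std::uint64_t run2 = 0;    // sequences whose last run has length 2
    for (int i = 2; i <= steps; ++i) {
        const std::uint64_t next1 = run1 + run2;  // switch direction
        const std::uint64_t next2 = run1;         // repeat the last direction once
        run1 = next1;
        run2 = next2;
    }
    return run1 + run2;
}
```

For example, countMoveSequences(3) returns 6: the 8 possible sequences minus "up, up, up" and "right, right, right". The counts follow a Fibonacci-style recurrence, so the ratio between consecutive values approaches the golden ratio (about 1.618), which is consistent with the 1.6^K estimate.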

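The worst case is the maximum grid size, the maximum word length, and the same single character throughout both the grid and the words. How can the required complexity be achieved? Is it only possible under the given restrictions?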

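【Comments】:

What is K in your complexity?

K = the maximum length of any word from the list (5 <= K <= 100). For "forward", "iterate", "phone", "satellite", K = strlen("satellite") = 9.

I guess you need an algorithm for solving word search puzzles?

【Answer 1】: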

Your check() function does a lot of duplicate work.

For the grid

•aa
ab•
aa•

and the word 'aabaa'

there are two ways to get 'aabaa', and they are identical after the letter 'b':

(up, right, up, right) or (right, up, up, right)

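Based on this observation, we use an array a[position][n][m] to record whether the prefix of a given word ending at character index position can be reached at grid cell (row n, column m), with indices as in the example below.

For the previous example, the entries are filled in this order: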

a[0][2][0] = true
a[1][1][0] = a[1][2][1] = true
a[2][1][1] = true
a[3][0][1] = true
a[4][0][2] = true

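so 'aabaa' can be found in the grid.

Each word fills at most K layers of N*M cells, and each cell is computed in constant time from its lower and left neighbours, so the complexity is N*M*K*S,

where S is the number of words in the list.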
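A minimal C++ sketch of this table-filling idea for a single word is shown below. The function name canFindWord is hypothetical, only two layers of the table are kept instead of the full a[position][n][m] array, and the limit on segment length from the question is not enforced here; handling it would presumably mean extending the state with the last direction and the current run length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns true if `word` can be traced in `grid` using only "up" and "right" steps.
// Assumption: the segment-length limit is ignored in this sketch.
bool canFindWord(const std::vector<std::string>& grid, const std::string& word) {
    if (grid.empty() || grid[0].empty() || word.empty()) return false;
    const std::size_t rows = grid.size(), cols = grid[0].size();
    for (const std::string& row : grid) assert(row.size() == cols);

    using Layer = std::vector<std::vector<char>>;
    Layer prev(rows, std::vector<char>(cols, 0));  // layer for position - 1
    Layer cur = prev;                              // layer for position

    // Position 0: every cell holding the first letter is a valid start.
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            prev[r][c] = (grid[r][c] == word[0]);

    for (std::size_t pos = 1; pos < word.size(); ++pos) {
        for (std::size_t r = 0; r < rows; ++r) {
            for (std::size_t c = 0; c < cols; ++c) {
                // An "up" step arrives from the row below; a "right" step from the left.
                const bool fromBelow = r + 1 < rows && prev[r + 1][c];
                const bool fromLeft = c > 0 && prev[r][c - 1];
                cur[r][c] = grid[r][c] == word[pos] && (fromBelow || fromLeft);
            }
        }
        std::swap(prev, cur);  // cur is fully overwritten on the next pass
    }

    for (const auto& row : prev)
        for (char reachable : row)
            if (reachable) return true;
    return false;
}
```

With the 3×3 grid from the example written as {".aa", "ab.", "aa."} (using '.' for empty cells), canFindWord(grid, "aabaa") returns true. Each call does O(N*M*K) work, so running it once per word gives the N*M*K*S bound.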
