Trying to understand the time complexity of this dynamic recursive subset sum

Posted 2023-03-25

【Title】: Trying to understand the time complexity of this dynamic recursive subset sum 【Posted】: 2022-01-07 16:09:39 【Question】:

# Returns true if there exists a subsequence of `A[0…n]` with the given sum
def subsetSum(A, n, k, lookup):
 
    # return true if the sum becomes 0 (subset found)
    if k == 0:
        return True
 
    # base case: no items left, or sum becomes negative
    if n < 0 or k < 0:
        return False
 
    # construct a unique key from dynamic elements of the input
    key = (n, k)
 
    # if the subproblem is seen for the first time, solve it and
    # store its result in a dictionary
    if key not in lookup:
 
        # Case 1. Include the current item `A[n]` in the subset and recur
        # for the remaining items `n-1` with the decreased total `k-A[n]`
        include = subsetSum(A, n - 1, k - A[n], lookup)
 
        # Case 2. Exclude the current item `A[n]` from the subset and recur for
        # the remaining items `n-1`
        exclude = subsetSum(A, n - 1, k, lookup)
 
        # assign true if we get subset by including or excluding the current item
        lookup[key] = include or exclude
 
    # return solution to the current subproblem
    return lookup[key]
 
 
if __name__ == '__main__':
 
    # Input: a set of items and a sum
    A = [7, 3, 2, 5, 8]
    k = 14
 
    # create a dictionary to store solutions to subproblems
    lookup = {}
 
    if subsetSum(A, len(A) - 1, k, lookup):
        print('Subsequence with the given sum exists')
    else:
        print('Subsequence with the given sum does not exist')

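This algorithm is said to have complexity O(n * sum), but I don't understand how or why. Can someone help me? A wordy explanation or a recurrence relation, anything is fine.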

【Comments】:

Yikes. It is definitely O(n * k), but I'm not sure how to prove it. Good question!

【Answer 1】:

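The simplest explanation I can give is that when lookup[(n, k)] has a value, it is True or False and indicates whether some subset of A[:n+1] sums to k.

Imagine a naive algorithm that simply fills in all the elements of lookup row by row. Here total is the target sum (k in the question's code).

lookup[(0, i)] (for 0 ≤ i ≤ total) is true for only two elements, i = A[0] and i = 0; all other elements are false.

lookup[(1, i)] (for 0 ≤ i ≤ total) is true if lookup[(0, i)] is true, or if i ≥ A[1] and lookup[(0, i - A[1])] is true. I can reach the sum i either by using A[1] or by not using it, and I have already computed both cases.

... lookup[(r, i)] (for 0 ≤ i ≤ total) is true if lookup[(r - 1, i)] is true, or if i ≥ A[r] and lookup[(r - 1, i - A[r])] is true.

Filling the table this way, it is clear that we can completely fill the lookup table for rows 0 ≤ row < len(A) in time proportional to len(A) * total, since each element takes constant time to fill. Our final answer is then just a check of whether the entry (len(A) - 1, total) in the table is True.

Your program is doing exactly the same thing, but it computes the values of the lookup entries on demand.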
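To make the row-by-row idea concrete, here is a minimal sketch of that naive table fill. It is not code from the question: the name subset_sum_table is made up for illustration, and it assumes every element of A is a non-negative integer.

```python
def subset_sum_table(A, total):
    """Hypothetical bottom-up version: fill every lookup[(r, i)] row by row."""
    lookup = {}
    for r in range(len(A)):
        for i in range(total + 1):
            if r == 0:
                # Row 0: only the empty subset (sum 0) or {A[0]} is reachable.
                lookup[(r, i)] = (i == 0 or i == A[0])
            else:
                # Either skip A[r], or use it when it fits into i.
                skip = lookup[(r - 1, i)]
                use = i >= A[r] and lookup[(r - 1, i - A[r])]
                lookup[(r, i)] = skip or use
    return lookup


A = [7, 3, 2, 5, 8]
total = 14
table = subset_sum_table(A, total)
print(table[(len(A) - 1, total)])  # True, since 7 + 2 + 5 = 14
print(len(table))                  # 75 entries = len(A) * (total + 1)
```

Each entry is computed with a constant number of dictionary reads, so the work is proportional to the number of entries, which is O(len(A) * total).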

【Discussion】:

【Answer 2】:

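Sorry for submitting two answers. I think I have come up with a slightly simpler explanation.

Imagine your code moved the three lines inside if key not in lookup: into a separate function calculateLookup(A, n, k, lookup). I will define "the cost of calling calculateLookup for n and k" as the total time spent in the call calculateLookup(A, n, k, lookup) for those specific values of n and k, excluding any recursive calls to calculateLookup.

The key insight is that, defined this way, the cost of calling calculateLookup() for any n and k is O(1). Since we exclude recursive calls from the cost and there are no for loops, the cost of calculateLookup is just the cost of executing a few tests.

The overall algorithm does a fixed amount of work, calls calculateLookup, and then does a small amount of work. So asking how much time our code spends is the same as asking how many times we call calculateLookup.

Now we are back to the previous answer. Because of the lookup table, every call to calculateLookup is made with a different (n, k) value. We also know that the bounds on n and k are checked before each call to calculateLookup, so 1 ≤ k ≤ sum and 0 ≤ n < len(A). So calculateLookup is called at most (len(A) * sum) times.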
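One way to check this counting argument empirically is to instrument the question's function. The sketch below is an assumption-laden illustration, not the original code: subset_sum_counted and the counter list are invented names, the "calculateLookup" body is inlined rather than split out, and A is assumed to contain only positive integers.

```python
def subset_sum_counted(A, n, k, lookup, counter):
    """Same logic as the question's subsetSum, plus a call counter."""
    if k == 0:
        return True
    if n < 0 or k < 0:
        return False
    key = (n, k)
    if key not in lookup:
        # This branch plays the role of calculateLookup: it runs at most
        # once per distinct (n, k), and does O(1) work besides recursion.
        counter[0] += 1
        include = subset_sum_counted(A, n - 1, k - A[n], lookup, counter)
        exclude = subset_sum_counted(A, n - 1, k, lookup, counter)
        lookup[key] = include or exclude
    return lookup[key]


A = [7, 3, 2, 5, 8]
k = 14
lookup, counter = {}, [0]
found = subset_sum_counted(A, len(A) - 1, k, lookup, counter)
print(found)                                  # True, since 7 + 2 + 5 = 14
print(counter[0] == len(lookup))              # one computation per cached key
print(counter[0] <= len(A) * k)               # never more than len(A) * sum
```

The counter always equals the number of keys in lookup, and every key satisfies 0 ≤ n < len(A) and 1 ≤ k ≤ sum, which is where the len(A) * sum bound comes from.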

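In general, for algorithms that use memoization/caching, the easiest approach is to work out two things separately and then add them up:

How long it takes assuming all the values you need are already cached.
How long it takes to fill the cache.

The algorithm you presented just fills the lookup cache. It does so in an unusual order, and it does not fill every entry in the table, but that is all it does.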

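The code would be slightly faster with

lookup[key] = subsetSum(A, n - 1, k - A[n], lookup) or subsetSum(A, n - 1, k, lookup)

Because or short-circuits, the second call is skipped whenever the first one returns True. This does not change the O() of the code in the worst case, but it can avoid some unnecessary computation.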
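As a rough illustration of that last point, the sketch below compares how many cache entries each variant ends up computing. The function names subset_sum_eager and subset_sum_short are hypothetical, and the comparison assumes positive integers in A.

```python
def subset_sum_eager(A, n, k, lookup):
    """The question's version: always evaluates both branches."""
    if k == 0:
        return True
    if n < 0 or k < 0:
        return False
    key = (n, k)
    if key not in lookup:
        include = subset_sum_eager(A, n - 1, k - A[n], lookup)
        exclude = subset_sum_eager(A, n - 1, k, lookup)
        lookup[key] = include or exclude
    return lookup[key]


def subset_sum_short(A, n, k, lookup):
    """Short-circuit version: skips the exclude branch if include is True."""
    if k == 0:
        return True
    if n < 0 or k < 0:
        return False
    key = (n, k)
    if key not in lookup:
        lookup[key] = (subset_sum_short(A, n - 1, k - A[n], lookup)
                       or subset_sum_short(A, n - 1, k, lookup))
    return lookup[key]


A = [7, 3, 2, 5, 8]
k = 14
eager_cache, short_cache = {}, {}
eager_result = subset_sum_eager(A, len(A) - 1, k, eager_cache)
short_result = subset_sum_short(A, len(A) - 1, k, short_cache)
print(eager_result == short_result)          # both variants agree
print(len(short_cache) <= len(eager_cache))  # short-circuit never fills more
```

The short-circuit variant only visits subproblems that the eager one would also visit, so it can never fill more entries; when no subset exists, both fill the same entries, which is why the worst case is unchanged.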

【Discussion】:

Never heard back from the OP. Does this answer your question?
