Divide-and-conquer strategy to determine whether more than 1/3 of the elements in a list are the same
【Posted】: 2020-01-07 06:22:18
【Question】:
I am working on a divide-and-conquer algorithm to determine whether more than 1/3 of the elements in a list are the same. For example:
[1,2,3,4]: no, all elements are unique.
[1,1,2,4,5]: yes, two of them are the same (2 out of 5 is more than 1/3).
Is there a divide-and-conquer strategy that does not rely on sorting? I am stuck on how to divide...
def is_valid(ids):
    n = len(ids)
    return is_valid_recur(ids, 0, n - 1)

def is_valid_recur(ids, l, r):
    m = (l + r) // 2
    # unfinished: how should the results of the two halves be combined?
    return .... is_valid_recur(ids, l, m) ... is_valid_recur(ids, m+1, r)
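Thanks a lot!
【Question comments】:
So what is the naive approach: count until any count exceeds len(ids)//2?
How about this algorithm for finding a majority element: users.eecs.northwestern.edu/~dda902/336/hw4-sol.pdf?
Is this homework?
What if two items each make up more than 1/3 of the list: [1,1,1,1,1,2,2,2,2,3]?
@user448810 It should return true as long as at least one element exceeds 1/3.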
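For reference, the naive counting approach raised in the comments could be sketched roughly as follows. It is only a baseline, not a divide-and-conquer solution; the name `has_third_majority` is made up for illustration, and the sketch assumes the list elements are hashable.

```python
from collections import Counter


def has_third_majority(ids):
    """Return True if some value makes up more than 1/3 of ids (hypothetical helper)."""
    n = len(ids)
    if n == 0:
        return False
    # most_common(1) yields [(value, count)] for the most frequent value.
    _, top_count = Counter(ids).most_common(1)[0]
    # Integer comparison avoids floating-point division: count > n/3.
    return 3 * top_count > n


print(has_third_majority([1, 2, 3, 4]))     # False
print(has_third_majority([1, 1, 2, 4, 5]))  # True
```

This takes O(n) time and O(n) extra space, which makes it a handy reference when testing a divide-and-conquer version.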
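【Answer 1】:
You can use a binary search tree (BST).
1. Build a BST in which each node keeps a count of how many times its key occurs.
2. Traverse the tree with divide and conquer to find the maximum key count.
3. Test whether the maximum count is > n/3.
With the data in a BST, divide and conquer is simple, because we only have to determine whether the left branch, the current node, or the right branch has the highest number of repetitions.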
# A utility class representing a BST node
class newNode:
    # Constructor to create a new node
    def __init__(self, data):
        self.key = data
        self.count = 1
        self.left = None
        self.right = None


# A utility function to insert a new node
# with given key in BST
def insert(node, key):
    # If the tree is empty, return a new node
    if node is None:
        return newNode(key)
    # If key already exists in BST, increment
    # count and return
    if key == node.key:
        node.count += 1
        return node
    # Otherwise, recur down the tree
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    # return the (unchanged) node pointer
    return node


# Finds the node with the highest count in a binary search tree
def MaxCount(node):
    if node is None:
        return 0, None
    left = MaxCount(node.left)
    right = MaxCount(node.right)
    current = node.count, node
    return max([left, right, current], key=lambda x: x[0])


def generateBST(a):
    root = None
    for x in a:
        root = insert(root, x)
    return root
# Driver Code
if __name__ == '__main__':
    a = [1, 2, 3, 1, 1]
    root = generateBST(a)
    cnt, node = MaxCount(root)
    # Strictly more than n/3 occurrences (step 3 above)
    if 3 * cnt > len(a):
        print(node.key)  # Prints 1
    else:
        print(None)
A non-divide-and-conquer technique for n/3, taken from https://www.geeksforgeeks.org/n3-repeated-number-array-o1-space/, runs in O(n) time:
# Python 3 program to find if
# any element appears more than
# n/3.
import sys


def appearsNBy3(arr, n):
    count1 = 0
    count2 = 0
    first = sys.maxsize
    second = sys.maxsize
    for i in range(0, n):
        # if this element is
        # previously seen,
        # increment count1.
        if (first == arr[i]):
            count1 += 1
        # if this element is
        # previously seen,
        # increment count2.
        elif (second == arr[i]):
            count2 += 1
        elif (count1 == 0):
            count1 += 1
            first = arr[i]
        elif (count2 == 0):
            count2 += 1
            second = arr[i]
        # if current element is
        # different from both
        # the previously seen
        # variables, decrement
        # both the counts.
        else:
            count1 -= 1
            count2 -= 1
    count1 = 0
    count2 = 0
    # Again traverse the array
    # and find the actual counts.
    for i in range(0, n):
        if (arr[i] == first):
            count1 += 1
        elif (arr[i] == second):
            count2 += 1
    if (count1 > n / 3):
        return first
    if (count2 > n / 3):
        return second
    return -1


# Driver code
arr = [1, 2, 3, 1, 1]
n = len(arr)
print(appearsNBy3(arr, n))
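One of the question comments asks what happens when two values each exceed 1/3. As a rough sketch, the same two-candidate voting idea can be adapted to report every qualifying value (there can be at most two). The function name `elements_above_third` is hypothetical, and this variant tracks empty slots by their counts instead of using a `sys.maxsize` sentinel; it is an illustration, not the code from the linked article.

```python
def elements_above_third(arr):
    """Return all values occurring more than len(arr)/3 times (at most two)."""
    first = second = None
    count1 = count2 = 0
    # Pass 1: keep two candidates; a value that matches neither candidate
    # while both slots are taken cancels one vote from each slot.
    for x in arr:
        if count1 > 0 and x == first:
            count1 += 1
        elif count2 > 0 and x == second:
            count2 += 1
        elif count1 == 0:
            first, count1 = x, 1
        elif count2 == 0:
            second, count2 = x, 1
        else:
            count1 -= 1
            count2 -= 1
    # Pass 2: only candidates with votes left can exceed n/3; verify them.
    candidates = []
    if count1 > 0:
        candidates.append(first)
    if count2 > 0:
        candidates.append(second)
    n = len(arr)
    return [c for c in candidates if 3 * arr.count(c) > n]


print(elements_above_third([1, 1, 1, 1, 1, 2, 2, 2, 2, 3]))  # [1, 2]
print(elements_above_third([2, 2, 1, 3, 3, 1, 4, 4, 1]))     # []
```

A boolean answer for the original question is then simply `bool(elements_above_third(arr))`, which also avoids the ambiguity of returning -1 when -1 could be a list element.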
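【Comments】:
Thanks! I also want the algorithm to determine whether the list has more than 1/3 elements in common. I am not sure how to modify the algorithm..
@chen I didn't see how either, so I revised my post to use a different algorithm.
Thanks.. However, this algorithm does not seem to be divide and conquer. I am trying to modify your previous algorithm..
@chen I modified the majority algorithm to produce one for n/3 instead of n/2. See the latest post.
@DarrylG Try [2, 2, 1, 3,3,1,4,4,1]
【Answer 2】:
This is a draft I tried for fun. It looks like divide and conquer might reduce the number of candidate frequency checks, but I'm not sure (see the last example, where only 0 is checked against the full list).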
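If we divide the list into three parts, the smallest frequency a valid candidate can have is 1/3 of each part. This narrows the list of candidates we need to search for in the other parts. Let f(A, l, r) represent the candidates that could have a frequency of 1/3 or more in their parent group. Then: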
from math import ceil


def f(A, l, r):
    length = r - l + 1
    # Base case: every element of a short range is a candidate
    if length <= 3:
        candidates = A[l:r+1]
        print("l, r, candidates: %s, %s, %s\n" % (l, r, candidates))
        return candidates
    i = 0
    j = 0
    third = length // 3
    lg_third = int(ceil(length / float(3)))
    sm_third = lg_third // 3
    # Pick split points i and j so the range falls into three parts
    if length % 3 == 1:
        i, j = l + third, l + 2 * third
    elif length % 3 == 2:
        i, j = l + third, l + 2 * third + 1
    else:
        i, j = l + third - 1, l + 2 * third - 1
    left_candidates = f(A, l, i)
    middle_candidates = f(A, i + 1, j)
    right_candidates = f(A, j + 1, r)
    print("length: %s, sm_third: %s, lg_third: %s" % (length, sm_third, lg_third))
    print("Candidate parts: %s, %s, %s" % (left_candidates, middle_candidates, right_candidates))
    left_part = A[l:i+1]
    middle_part = A[i+1:j+1]
    right_part = A[j+1:r+1]
    candidates = []
    # Check the candidates that came from the left part
    seen = []
    for e in left_candidates:
        if e in seen or e in candidates:
            continue
        seen.append(e)
        count = left_part.count(e)
        if count >= lg_third:
            candidates.append(e)
        else:
            middle_part_count = middle_part.count(e)
            print("Left: counting %s in middle: %s" % (e, middle_part_count))
            if middle_part_count >= sm_third:
                count = count + middle_part_count
            right_part_count = right_part.count(e)
            print("Left: counting %s in right: %s" % (e, right_part_count))
            if right_part_count >= sm_third:
                count = count + right_part_count
            if count >= lg_third:
                candidates.append(e)
    # Check the candidates that came from the middle part
    seen = []
    for e in middle_candidates:
        if e in seen or e in candidates:
            continue
        seen.append(e)
        count = middle_part.count(e)
        if count >= lg_third:
            candidates.append(e)
        else:
            left_part_count = left_part.count(e)
            print("Middle: counting %s in left: %s" % (e, left_part_count))
            if left_part_count >= sm_third:
                count = count + left_part_count
            right_part_count = right_part.count(e)
            print("Middle: counting %s in right: %s" % (e, right_part_count))
            if right_part_count >= sm_third:
                count = count + right_part_count
            if count >= lg_third:
                candidates.append(e)
    # Check the candidates that came from the right part
    seen = []
    for e in right_candidates:
        if e in seen or e in candidates:
            continue
        seen.append(e)
        count = right_part.count(e)
        if count >= lg_third:
            candidates.append(e)
        else:
            left_part_count = left_part.count(e)
            print("Right: counting %s in left: %s" % (e, left_part_count))
            if left_part_count >= sm_third:
                count = count + left_part_count
            middle_part_count = middle_part.count(e)
            print("Right: counting %s in middle: %s" % (e, middle_part_count))
            if middle_part_count >= sm_third:
                count = count + middle_part_count
            if count >= lg_third:
                candidates.append(e)
    print("l, r, candidates: %s, %s, %s\n" % (l, r, candidates))
    return candidates


#A = [1, 1, 2, 4, 5]
#A = [1, 2, 3, 1, 2, 3, 1, 2, 3]
#A = [1, 1, 1, 1, 1, 2, 2, 2, 2, 3]
A = [2, 2, 1, 3, 3, 1, 4, 4, 1]
#A = [x for x in range(1, 13)] + [0] * 6
print(f(A, 0, len(A) - 1))
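For comparison, a simpler two-way split might also work, sketched below. It is not the draft above: it rests on the observation that a value occurring more than n/3 times in a range must occur more than 1/3 of the time in at least one of its two halves (otherwise its total would be at most n1/3 + n2/3 = n/3). Each half can contribute at most two candidates, so the combine step counts at most four values. The names `third_candidates` and `is_valid`, and the half-open index convention, are assumptions made for this illustration.

```python
def third_candidates(ids, lo, hi):
    """Return values occurring more than (hi - lo)/3 times in ids[lo:hi]."""
    n = hi - lo
    if n == 0:
        return []
    if n == 1:
        return [ids[lo]]
    mid = (lo + hi) // 2
    # Divide: collect candidates from each half, dropping duplicates.
    pool = []
    for v in third_candidates(ids, lo, mid) + third_candidates(ids, mid, hi):
        if v not in pool:
            pool.append(v)
    # Conquer: keep only candidates that exceed 1/3 of the whole range.
    result = []
    for v in pool:
        count = sum(1 for k in range(lo, hi) if ids[k] == v)
        if 3 * count > n:
            result.append(v)
    return result


def is_valid(ids):
    """True if some value makes up more than 1/3 of ids."""
    return bool(third_candidates(ids, 0, len(ids)))


print(is_valid([1, 2, 3, 4]))                 # False
print(is_valid([1, 1, 2, 4, 5]))              # True
print(is_valid([2, 2, 1, 3, 3, 1, 4, 4, 1]))  # False
```

Because each level does a linear amount of counting, the recurrence is T(n) = 2T(n/2) + O(n), i.e. O(n log n) overall, with no sorting and no requirement that elements be hashable or ordered.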
【Comments】: