How can I divide an array into K sub-arrays such that the sum of the number of duplicate elements in all the sub-arrays is minimum?

Posted 2023-02-14


【Title】: How can I divide an array into K sub-arrays such that the sum of the number of duplicate elements in all the sub-arrays is minimum? 【Posted】: 2020-11-29 10:23:29 【Question】:

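For example, let the array be A = [1, 1, 2, 1, 2] and K = 3. A can be divided into [1], [1, 2] and [1, 2]. The numbers of duplicate elements in these sub-arrays are 0, 0 and 0, so their sum is 0.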
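To make the cost being minimised concrete, here is a minimal sketch of how a given split could be scored. It assumes that "duplicates in a sub-array" means the sub-array's length minus its number of distinct values; the question does not define this precisely, and the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class DuplicateCost {
    // Assumption: duplicates in one sub-array = size minus number of distinct values.
    static int duplicates(List<Integer> subArray) {
        return subArray.size() - new HashSet<>(subArray).size();
    }

    // Total cost of a split: sum of the duplicates over all sub-arrays.
    static int totalDuplicates(List<List<Integer>> split) {
        int total = 0;
        for (List<Integer> subArray : split) {
            total += duplicates(subArray);
        }
        return total;
    }

    public static void main(String[] args) {
        // The split from the question: [1], [1, 2], [1, 2]
        List<List<Integer>> split = Arrays.asList(
            Arrays.asList(1), Arrays.asList(1, 2), Arrays.asList(1, 2));
        System.out.println(totalDuplicates(split)); // prints 0
    }
}
```

Under this definition, leaving the whole array as a single sub-array would cost 3, since it has five elements but only two distinct values.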

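【Comments】:

This is not a homework problem. The actual problem is different; I arrived at this question as a sub-problem while thinking about how to solve it, so what I am asking here is an intermediate step towards the real problem.

【Answer 1】: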

This is a pretty interesting challenge. I have used Java to illustrate my approach.

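Breaking the problem down

I have split the whole problem into smaller parts:

10 elements

k = 3

3, 3, 4

1 + 2 - Dividing the array into equal parts

I have already made this example with an array of length 10 and k = 3. Because of the remainder of the division, the sub-arrays will have lengths 3, 3 and 4.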

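In the snippet I make sure to fill the arrays with 0, and each sub-array gets 0 to 1 extra elements. If there is a remainder, the extra elements are spread across the sub-arrays.

In my example I have used an array of length 13 and k = 3, so it looks like this: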

[[0, 0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

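3 - Strategy to reduce duplicates

I was thinking that we could start by analysing the given array. By counting, we can find out how many times each individual number occurs. Once we know these counts, we can sort the map by value and end up with a map that holds the number of occurrences of each number, starting with the highest count.

In my example: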

1=4 // contains 1 exactly 4 times
2=4 // contains 2 exactly 4 times
3=3 // ...
4=1
5=1

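What do we get from this? Well, we know for sure that we do not want all of these 1s in the same sub-array, so the idea is to spread all occurrences evenly over all sub-arrays. If we end up with 4x 1 and 4x 2 and k = 3 (as in the example above), we can put one 1 and one 2 into every sub-array. That leaves us with one duplicate of each (one additional 1 and one additional 2).

In my example this looks like: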

[[1, 2, 3, 4, 5], [1, 2, 3, 0], [1, 2, 3, 0]]
// 1 used 3 times => 1 duplicate
// 2 used 3 times => 1 duplicate
// 3 used 3 times => ok
// 4 used 1 time  => ok
// 5 used 1 time  => ok
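The counting argument above can be turned into a quick lower bound. This sketch assumes, as this answer does, that elements may be placed into any sub-array (not necessarily contiguously) and that a value occurring c times therefore causes at least max(0, c - k) duplicates when spread over k sub-arrays. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateLowerBound {
    // A value with c occurrences can be "first" in at most k sub-arrays,
    // so at least max(0, c - k) of its occurrences must be duplicates.
    static int minDuplicates(List<Integer> values, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        int total = 0;
        for (int c : counts.values()) {
            total += Math.max(0, c - k);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 1, 3, 4, 3, 5, 1, 2, 1, 2, 2);
        System.out.println(minDuplicates(list, 3)); // prints 2
    }
}
```

For the example list this gives 2, which matches the one extra 1 and one extra 2 noted in the comments above. The sketch only computes a bound; it does not check whether the sub-array sizes allow that bound to be reached.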

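To do this, we loop over the occurrence map, add the keys, and keep track of the remaining numbers we can still use (in the snippet this is done with the usage map).

We can do this with every key until only duplicates are left. At that point the sub-arrays contain only unique numbers. For the duplicates we can then repeat the whole process and distribute them evenly over the sub-arrays that are not completely filled yet.

In the end it looks like this: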

// the duplicate 1 got placed in the second subarray
// the duplicate 2 got placed in the third subarray
[[1, 2, 3, 4, 5], [1, 2, 3, 1], [1, 2, 3, 2]]

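Java code

I am not sure how far you can take this or how well it performs. At least in the few tests I ran, it seemed to work fine. You may find a more performant solution, but I can imagine that this is one way to solve the problem.

Anyway, here is my attempt: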

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Main {

    public static void main(String[] args) {
        final List<Integer> list = Arrays.asList(1, 2, 3, 1, 3, 4, 3, 5, 1, 2, 1, 2, 2);
        final Map<Integer, Integer> occurrenceMap = findOccurrences(list);
        // Collect into a LinkedHashMap so that the descending-by-count order is kept
        // when the map is iterated later on.
        final Map<Integer, Integer> occurrenceMapSorted = occurrenceMap.entrySet().stream()
            .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                (a, b) -> a, LinkedHashMap::new));
        occurrenceMapSorted.entrySet().forEach(System.out::println);

        final List<List<Integer>> sublists = setupSublists(list.size(), 3);
        System.out.println(sublists);

        // Tracks how many occurrences of each number have already been placed.
        final Map<Integer, Integer> usageMap = new HashMap<>(occurrenceMapSorted.size());

        for (final List<Integer> sublist : sublists) {
            populateSublist(occurrenceMapSorted, usageMap, sublist);
        }

        System.out.println(sublists);
    }

    // Fills the 0 placeholders of one sublist, preferring numbers it does not contain yet.
    public static void populateSublist(Map<Integer, Integer> occurrenceMapSorted,
                                       Map<Integer, Integer> usageMap,
                                       List<Integer> sublist) {
        int i = 0;
        int skip = 0;
        while (i < sublist.size() && sublist.get(i) == 0) {
            for (Map.Entry<Integer, Integer> entry : occurrenceMapSorted.entrySet()) {
                // Skip the entries already tried for this sublist so that it
                // receives distinct keys first.
                if (skip > 0) {
                    skip--;
                    continue;
                }
                final int entryKey = entry.getKey();
                final int usageCount = usageMap.getOrDefault(entryKey, 0);
                if (usageCount < entry.getValue()) {
                    usageMap.put(entryKey, usageCount + 1);
                    sublist.set(i, entryKey);

                    // Debug output showing the progress of the distribution.
                    System.out.println("i: " + i);
                    System.out.println("sublist: " + sublist);
                    System.out.println("usage: ");
                    usageMap.entrySet().stream()
                        .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
                        .forEach(System.out::println);
                    System.out.println();

                    i++;
                    skip = i;
                    break;
                }
            }
            // If nothing was placed, skip has dropped to 0 and the next pass
            // starts again from the first entry, which places the duplicates.
        }
    }

    // Creates the sublists, filled with 0 as a placeholder for "empty".
    public static List<List<Integer>> setupSublists(int listLength, int numberOfSublists) {
        if (numberOfSublists <= 1 || numberOfSublists > listLength) {
            throw new IllegalArgumentException(
                "The number of sublists must be greater than 1 and at most the number of elements in the list.");
        }
        final List<List<Integer>> result = new ArrayList<>(numberOfSublists);
        final int minElementCount = listLength / numberOfSublists;
        final int remainder = listLength % numberOfSublists;
        for (int i = 0; i < numberOfSublists; i++) {
            // The first `remainder` sublists get one extra slot.
            final int size = minElementCount + (i < remainder ? 1 : 0);
            result.add(new ArrayList<>(Collections.nCopies(size, 0)));
        }
        return result;
    }

    // Counts how many times each number occurs in the list.
    public static Map<Integer, Integer> findOccurrences(List<Integer> list) {
        final Map<Integer, Integer> result = new HashMap<>();
        for (final int listElement : list) {
            result.merge(listElement, 1, Integer::sum);
        }
        return result;
    }
}

【Answer 2】:

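Let dp[i][k] represent the best split into k sub-arrays up to the ith index. If A[i] does not appear in the last sub-array we just chose, the optimal solution does not change if we append it. Otherwise, our choice is either to start a new sub-array, or to shorten the previously chosen sub-array until it passes the leftmost occurrence of A[i] within it, and see whether that is better.

If we were to extend it further back: first, we have already increased the optimal solution by 1 by adding A[i]; and if there had been an earlier possibility (up to dp[i-1][k]) that was smaller by 1 (thus compensating for the addition), we would have started from that one.

To compute the new possibility, in which the left bound of the current kth sub-array lies just to the right of the leftmost occurrence of A[i], we can find in O(log n) the number of distinct values in the proposed kth sub-array and in the proposed (k-1)th sub-array (a wavelet tree is one option), and then subtract those counts from the total number of elements in each sub-array.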
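For comparison, here is a much simpler dynamic program over contiguous sub-arrays. It is not the optimised recurrence described in this answer and does not use a wavelet tree: it just tries every position for the last cut, which takes O(k·n²) time after an O(n²) cost table is built. It assumes that the cost of a sub-array is its length minus its number of distinct values and that every sub-array must be non-empty; the names are illustrative.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MinDuplicateSplit {
    static int minTotalDuplicates(int[] a, int k) {
        final int n = a.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the array length");
        }
        // cost[l][r] = duplicates in a[l..r] (inclusive) = length - distinct values.
        final int[][] cost = new int[n][n];
        for (int l = 0; l < n; l++) {
            final Set<Integer> seen = new HashSet<>();
            int dup = 0;
            for (int r = l; r < n; r++) {
                if (!seen.add(a[r])) {
                    dup++; // a[r] was already present in a[l..r-1]
                }
                cost[l][r] = dup;
            }
        }
        final int inf = Integer.MAX_VALUE / 2;
        // dp[j][i] = best total for the first i elements split into j non-empty sub-arrays.
        final int[][] dp = new int[k + 1][n + 1];
        for (int[] row : dp) {
            Arrays.fill(row, inf);
        }
        dp[0][0] = 0;
        for (int j = 1; j <= k; j++) {
            for (int i = j; i <= n; i++) {
                // The last sub-array is a[p..i-1]; the first p elements form j-1 sub-arrays.
                for (int p = j - 1; p < i; p++) {
                    if (dp[j - 1][p] < inf) {
                        dp[j][i] = Math.min(dp[j][i], dp[j - 1][p] + cost[p][i - 1]);
                    }
                }
            }
        }
        return dp[k][n];
    }

    public static void main(String[] args) {
        System.out.println(minTotalDuplicates(new int[] {1, 1, 2, 1, 2}, 3)); // prints 0
    }
}
```

On the array from the question this returns 0 for K = 3, 1 for K = 2 (for instance [1, 1, 2] and [1, 2]) and 3 for K = 1. The quadratic cost table keeps the sketch short, but it would need replacing for large inputs.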
