Amazon Interview Question

Posted 2023-04-14

【Title】: AMAZON Interview Question 【Posted】: 2011-08-30 23:01:42 【Question】:

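Given N arrays, each of size K. Each of the N arrays is sorted, and all N*K elements are unique. Pick one element from each of the N arrays to form a subset of N elements, then subtract the subset's minimum from its maximum. This difference should be as small as possible. Hope the problem is clear :) :)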

Example:

N=3, K=3

N=1 : 6, 16, 67
N=2 : 11,17,68
N=3 : 10, 15, 100

If we choose 16, 17 and 15, we get the minimum difference: 17-15=2.
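To make the problem statement concrete, here is a minimal brute-force sketch that tries every way of picking one element per array. The function name `min_range_bruteforce` is only illustrative, and the approach costs K^N combinations, so it is useful as a reference check rather than as a solution.

```python
from itertools import product

def min_range_bruteforce(arrays):
    """Try every pick of one element per array and return the smallest max - min."""
    return min(max(pick) - min(pick) for pick in product(*arrays))

arrays = [[6, 16, 67], [11, 17, 68], [10, 15, 100]]
print(min_range_bruteforce(arrays))  # 2
```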

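【Question comments】:

You haven't actually asked a question... I'm surprised everyone assumed you wanted big-O notation. Three years later, I still don't get it.

I'm voting to close this question as off-topic because there is no code to review here and no specific question.

【Answer 1】:

I can think of an O(N*K*N) solution (edited after zivo correctly pointed this out; not a good solution now :( ).

1. Take N pointers, initially pointing to the first element of each of the N arrays.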

6, 16, 67
^ 
11,17,68
^
10, 15, 100
^

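2. Find the highest and lowest elements among the current pointers (11 and 6) with an O(N) scan, since there are N pointers, and compute their difference (5).

3. Advance by 1 the pointer that points to the lowest element.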

 6, 16, 67
    ^ 
 11,17,68
 ^
 10, 15, 100 (difference:6)
 ^

4. Keep repeating steps 2 and 3, storing the minimum difference.

 6, 16, 67
    ^ 
 11,17,68
 ^
 10,15,100 (difference:5)
    ^ 


 6, 16, 67
    ^ 
 11,17,68
    ^
 10,15,100 (difference:2)
    ^

The state above is the required solution.

 6, 16, 67
    ^ 
 11,17,68
    ^
 10,15,100 (difference:84)
       ^ 

 6, 16, 67
        ^ 
 11,17,68
    ^
 10,15,100 (difference:83)
       ^

And so on...
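The pointer walk in steps 1 to 4 can be sketched as follows. This is a minimal illustration, not the answerer's own code: the function name `min_range_naive` and the `trace` list are assumptions added here, and the loop stops once the array holding the current minimum is exhausted, because the minimum can no longer rise after that point.

```python
def min_range_naive(arrays):
    """Pointer walk: O(N) scan per move, O(N*K*N) overall."""
    pos = [0] * len(arrays)              # one pointer per array
    best = None
    trace = []                           # difference observed at each state
    while True:
        current = [arr[p] for arr, p in zip(arrays, pos)]
        diff = max(current) - min(current)
        trace.append(diff)
        if best is None or diff < best:
            best = diff
        lowest = current.index(min(current))   # array holding the minimum
        if pos[lowest] + 1 == len(arrays[lowest]):
            break                        # cannot advance the minimum any further
        pos[lowest] += 1
    return best, trace

best, trace = min_range_naive([[6, 16, 67], [11, 17, 68], [10, 15, 100]])
print(best)  # 2
```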

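EDIT:

Its complexity can be reduced by using a heap (as Uri suggested). I had thought of that but ran into a problem: each time an element is extracted from the heap, you have to find out which array it came from so that the pointer for that array can be advanced. An efficient way to find the array number can definitely reduce the complexity to O(K*N log(K*N)). A naive approach is to use a data structure like this

struct {
    int element;
    int arraynumber;
};

and restructure the initial data as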

 6|0,16|0,67|0

 11|1,17|1,68|1

 10|2,15|2,100|2

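Initially, keep track of the current maximum of the first column and insert the pointed-to elements into a heap. Now each time an element is extracted, its array number is known, the pointer in that array is advanced, and the newly pointed-to element is compared with the current maximum so the maximum can be updated accordingly.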
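A minimal sketch of this heap variant, using Python's `heapq`. The function name and the third tuple field (the index inside the array) are assumptions added for illustration; the element and array number correspond to the pair described above. Because the heap only ever holds N entries, each pop and push costs O(log N).

```python
import heapq

def min_range_heap(arrays):
    """Min-heap over the N current pointers plus a running maximum."""
    # Heap entries: (element, array_number, index_in_array).
    heap = [(arr[0], a, 0) for a, arr in enumerate(arrays)]
    heapq.heapify(heap)
    current_max = max(arr[0] for arr in arrays)
    best = current_max - heap[0][0]
    while True:
        value, a, i = heapq.heappop(heap)        # current minimum
        best = min(best, current_max - value)
        if i + 1 == len(arrays[a]):
            return best                          # that array is exhausted
        nxt = arrays[a][i + 1]                   # advance that array's pointer
        current_max = max(current_max, nxt)
        heapq.heappush(heap, (nxt, a, i + 1))

print(min_range_heap([[6, 16, 67], [11, 17, 68], [10, 15, 100]]))  # 2
```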

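【Comments】:

This is actually an N*K*N solution, if I'm not mistaken, because you move a pointer N*K times and each move needs O(N) operations.

Maybe you can optimize it to N*K*log(N): if you put the N pointers in a heap, it will find the smallest element for you. Extracting the minimum and inserting a new element are both log(N) operations. For the maximum element, you only need to remember which pointer holds the maximum and compare every time you advance a pointer.

Nice solution.

Interesting solution. Can we optimize by using binary search on one array and incrementing on the other?

【Answer 2】:

So here is an algorithm that solves this problem in two steps:

The first step is to merge all the arrays into a single sorted array, represented as follows:

combined_val[] - holds all the numbers
combined_ind[] - holds the index of the array each number originally belonged to

This step can easily be done in O(K*N*log(N)), but I think you might be able to do even better (or maybe not; you can look up variants of merge sort, since they perform a similar step).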
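One way to sketch the merge step is with `heapq.merge`, which performs a heap-based merge of already-sorted inputs. The function name `merge_with_origin` is an assumption for illustration; array indexes here are 0-based.

```python
import heapq

def merge_with_origin(arrays):
    """N-way merge returning (combined_val, combined_ind)."""
    # Tag each element with the index of its source array so the origin
    # survives the merge; tuples compare by value first.
    tagged = [[(v, a) for v in arr] for a, arr in enumerate(arrays)]
    merged = list(heapq.merge(*tagged))
    combined_val = [v for v, _ in merged]
    combined_ind = [a for _, a in merged]
    return combined_val, combined_ind

vals, inds = merge_with_origin([[6, 16, 67], [11, 17, 68], [10, 15, 100]])
print(vals)  # [6, 10, 11, 15, 16, 17, 67, 68, 100]
print(inds)  # [0, 2, 1, 2, 0, 1, 0, 1, 2]
```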

Now the second step:

It is easier to show code than to explain, so here is the pseudocode:


int count[N] = { 0 };
int head = 0;
int diffcnt = 0;
// mindiff is initialized to overall maximum value - overall minimum value
int mindiff = combined_val[N * K - 1] - combined_val[0];
for (int i = 0; i < N * K; i++) {
  count[combined_ind[i]]++;

  if (count[combined_ind[i]] == 1) {
    // diffcnt counts how many arrays have at least one element between
    // indexes of "head" and "i". Once diffcnt reaches N it will stay N and
    // not increase anymore
    diffcnt++;
  } else {
    while (count[combined_ind[head]] > 1) {
      // We try to move head index as forward as possible while keeping diffcnt constant.
      // i.e. if count[combined_ind[head]] is 1, then if we would move head forward
      // diffcnt would decrease, that is something we dont want to do.
      count[combined_ind[head]]--;
      head++;
    }
  }

  if (diffcnt == N) {
    // i.e. we got at least one element from all arrays
    if (combined_val[i] - combined_val[head] < mindiff) {
      mindiff = combined_val[i] - combined_val[head];
      // if you want to save actual numbers too, you can save this (i.e. i and head
      // and then extract data from that)
    }
  }
}

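The result is in mindiff.

The running time of the second step is O(N * K). This is because the "head" index can move at most N*K times in total, so the inner loop does not make the step quadratic; it stays linear.

So the total running time of the algorithm is O(N * K * log(N)), but that is due to the merge step. If you can come up with a better merge step, you can bring it down to O(N * K).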
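For readers who prefer something runnable, here is a direct Python port of the second step, assuming `combined_val` and `combined_ind` have already been built and that array indexes are 0-based. The function name `min_range_window` is an assumption for illustration.

```python
def min_range_window(combined_val, combined_ind, n):
    """Sliding window over the merged array; n is the number of source arrays."""
    count = [0] * n          # elements of each array inside [head, i]
    head = 0
    diffcnt = 0              # arrays with at least one element in the window
    mindiff = combined_val[-1] - combined_val[0]
    for i, src in enumerate(combined_ind):
        count[src] += 1
        if count[src] == 1:
            diffcnt += 1
        else:
            # Shrink from the left while the head's array is still covered.
            while count[combined_ind[head]] > 1:
                count[combined_ind[head]] -= 1
                head += 1
        if diffcnt == n:
            mindiff = min(mindiff, combined_val[i] - combined_val[head])
    return mindiff

vals = [6, 10, 11, 15, 16, 17, 67, 68, 100]
inds = [0, 2, 1, 2, 0, 1, 0, 1, 2]
print(min_range_window(vals, inds, 3))  # 2
```

Since every window that is scored contains at least one element from each array, the maximum and minimum of the window always bound a valid selection of one element per array.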

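【Comments】:

Hey.. you forgot to add the snippet that initializes mindiff.. Also, I think I understand it, but it is not very clear. Could you add a small comment in the code, especially around head++?

mindiff is initialized to the largest possible value, i.e. overall maximum minus overall minimum. The gist of the algorithm is to find the minimum and the maximum. Index "i" points to the maximum. Index "head" points to the minimum. Each time we increment "i", we try to move index "head" as far forward as possible while still keeping at least one element from every array.

This is a brilliant solution. Your comment that my algorithm was wrong was spot on. I don't think you can do better than O(NKlog(N)), at least not in the worst case, and at least not for the 'N' part; otherwise, in the special case K=1, you would be able to sort an array in better than N*Log(N) time.

Thanks @Uri. Your observation about the merge step makes sense. So merging the arrays does take O(N * K * log(N)) in the worst case.

-1 @zviadm If you merge all the arrays, you may end up comparing elements from the same array, which is wrong; the elements need to be picked from the three arrays to get the minimum difference. (Unless I misunderstood your solution.)

【Answer 3】:

This question is for managers

You have 3 developers (N1), 3 testers (N2) and 3 DBAs (N3). Choose the team with the least divergence that can run a project successfully.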

int[n] result; // where result[i] keeps the element from bucket N_i

int[n] latest; // where latest[i] keeps the latest element visited from bucket N_i

Iterate elements in (N_1 + N_2 + N_3) in sorted order
{
    Keep track of latest element visited from each bucket N_i by updating 'latest' array;

    if boundary(latest) < boundary(result)
    {
       result = latest;
    }
}

int boundary(int[] array)
{
   return Max(array) - Min(array);
}
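A runnable sketch of the pseudocode above. The pseudocode does not say how `result` is initialized or what happens before every bucket has been visited, so this version assumes that comparisons start only once `latest` is fully populated; the function name `min_range_latest` is also an assumption. Recomputing `boundary` on every element makes this O(N) per step.

```python
def boundary(array):
    return max(array) - min(array)

def min_range_latest(arrays):
    n = len(arrays)
    # All elements in sorted order, each tagged with its bucket index.
    tagged = sorted((v, b) for b, arr in enumerate(arrays) for v in arr)
    latest = [None] * n      # latest[i]: latest element visited from bucket i
    result = None            # best selection found so far (assumed unset at start)
    for value, bucket in tagged:
        latest[bucket] = value
        if None in latest:
            continue         # some bucket has not been visited yet
        if result is None or boundary(latest) < boundary(result):
            result = list(latest)   # copy, because latest keeps changing
    return result

print(min_range_latest([[6, 16, 67], [11, 17, 68], [10, 15, 100]]))  # [16, 17, 15]
```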

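【Comments】:

【Answer 4】:

I have O(K*N*log(K)), with typical execution much less than that. I can't think of anything better at the moment. I will first explain the version that is easier to describe (and somewhat slower):

For each element f in the first array (loop over K elements), and for each array starting from the second (loop over N-1 arrays), do a binary search on that array and find the element closest to f. That is your element (Log(K)).

This algorithm can be optimized by adding a new Floor index for each array. When performing the binary search, search between "Floor" and "K-1". Initially the Floor index is 0, and for the first element you search the entire array. Once you find the element closest to "f", update the Floor index with the index of that element. The worst case is the same (if the largest element of the first array is smaller than every other minimum, Floor may never be updated), but the average case improves.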
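A minimal sketch of the closest-element search with a per-array Floor index, using `bisect_left` from the standard library. The names `closest_index` and `greedy_difference` are assumptions for illustration. Note that the discussion under this answer disputes the approach's correctness, and the second call below reproduces the counterexample given there: the greedy choice yields 10 although 6 is achievable.

```python
from bisect import bisect_left

def closest_index(arr, f, floor=0):
    """Index of the element of sorted arr[floor:] that is closest to f."""
    i = bisect_left(arr, f, floor)       # first index >= floor with arr[i] >= f
    if i == len(arr):
        return i - 1
    if i > floor and f - arr[i - 1] <= arr[i] - f:
        return i - 1
    return i

def greedy_difference(arrays):
    floors = [0] * len(arrays)           # Floor index per array
    best = None
    for f in arrays[0]:
        picks = [f]
        for a in range(1, len(arrays)):
            idx = closest_index(arrays[a], f, floors[a])
            floors[a] = idx              # later searches start from here
            picks.append(arrays[a][idx])
        diff = max(picks) - min(picks)
        best = diff if best is None else min(best, diff)
    return best

print(greedy_difference([[6, 16, 67], [11, 17, 68], [10, 15, 100]]))  # 2
print(greedy_difference([[10], [1, 15], [5, 16]]))                    # 10
```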

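【Comments】:

Hey, nice solution. I am thinking about a solution that uses a min-heap and a max-heap.. any suggestions on that?

This algorithm is incomplete or incorrect. Suppose you picked 10 from N=1 (the first array), with N=2: 1 15 and N=3: 5 16. You would choose 15 and 5, so your difference is 10, but if you choose 15 and 16, your difference is 6.

@zviadm I came up with the same solution within about 10 seconds of looking at the problem, and I immediately started to doubt whether it was correct... Still, I think it is heading in the right direction.

The solution is correct. Just find the ceil and floor of the minimum element found so far, and keep the one closest to the minimum. 1. Take the first element from the first array and treat it as the minimum. 2. Find the ceil and floor of the minimum in the second array and pick the one closest to the minimum. 3. Update the minimum. 4. Search the subsequent rows. 5. Do this for all elements of row1.

【Answer 5】:

Proof of correctness of the accepted answer (Terminal's solution)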

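Suppose the algorithm finds a sequence A= that is not the optimal solution (R).

Consider the index j in R such that item R[j] is the first item of R that the algorithm examines and replaces with the next item in its row.

Let A' denote the candidate solution at that stage (before the replacement). Since R[j]=A'[j] is the minimum of A', it is also the minimum of R. Now consider the maximum of R, R[m]. If A'[m]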

【Comments】:

【Answer 6】:

for each element in the 1st array
{
    choose the element in 2nd array that is closest to the element in 1st array
    current_array = 2;
    do
    {
        choose the element in current_array+1 that is closest to the element in current_array
        current_array++;
    } while(current_array < n);
}

Complexity: O(k^2*n)

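【Comments】:

【Answer 7】:

Here is my logic for solving this problem. Keep in mind that we need to pick one element from each of the N arrays (to compute the least minimum).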

// if we take the above values as an example!
// then the idea would be to sort all three arrays while keeping another
// array to keep the reference to their sets (1 or 2 or 3, could be 
// extended to n sets)      
1   3   2   3   1   2   1   2   3    // this is the array that holds the set index
6   10  11  15  16  17  67  68  100  // this is the sorted combined array.
           |           |   
    5            2          33       // this is the computed least minimum,
                                     // the rule is to make sure the indexes of the values 
                                     // we are comparing are different (to make sure we are 
// comparing elements from different sets), then for example
// the first element of that example is index:1|value:6 we hold 
// that value 6 (that is the value we will be using to compute the least minimum, 
// then we go to the edge of the comparison which would be the second different index, 
// we skip index:3|value:10 (we remove it from the array) we compare index:2|value:11 
// to index:1|value:6 we obtain 5 which would go to a variable named leastMinimum = 5, 
// now we remove the indexes and values we already used,
// and redo the same steps.

Step 1:

1   3   2   3   1   2   1   2   3
6   10  11  15  16  17  67  68  100
           |   
5            
leastMinimum = 5

Step 2:

3   1   2   1   2   3
15  16  17  67  68  100
           |   
 2          
leastMinimum = min(2, leastMinimum) // which is equal to 2

Step 3:

1   2   3
67  68  100

    33
leastMinimum = min(33, leastMinimum) // which is equal to the old leastMinimum, which is 2

Now: suppose we have elements from the same array that are very close to each other (this time k=2, which means we only have 3 sets with two values each):

// After sorting the n arrays we will have the below indexes array and values array
1   1   2   3   2   3
6   7   8   12  15  16
*       *   *

* we skip second index of 1|7 and we take the least minimum of 1|6 and 3|12 (index:2|value:8 will be removed as it is not at the edges, we pick the minimum and maximum of the unique index subset of n elements)
1   3         
6   12
 =6
* second step we remove the values we already used, so the array become like below:

1   2   3
7   15  16
*   *   * 
7 - 16
= 9

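Note: another approach, which consumes more memory, is to create N subarrays and compare maximum - minimum within each of them.

So from the sorted values array below and its corresponding index array, we extract three more subarrays: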

1   3   2   3   1   2   1   2   3
6   10  11  15  16  17  67  68  100

First array:

1   3   2 
6   10  11

11-6 = 5

Second array:

3   1   2
15  16  17

17-15 = 2

Third array:

1   2   3
67  68  100

100 - 67 = 33

【Comments】:
