Find largest rectangle containing only zeros in an N×N binary matrix

Posted: 2021-11-03 16:25:46

[Question]:

Given an NxN binary matrix (containing only 0s and 1s), how can we find the largest rectangle that contains only 0s?

Example:

      I
    0 0 0 0 1 0
    0 0 1 0 0 1
II->0 0 0 0 0 0
    1 0 0 0 0 0
    0 0 0 0 0 1 <--IV
    0 0 1 0 0 0
            IV 

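The example above is a 6×6 binary matrix. In this case, the return value would be Cell 1:(2, 1) and Cell 2:(4, 4). The resulting sub-matrix can be square or rectangular. The return value can also be the size of the largest all-zero sub-matrix, which in this example is 3 × 4.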
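To make the expected result concrete, here is a minimal brute-force sketch in Python. It assumes the two cells are 0-indexed (row, column) pairs naming the top-left and bottom-right corners, both inclusive; that reading is an assumption, although it matches the 3 × 4 size quoted in the question. The helper names (is_all_zero, rect_area, brute_force_largest) are made up for illustration, and trying every rectangle is only practical for tiny inputs.

```python
# The example matrix from the question (0-indexed rows and columns).
MATRIX = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
]

def is_all_zero(mat, top, left, bottom, right):
    """Return True if every cell of the inclusive rectangle is 0."""
    return all(mat[r][c] == 0
               for r in range(top, bottom + 1)
               for c in range(left, right + 1))

def rect_area(top, left, bottom, right):
    """Number of cells in the inclusive rectangle."""
    return (bottom - top + 1) * (right - left + 1)

def brute_force_largest(mat):
    """Try every rectangle and keep the largest all-zero one."""
    nrows = len(mat)
    ncols = len(mat[0]) if mat else 0
    best_area, best_rect = 0, None
    for top in range(nrows):
        for left in range(ncols):
            for bottom in range(top, nrows):
                for right in range(left, ncols):
                    area = rect_area(top, left, bottom, right)
                    # Only scan the cells when the rectangle could improve the best.
                    if area > best_area and is_all_zero(mat, top, left, bottom, right):
                        best_area, best_rect = area, (top, left, bottom, right)
    return best_area, best_rect

print(brute_force_largest(MATRIX))
```

A slow reference like this is mainly useful for cross-checking a faster implementation on small random matrices.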

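[Question comments]:

Please consider changing the accepted answer to J.F. Sebastian's answer, which is now correct and has optimal complexity.

Please check these very similar (I would say duplicate) questions: ***.com/questions/7770945/…, ***.com/a/7353193/684229. The solution is O(n).

I am trying to do the same for rectangles in any orientation. See the question: ***.com/questions/22604043/…

@TMS Actually, it is the other way around. Those questions are duplicates of this one.

[Answer 1]: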

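Here is a solution based on the "Largest Rectangle in a Histogram" problem suggested by @j_random_hacker in the comments:

[The algorithm] works by iterating through the rows from top to bottom, and for each row solving this problem, where the "bars" in the "histogram" consist of all unbroken upward trails of zeros that start at the current row (a column has height 0 if it has a 1 in the current row).

The input matrix mat can be any iterable, e.g. a file or a network stream. Only one row needs to be available at a time.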

#!/usr/bin/env python
from collections import namedtuple
from functools import reduce  # reduce() is not a builtin in Python 3
from operator import mul

Info = namedtuple('Info', 'start height')

def max_size(mat, value=0):
    """Find height, width of the largest rectangle containing all `value`'s."""
    it = iter(mat)
    hist = [(el==value) for el in next(it, [])]
    max_size = max_rectangle_size(hist)
    for row in it:
        hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)]
        max_size = max(max_size, max_rectangle_size(hist), key=area)
    return max_size

def max_rectangle_size(histogram):
    """Find height, width of the largest rectangle that fits entirely under
    the histogram.
    """
    stack = []
    top = lambda: stack[-1]
    max_size = (0, 0) # height, width of the largest rectangle
    pos = 0 # current position in the histogram
    for pos, height in enumerate(histogram):
        start = pos # position where rectangle starts
        while True:
            if not stack or height > top().height:
                stack.append(Info(start, height)) # push
            elif stack and height < top().height:
                max_size = max(max_size, (top().height, (pos - top().start)),
                               key=area)
                start, _ = stack.pop()
                continue
            break # height == top().height goes here

    pos += 1
    for start, height in stack:
        max_size = max(max_size, (height, (pos - start)), key=area)    
    return max_size

def area(size):
    return reduce(mul, size)

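The solution runs in O(N) time, where N is the number of elements in the matrix. It requires O(ncols) additional memory, where ncols is the number of columns in the matrix.

The latest version with tests is at https://gist.github.com/776423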
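The claim that only one row needs to be available at a time can be illustrated with a compact sketch that consumes rows from a generator. This is a condensed rewrite of the same histogram idea, not the answer's exact code: the names largest_in_histogram, max_size_streaming and rows_from_text are hypothetical, and a trailing 0 is used as a sentinel to flush the stack.

```python
def largest_in_histogram(heights):
    """Return (height, width) of the largest rectangle under the histogram."""
    best = (0, 0)
    stack = []  # (start, height) pairs, heights strictly increasing
    for pos, h in enumerate(list(heights) + [0]):  # trailing 0 flushes the stack
        start = pos
        while stack and stack[-1][1] > h:
            # The popped bar ends just before pos; its rectangle starts at `start`.
            start, top_h = stack.pop()
            if top_h * (pos - start) > best[0] * best[1]:
                best = (top_h, pos - start)
        if h and (not stack or stack[-1][1] < h):
            stack.append((start, h))
    return best

def max_size_streaming(rows, value=0):
    """Return (height, width); keeps only one histogram row in memory."""
    hist = None
    best = (0, 0)
    for row in rows:
        row = list(row)
        if hist is None:
            hist = [0] * len(row)
        # A cell equal to `value` extends the column's run; anything else resets it.
        hist = [h + 1 if el == value else 0 for h, el in zip(hist, row)]
        cand = largest_in_histogram(hist)
        if cand[0] * cand[1] > best[0] * best[1]:
            best = cand
    return best

def rows_from_text(text):
    """Yield one parsed row at a time, as a file or stream reader would."""
    for line in text.splitlines():
        yield [int(tok) for tok in line.split()]

EXAMPLE = """0 0 0 0 1 0
0 0 1 0 0 1
0 0 0 0 0 0
1 0 0 0 0 0
0 0 0 0 0 1
0 0 1 0 0 0"""

print(max_size_streaming(rows_from_text(EXAMPLE)))
```

Because rows_from_text is a generator, the full matrix is never materialized; the only per-row state is the hist list of ncols integers plus the stack.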

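[Discussion]:

Nice try, but this fails on max_size([[0,0,0,0,1,1,1], [0,0,0,0,0,0,0], [0,0,0,1,1,1,1], [0,0,1,1,1,1,1]] + [[1,0,1,1,1,1,1]] * 3): it returns (2, 4) when there is a 3x3 square in the top left. The basic problem is that keeping track of the largest-area rectangles of (a few) neighbouring points, as you do here, is not always enough. The only correct O(N) algorithm I know works by iterating through the rows from top to bottom, for each row solving this problem: ***.com/questions/4311694/…, where the "bars" in the "histogram" consist of all unbroken upward trails of zeros that start at the current row (a column has height 0 if it has a 1 in the current row).

@j_random_hacker: I have updated my answer to use the "histogram"-based algorithm.

This looks great, but I am trying to actually find the largest rectangle (as in, return its coordinates). The algorithm reliably returns the area, but once I know that, how would one discover the position of, say, a 3 column x 2 row rectangle whose top-left corner is at [3, 5]?

Where can the bounding column information be obtained (the left or right column of the rectangle)? We can get the width and height from max_rectangle_size and the bottom row from the for row in it: iteration, but I cannot find the bounding column information.

[Answer 2]: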

Please take a look at Maximize the rectangular area under Histogram, and then continue reading the solution below.

Traverse the matrix once and store the following;

For x=1 to N and y=1 to N (x is the row, y is the column)
F[x][y] = 1 + F[x-1][y] if A[x][y] is 0 , else 0 (with F[0][y] = 0)

Then for each row for x=N to 1 
We have F[x] -> array with heights of the histograms with base at x.
Use O(N) algorithm to find the largest area of rectangle in this histogram = H[x]

From all areas computed, report the largest.

The time complexity is O(N*N) = O(N²) (for an NxN binary matrix).

Example:

Initial array    F[x][y] array
 0 0 0 0 1 0     1 1 1 1 0 1
 0 0 1 0 0 1     2 2 0 2 1 0
 0 0 0 0 0 0     3 3 1 3 2 1
 1 0 0 0 0 0     0 4 2 4 3 2
 0 0 0 0 0 1     1 5 3 5 4 0
 0 0 1 0 0 0     2 6 0 6 5 1

 For x = N to 1
 H[6] = 2 6 0 6 5 1 -> 10 (5*2)
 H[5] = 1 5 3 5 4 0 -> 12 (3*4)
 H[4] = 0 4 2 4 3 2 -> 10 (2*5)
 H[3] = 3 3 1 3 2 1 -> 6 (3*2)
 H[2] = 2 2 0 2 1 0 -> 4 (2*2)
 H[1] = 1 1 1 1 0 1 -> 4 (1*4)

 The largest area is thus H[5] = 12
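The table above can be reproduced with a short Python sketch. Note the assumptions: rows are 0-indexed here (so H[5] in the example is index 4), the names height_table and histogram_area_simple are made up for illustration, and the per-row histogram step uses a deliberately simple quadratic scan for readability rather than the O(N) stack algorithm the answer refers to.

```python
A = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
]

def height_table(a):
    """F[x][y] = number of consecutive zeros in column y ending at row x."""
    table = []
    prev = [0] * len(a[0])
    for row in a:
        cur = [p + 1 if cell == 0 else 0 for p, cell in zip(prev, row)]
        table.append(cur)
        prev = cur
    return table

def histogram_area_simple(heights):
    """Largest rectangle under the histogram, scanning every start column.

    Quadratic in the row length; kept simple on purpose.
    """
    best = 0
    for left in range(len(heights)):
        lowest = heights[left]
        for right in range(left, len(heights)):
            lowest = min(lowest, heights[right])
            best = max(best, lowest * (right - left + 1))
    return best

F = height_table(A)
H = [histogram_area_simple(row) for row in F]
print(H)       # areas per row, top to bottom
print(max(H))  # area of the largest all-zero rectangle
```

Replacing histogram_area_simple with a linear-time stack-based routine would bring the whole procedure to the O(N*N) bound stated above.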

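[Discussion]:

Nice explanation with the example.

Are you sure this is O(N*N)? There are two passes over the whole matrix, but my impression is that this is O(N).

Very nice explanation.. :) I wish you would also explain "Maximize the rectangular area under Histogram".. :D

To make it clearer: the solution is O(N*N), where N is the number of items in a row/column, since the question states that the input is of size NxN. If N were the total number of items in the input, then it would be O(N).

[Answer 3]: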

Here is a Python 3 solution that returns the position in addition to the area of the largest rectangle:

#!/usr/bin/env python3

import numpy

s = '''0 0 0 0 1 0
0 0 1 0 0 1
0 0 0 0 0 0
1 0 0 0 0 0
0 0 0 0 0 1
0 0 1 0 0 0'''

nrows = 6
ncols = 6
skip = 1
area_max = (0, [])

a = numpy.fromstring(s, dtype=int, sep=' ').reshape(nrows, ncols)
w = numpy.zeros(dtype=int, shape=a.shape)
h = numpy.zeros(dtype=int, shape=a.shape)
for r in range(nrows):
    for c in range(ncols):
        if a[r][c] == skip:
            continue
        if r == 0:
            h[r][c] = 1
        else:
            h[r][c] = h[r-1][c]+1
        if c == 0:
            w[r][c] = 1
        else:
            w[r][c] = w[r][c-1]+1
        minw = w[r][c]
        for dh in range(h[r][c]):
            minw = min(minw, w[r-dh][c])
            area = (dh+1)*minw
            if area > area_max[0]:
                area_max = (area, [(r-dh, c-minw+1, r, c)])

print('area', area_max[0])
for t in area_max[1]:
    print('Cell 1:({}, {}) and Cell 2:({}, {})'.format(*t))

Output:

area 12
Cell 1:(2, 1) and Cell 2:(4, 4)

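[Discussion]:

Works great! I made a Fortran version from it and compiled it for use in Python, because looping over a large array in Python like this is very slow.

[Answer 4]:

Here is J.F. Sebastian's approach translated to C#: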

    private Vector2 MaxRectSize(int[] histogram) {
        Vector2 maxSize = Vector2.zero;
        int maxArea = 0;
        Stack<Vector2> stack = new Stack<Vector2>();

        int x = 0;
        for (x = 0; x < histogram.Length; x++) {
            int start = x;
            int height = histogram[x];
            while (true) {
                if (stack.Count == 0 || height > stack.Peek().y) {
                    stack.Push(new Vector2(start, height));

                } else if (height < stack.Peek().y) {
                    int tempArea = (int)(stack.Peek().y * (x - stack.Peek().x));
                    if (tempArea > maxArea) {
                        maxSize = new Vector2(stack.Peek().y, (x - stack.Peek().x));
                        maxArea = tempArea;
                    }

                    Vector2 popped = stack.Pop();
                    start = (int)popped.x;
                    continue;
                }

                break;
            }
        }

        foreach (Vector2 data in stack) {
            int tempArea = (int)(data.y * (x - data.x));
            if (tempArea > maxArea) {
                maxSize = new Vector2(data.y, (x - data.x));
                maxArea = tempArea;
            }
        }

        return maxSize;
    }

    public Vector2 GetMaximumFreeSpace() {
        // STEP 1:
        // build a seed histogram using the first row of grid points
        // example: [true, true, false, true] = [1,1,0,1]
        int[] hist = new int[gridSizeY];
        for (int y = 0; y < gridSizeY; y++) {
            if (!invalidPoints[0, y]) {
                hist[y] = 1;
            }
        }

        // STEP 2:
        // get a starting max area from the seed histogram we created above.
        // using the example from above, this value would be [1, 1], as the only valid area is a single point.
        // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3.
        // Note that at this step, the height of the found rectangle will always be 1 because we are operating on
        // a single row of data.
        Vector2 maxSize = MaxRectSize(hist);
        int maxArea = (int)(maxSize.x * maxSize.y);

        // STEP 3:
        // build histograms for each additional row, re-testing for new possible max rectangular areas
        for (int x = 1; x < gridSizeX; x++) {
            // build a new histogram for this row. the values of this row are
            // 0 if the current grid point is occupied; otherwise, it is 1 + the value
            // of the previously found histogram value for the previous position.
            // What this does is effectively keep track of the height of continuous available spaces.
            // EXAMPLE:
            //      Given the following grid data (where 1 means occupied, and 0 means free; for clarity):
            //          INPUT:        OUTPUT:
            //      1.) [0,0,1,0]   = [1,1,0,1]
            //      2.) [0,0,1,0]   = [2,2,0,2]
            //      3.) [1,1,0,1]   = [0,0,1,0]
            //
            //  As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous
            //  free space.
            for (int y = 0; y < gridSizeY; y++) {
                if (!invalidPoints[x, y]) {
                    hist[y] = 1 + hist[y];
                } else {
                    hist[y] = 0;
                }
            }

            // find the maximum size of the current histogram. If it happens to be larger
            // than the currently recorded max size, then it is the new max size.
            Vector2 maxSizeTemp = MaxRectSize(hist);
            int tempArea = (int)(maxSizeTemp.x * maxSizeTemp.y);
            if (tempArea > maxArea) {
                maxSize = maxSizeTemp;
                maxArea = tempArea;
            }
        }

        // at this point, we know the max size
        return maxSize;
    }

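A few things to note:

    This version is written for the Unity API. You can easily make it more generic by replacing the instances of Vector2 with KeyValuePair. Vector2 is only used as a convenient way to store two values.
    invalidPoints[] is an array of bool, where true means the grid point is "in use" and false means it is not.

[Discussion]:

[Answer 5]: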

A solution with space complexity O(columns) [it can also be modified to be O(rows)] and time complexity O(rows*columns):

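public int maximalRectangle(char[][] matrix) {
    int m = matrix.length;
    if (m == 0)
        return 0;
    int n = matrix[0].length;
    if (n == 0)
        return 0; // maxAreaHist reads heights[0], so guard against empty rows
    int maxArea = 0;
    int[] aux = new int[n];
    for (int i = 0; i < n; i++) {
        aux[i] = 0;
    }
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            // Height of the run of '1's ending at row i (the LeetCode variant);
            // a '0' resets the column. Compare against '0' instead to search
            // for all-zero rectangles.
            aux[j] = (matrix[i][j] == '1') ? aux[j] + 1 : 0;
        }
        // Evaluate the histogram once per completed row; calling maxAreaHist
        // for every cell would make the total O(rows*columns^2).
        maxArea = Math.max(maxArea, maxAreaHist(aux));
    }
    return maxArea;
}

public int maxAreaHist(int[] heights) {
    int n = heights.length;
    // Stack of bar indices whose heights are non-decreasing from bottom to top.
    Stack<Integer> stack = new Stack<Integer>();
    stack.push(0);
    int maxRect = heights[0];
    int top = 0;
    int leftSideArea = 0;
    int rightSideArea = heights[0];
    for (int i = 1; i < n; i++) {
        if (stack.isEmpty() || heights[i] >= heights[stack.peek()]) {
            stack.push(i);
        } else {
            while (!stack.isEmpty() && heights[stack.peek()] > heights[i]) {
                top = stack.pop();
                // Bar `top` extends right up to i-1 and left down to the
                // bar after the new stack top.
                rightSideArea = heights[top] * (i - top);
                leftSideArea = 0;
                if (!stack.isEmpty()) {
                    leftSideArea = heights[top] * (top - stack.peek() - 1);
                } else {
                    leftSideArea = heights[top] * top;
                }
                maxRect = Math.max(maxRect, leftSideArea + rightSideArea);
            }
            stack.push(i);
        }
    }
    // Bars still on the stack extend to the right edge of the histogram.
    while (!stack.isEmpty()) {
        top = stack.pop();
        rightSideArea = heights[top] * (n - top);
        leftSideArea = 0;
        if (!stack.isEmpty()) {
            leftSideArea = heights[top] * (top - stack.peek() - 1);
        } else {
            leftSideArea = heights[top] * top;
        }
        maxRect = Math.max(maxRect, leftSideArea + rightSideArea);
    }
    return maxRect;
}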

However, when I tried this on LeetCode, I got a Time Limit Exceeded error. Is there a simpler solution?

[Discussion]:

Simple and easy to understand.. thanks!

[Answer 6]:

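I propose an O(nxn) method.

First, you can list all the maximal empty rectangles. Empty means that it covers only 0s. A maximal empty rectangle is one that cannot be extended in any direction without covering (at least) one 1.

A paper presenting an O(nxn) algorithm for creating such a list, along with source code (not optimized), can be found at www.ulg.ac.be/telecom/rectangles. There is no need to store the list: it is sufficient to call a callback function each time the algorithm finds a rectangle and to store only the largest one (or to select by another criterion if desired).

Note that there is a proof (see the paper) that the number of maximal empty rectangles is bounded by the number of pixels of the image (nxn in this case).

Therefore, selecting the optimal rectangle can be done in O(nxn), and the overall method is also O(nxn).

In practice, this method is very fast and is used for real-time video stream analysis.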
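The definition of a maximal empty rectangle and the callback pattern can be illustrated with a small Python sketch. To be clear, this is a brute-force enumeration written only to show the definition; it is not the O(nxn) algorithm from the paper, and the names maximal_empty_rectangles and keep_largest are hypothetical.

```python
MATRIX = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
]

def maximal_empty_rectangles(mat, callback):
    """Call callback(top, left, bottom, right) for every maximal all-zero rectangle.

    Brute force over all rectangles; for illustration on small inputs only.
    """
    nrows, ncols = len(mat), len(mat[0])

    def empty(top, left, bottom, right):
        # Out-of-bounds rectangles count as not empty, so the matrix
        # border blocks extension just like a 1 would.
        if top < 0 or left < 0 or bottom >= nrows or right >= ncols:
            return False
        return all(mat[r][c] == 0
                   for r in range(top, bottom + 1)
                   for c in range(left, right + 1))

    for top in range(nrows):
        for bottom in range(top, nrows):
            for left in range(ncols):
                for right in range(left, ncols):
                    if not empty(top, left, bottom, right):
                        continue
                    # Maximal means it cannot grow by one row or column in any direction.
                    extendable = (empty(top - 1, left, bottom, right) or
                                  empty(top, left - 1, bottom, right) or
                                  empty(top, left, bottom + 1, right) or
                                  empty(top, left, bottom, right + 1))
                    if not extendable:
                        callback(top, left, bottom, right)

# Keep only the largest rectangle instead of storing the whole list.
best = {"area": 0, "rect": None}

def keep_largest(top, left, bottom, right):
    area = (bottom - top + 1) * (right - left + 1)
    if area > best["area"]:
        best["area"], best["rect"] = area, (top, left, bottom, right)

maximal_empty_rectangles(MATRIX, keep_largest)
print(best)
```

Swapping keep_largest for a different callback (for example, one that prefers squares) changes the selection criterion without touching the enumeration.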

[Discussion]:

[Answer 7]:

Here is a version of jfs's solution that also returns the position of the largest rectangle:

from collections import namedtuple
from operator import mul

Info = namedtuple('Info', 'start height')

def max_rect(mat, value=0):
    """returns (height, width, left_column, bottom_row) of the largest rectangle 
    containing all `value`'s.

    Example:
    [[0, 0, 0, 0, 0, 0, 0, 0, 3, 2],
     [0, 4, 0, 2, 4, 0, 0, 1, 0, 0],
     [1, 0, 1, 0, 0, 0, 3, 0, 0, 4],
     [0, 0, 0, 0, 4, 2, 0, 0, 0, 0],
     [0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
     [4, 3, 0, 0, 1, 2, 0, 0, 0, 0],
     [3, 0, 0, 0, 2, 0, 0, 0, 0, 4],
     [0, 0, 0, 1, 0, 3, 2, 4, 3, 2],
     [0, 3, 0, 0, 0, 2, 0, 1, 0, 0]]
     gives: (3, 4, 6, 5)
    """
    it = iter(mat)
    hist = [(el==value) for el in next(it, [])]
    max_rect = max_rectangle_size(hist) + (0,)
    for irow,row in enumerate(it):
        hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)]
        max_rect = max(max_rect, max_rectangle_size(hist) + (irow+1,), key=area)
        # irow+1, because we already used one row for initializing max_rect
    return max_rect

def max_rectangle_size(histogram):
    stack = []
    top = lambda: stack[-1]
    max_size = (0, 0, 0) # height, width and start position of the largest rectangle
    pos = 0 # current position in the histogram
    for pos, height in enumerate(histogram):
        start = pos # position where rectangle starts
        while True:
            if not stack or height > top().height:
                stack.append(Info(start, height)) # push
            elif stack and height < top().height:
                max_size = max(max_size, (top().height, (pos - top().start), top().start), key=area)
                start, _ = stack.pop()
                continue
            break # height == top().height goes here

    pos += 1
    for start, height in stack:
        max_size = max(max_size, (height, (pos - start), start), key=area)

    return max_size

def area(size):
    return size[0] * size[1]
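The (height, width, left_column, bottom_row) tuple can be converted into the two corner cells the question asks for. The helper below is a hypothetical addition (rect_corners is not part of the answer) and assumes 0-indexed, inclusive coordinates, as in the docstring example above.

```python
def rect_corners(height, width, left_column, bottom_row):
    """Convert (height, width, left_column, bottom_row) into the
    (row, column) pairs of the top-left and bottom-right cells."""
    top_row = bottom_row - height + 1
    right_column = left_column + width - 1
    return (top_row, left_column), (bottom_row, right_column)

# The docstring example: (3, 4, 6, 5) spans rows 3..5 and columns 6..9.
print(rect_corners(3, 4, 6, 5))
# The tuple (3, 4, 1, 4) corresponds to Cell 1:(2, 1) and Cell 2:(4, 4).
print(rect_corners(3, 4, 1, 4))
```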

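[Discussion]:

[Answer 8]:

For completeness, here is a C# version that outputs the rectangle coordinates. It is based on dmarra's answer, but without any other dependencies. It only needs the function bool GetPixel(int x, int y), which returns true when a pixel is set at the coordinates x,y.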

    public struct INTRECT
    {
        public int Left, Right, Top, Bottom;

        public INTRECT(int aLeft, int aTop, int aRight, int aBottom)
        {
            Left = aLeft;
            Top = aTop;
            Right = aRight;
            Bottom = aBottom;
        }

        public int Width { get { return (Right - Left + 1); } }

        public int Height { get { return (Bottom - Top + 1); } }

        public bool IsEmpty { get { return Left == 0 && Right == 0 && Top == 0 && Bottom == 0; } }

        public static bool operator ==(INTRECT lhs, INTRECT rhs)
        {
            return lhs.Left == rhs.Left && lhs.Top == rhs.Top && lhs.Right == rhs.Right && lhs.Bottom == rhs.Bottom;
        }

        public static bool operator !=(INTRECT lhs, INTRECT rhs)
        {
            return !(lhs == rhs);
        }

        public override bool Equals(Object obj)
        {
            return obj is INTRECT && this == (INTRECT)obj;
        }

        public bool Equals(INTRECT obj)
        {
            return this == obj;
        }

        public override int GetHashCode()
        {
            return Left.GetHashCode() ^ Right.GetHashCode() ^ Top.GetHashCode() ^ Bottom.GetHashCode();
        }
    }

    public INTRECT GetMaximumFreeRectangle()
    {
        int XEnd = 0;
        int YStart = 0;
        int MaxRectTop = 0;
        INTRECT MaxRect = new INTRECT();
        // STEP 1:
        // build a seed histogram using the first row of grid points
        // example: [true, true, false, true] = [1,1,0,1]
        int[] hist = new int[Height];
        for (int y = 0; y < Height; y++)
        {
            if (!GetPixel(0, y))
            {
                hist[y] = 1;
            }
        }

        // STEP 2:
        // get a starting max area from the seed histogram we created above.
        // using the example from above, this value would be [1, 1], as the only valid area is a single point.
        // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3.
        // Note that at this step, the height of the found rectangle will always be 1 because we are operating on
        // a single row of data.
        Tuple<int, int> maxSize = MaxRectSize(hist, out YStart);
        int maxArea = (int)(maxSize.Item1 * maxSize.Item2);
        MaxRectTop = YStart;
        // STEP 3:
        // build histograms for each additional row, re-testing for new possible max rectangular areas
        for (int x = 1; x < Width; x++)
        {
            // build a new histogram for this row. the values of this row are
            // 0 if the current grid point is occupied; otherwise, it is 1 + the value
            // of the previously found histogram value for the previous position.
            // What this does is effectively keep track of the height of continuous available spaces.
            // EXAMPLE:
            //      Given the following grid data (where 1 means occupied, and 0 means free; for clarity):
            //          INPUT:        OUTPUT:
            //      1.) [0,0,1,0]   = [1,1,0,1]
            //      2.) [0,0,1,0]   = [2,2,0,2]
            //      3.) [1,1,0,1]   = [0,0,1,0]
            //
            //  As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous
            //  free space.
            for (int y = 0; y < Height; y++)
            {
                if (!GetPixel(x, y))
                {
                    hist[y]++;
                }
                else
                {
                    hist[y] = 0;
                }
            }

            // find the maximum size of the current histogram. If it happens to be larger
            // than the currently recorded max size, then it is the new max size.
            Tuple<int, int> maxSizeTemp = MaxRectSize(hist, out YStart);
            int tempArea = (int)(maxSizeTemp.Item1 * maxSizeTemp.Item2);
            if (tempArea > maxArea)
            {
                maxSize = maxSizeTemp;
                maxArea = tempArea;
                MaxRectTop = YStart;
                XEnd = x;
            }
        }
        MaxRect.Left = XEnd - maxSize.Item1 + 1;
        MaxRect.Top = MaxRectTop;
        MaxRect.Right = XEnd;
        MaxRect.Bottom = MaxRectTop + maxSize.Item2 - 1;

        // at this point, we know the max size
        return MaxRect;
    }

    private Tuple<int, int> MaxRectSize(int[] histogram, out int YStart)
    {
        Tuple<int, int> maxSize = new Tuple<int, int>(0, 0);
        int maxArea = 0;
        Stack<Tuple<int, int>> stack = new Stack<Tuple<int, int>>();
        int x = 0;
        YStart = 0;
        for (x = 0; x < histogram.Length; x++)
        {
            int start = x;
            int height = histogram[x];
            while (true)
            {
                if (stack.Count == 0 || height > stack.Peek().Item2)
                {
                    stack.Push(new Tuple<int, int>(start, height));
                }
                else if (height < stack.Peek().Item2)
                {
                    int tempArea = (int)(stack.Peek().Item2 * (x - stack.Peek().Item1));
                    if (tempArea > maxArea)
                    {
                        YStart = stack.Peek().Item1;
                        maxSize = new Tuple<int, int>(stack.Peek().Item2, (x - stack.Peek().Item1));
                        maxArea = tempArea;
                    }
                    Tuple<int, int> popped = stack.Pop();
                    start = (int)popped.Item1;
                    continue;
                }
                break;
            }
        }

        foreach (Tuple<int, int> data in stack)
        {
            int tempArea = (int)(data.Item2 * (x - data.Item1));
            if (tempArea > maxArea)
            {
                YStart = data.Item1;
                maxSize = new Tuple<int, int>(data.Item2, (x - data.Item1));
                maxArea = tempArea;
            }
        }

        return maxSize;
    }

[Discussion]:
