Python & OpenCV: How to crop half-formed bounding boxes

Posted 2023-03-28


Asked: 2021-07-12 23:24:54

Question:

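I have a script that creates grid lines for a table that has none:

Before the script:

After the script:

Is there a simple way, using OpenCV, to crop the "after" image so that it contains only the four-sided bounding boxes? Example output:

Edit:

I am currently working on a solution that finds the first and last fully black pixel lines in the vertical and horizontal directions. It will work, but I am wondering whether there is something more elegant.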
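The approach described in the edit can be sketched with NumPy alone. This is a minimal sketch, not the asker's actual script: it assumes the grid lines run across the whole image as dark pixels on a light background, and the helper name `crop_to_full_lines` and the threshold of 128 are illustrative choices.

```python
import numpy as np

def crop_to_full_lines(gray, dark_below=128):
    """Crop a grayscale image to the span between its first and last
    fully dark rows and columns (hypothetical helper, not from the post)."""
    dark = gray < dark_below                  # True where a pixel is dark
    rows = np.flatnonzero(dark.all(axis=1))   # rows that are dark end to end
    cols = np.flatnonzero(dark.all(axis=0))   # columns that are dark end to end
    if rows.size == 0 or cols.size == 0:
        raise ValueError("no full-length dark line found")
    # +1 because slice ends are exclusive and the last line should be kept
    return gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Synthetic 20x30 white image with full-length black grid lines
demo = np.full((20, 30), 255, dtype=np.uint8)
demo[[5, 15], :] = 0
demo[:, [4, 12, 25]] = 0
cropped = crop_to_full_lines(demo)
print(cropped.shape)
```

A scan like this is likely to fail if the lines are anti-aliased or stop short of the image edge, which is one reason the contour-based answers may be more robust.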


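Answer 1:

Here is one way to do this in Python/OpenCV: take the minimum and maximum x and y over the bounding rectangles of all contours except the large outer ones (in the code below, any contour covering at least half the image area is skipped).

Input: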

import cv2
import numpy as np

# read image
img = cv2.imread('test_table.png')
hh, ww = img.shape[:2]

# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# threshold
thresh = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)[1]

# crop 1 pixel and add 1 pixel white border to ensure outer white regions not considered small contours
thresh = thresh[1:hh-1, 1:ww-1]
thresh = cv2.copyMakeBorder(thresh, 1,1,1,1, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))

# get contours
contours = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
big_contour = max(contours, key=cv2.contourArea)

# get min and max x and y from the bounding boxes of all contours smaller than half the image area
area_thresh = hh * ww / 2
xmin = ww
ymin = hh
xmax = 0
ymax = 0

for cntr in contours:
    area = cv2.contourArea(cntr)
    if area < area_thresh:
        x,y,w,h = cv2.boundingRect(cntr)
        xmin = x if (x < xmin) else xmin
        ymin = y if (y < ymin) else ymin
        xmax = x+w-1 if (x+w-1 > xmax ) else xmax
        ymax = y+h-1 if (y+h-1 > ymax) else ymax


# draw bounding box     
bbox = img.copy()
cv2.rectangle(bbox, (xmin, ymin), (xmax, ymax), (0, 0, 255), 2)

# crop img at bounding box, padded by 3 pixels to keep the black lines;
# clamp at 0 so a negative start index does not wrap around
result = img[max(ymin-3, 0):ymax+3, max(xmin-3, 0):xmax+3]

# save results
cv2.imwrite('test_table_bbox.png',bbox)
cv2.imwrite('test_table_trimmed.png',result)

# show results
cv2.imshow("thresh", thresh)
cv2.imshow("bbox", bbox)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Bounding box of all the bounding boxes, drawn on the input:

Trimmed image:
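The min/max bookkeeping and the padded crop from this answer can also be written as two small helpers. This is a sketch under stated assumptions: `union_of_rects` and `crop_with_padding` are hypothetical names, the rectangles are hand-written stand-ins for `cv2.boundingRect` output, and the padding here is symmetric, which differs slightly from the slice used in the answer. The clamping matters because NumPy treats a negative slice start as counting from the end of the axis.

```python
import numpy as np

def union_of_rects(rects):
    """Smallest box covering every (x, y, w, h) rectangle.
    Returns inclusive corners (xmin, ymin, xmax, ymax)."""
    r = np.asarray(rects)
    xmin, ymin = r[:, 0].min(), r[:, 1].min()
    xmax = (r[:, 0] + r[:, 2] - 1).max()
    ymax = (r[:, 1] + r[:, 3] - 1).max()
    return int(xmin), int(ymin), int(xmax), int(ymax)

def crop_with_padding(img, box, pad):
    """Crop img to an inclusive box grown by pad pixels, clamped to the image."""
    xmin, ymin, xmax, ymax = box
    h, w = img.shape[:2]
    x0, y0 = max(xmin - pad, 0), max(ymin - pad, 0)
    # +1 converts the inclusive corner into an exclusive slice end
    x1, y1 = min(xmax + pad + 1, w), min(ymax + pad + 1, h)
    return img[y0:y1, x0:x1]

# Hypothetical cell rectangles in the (x, y, w, h) form returned by cv2.boundingRect
cells = [(2, 1, 10, 8), (12, 1, 10, 8), (2, 9, 10, 8), (12, 9, 10, 8)]
box = union_of_rects(cells)
img = np.zeros((20, 40, 3), dtype=np.uint8)
result = crop_with_padding(img, box, pad=3)
print(box, result.shape)
```

In this demo the table sits within 3 pixels of the top and left edges, so the padded box is clipped to the image rather than wrapping around.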


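Answer 2:

Note: I know there is already an accepted answer, but I'd like to offer a simpler version.

Basically, first find the contours of every shape in the image (each cell) whose area is greater than a chosen number, which filters out any noise.

Loop through the contours and find the minimum and maximum x and y coordinates. With these four values, we can save the pixels inside them to a separate array, fill the original image with white, and then paste the table back onto the image.

Code: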

import cv2

img = cv2.imread("table.png")
h, w, _ = img.shape

x1, y1 = w, h
x2, y2 = 0, 0

contours, _ = cv2.findContours(cv2.Canny(img, 0, 0), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    if cv2.contourArea(cnt) > 1000:
        x1 = min(cnt[..., 0].min(), x1)
        y1 = min(cnt[..., 1].min(), y1)
        x2 = max(cnt[..., 0].max(), x2)
        y2 = max(cnt[..., 1].max(), y2)

pad = 2
x1 -= pad
y1 -= pad
x2 += pad * 2
y2 += pad * 2

table = img[y1:y2, x1:x2].copy()

img.fill(255)
img[y1:y2, x1:x2] = table
cv2.imshow("lined_table.png", img)
cv2.waitKey(0)

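Output:

Explanation:

Import the OpenCV module and read in the image. Get the dimensions of the image and define temporary coordinates for the first corner and the last corner of the table: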

import cv2

img = cv2.imread("table.png")
h, w, _ = img.shape

x1, y1 = w, h
x2, y2 = 0, 0

Get the contours of the image and loop through them, filtering out contours whose area is not greater than 1000:

contours, _ = cv2.findContours(cv2.Canny(img, 0, 0), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    if cv2.contourArea(cnt) > 1000:

Update the coordinates of the first and last corners of the table:

        x1 = min(cnt[..., 0].min(), x1)
        y1 = min(cnt[..., 1].min(), y1)
        x2 = max(cnt[..., 0].max(), x2)
        y2 = max(cnt[..., 1].max(), y2)
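The `cnt[..., 0]` indexing relies on the array layout of a contour. As a small illustration, here is a hand-written contour (the point values are made up) in the `(N, 1, 2)` layout that `cv2.findContours` returns in Python, with the same min/max extraction applied to it:

```python
import numpy as np

# A hand-written contour in the (N, 1, 2) layout that cv2.findContours returns:
# N points, each stored as [[x, y]]
cnt = np.array([[[12, 7]], [[40, 7]], [[40, 30]], [[12, 30]]], dtype=np.int32)

xs = cnt[..., 0]   # all x coordinates, shape (N, 1)
ys = cnt[..., 1]   # all y coordinates, shape (N, 1)

x1, y1 = int(xs.min()), int(ys.min())   # top-left corner
x2, y2 = int(xs.max()), int(ys.max())   # bottom-right corner
print((x1, y1), (x2, y2))
```

The ellipsis selects every leading axis, so the last index picks the x or y component of every point regardless of the singleton middle axis.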

Apply padding around the coordinates, based on the width of the lines:

pad = 2
x1 -= pad
y1 -= pad
x2 += pad * 2
y2 += pad * 2

Copy the part of the image bounded by the x and y coordinates into a variable, blank the image, then paste the table back onto it. Finally, display the image:

table = img[y1:y2, x1:x2].copy()

img.fill(255)
img[y1:y2, x1:x2] = table
cv2.imshow("lined_table.png", img)
cv2.waitKey(0)
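The copy, blank and paste step can also be sketched as a function that leaves the input untouched. This is an illustrative variant, not the answer's code: `keep_only_region` is a hypothetical name, and the clamping is an added assumption to cover the case where the padding pushes a coordinate outside the image.

```python
import numpy as np

def keep_only_region(img, x1, y1, x2, y2, fill=255):
    """Return a copy of img that is `fill` everywhere except img[y1:y2, x1:x2]."""
    h, w = img.shape[:2]
    # Clamp so a padded coordinate that fell below 0 does not wrap around
    x1, y1 = max(x1, 0), max(y1, 0)
    x2, y2 = min(x2, w), min(y2, h)
    out = np.full_like(img, fill)
    out[y1:y2, x1:x2] = img[y1:y2, x1:x2]
    return out

demo = np.zeros((10, 12, 3), dtype=np.uint8)   # all-black stand-in image
kept = keep_only_region(demo, -2, 3, 8, 20)    # deliberately out of range
print(kept[0, 0], kept[5, 5])
```

Building the result with `np.full_like` avoids the intermediate `table` copy and keeps the original image available for further processing.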
