Bounding boxes on handwritten digits with OpenCV

Posted 2023-04-17

Originally asked: 2021-08-14 04:38:15

Question:

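I tried the code below to segment each digit in this image, draw a contour around it, and then crop it out, but it gives me poor results and I am not sure what I need to change or work on.

The best idea I can think of right now is to keep only the 4 largest contours in the image, excluding the contour of the image itself.

The code I am using: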

import sys
import numpy as np
import cv2

im = cv2.imread('marks/mark28.png')
im3 = im.copy()

gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.adaptiveThreshold(blur, 255, 1, 1, 11, 2)

#################      Now finding Contours         ###################

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

samples = np.empty((0, 100))
responses = []
keys = [i for i in range(48, 58)]

for cnt in contours:
    if cv2.contourArea(cnt) > 50:
        [x, y, w, h] = cv2.boundingRect(cnt)
    
        if h > 28:
            cv2.rectangle(im, (x, y), (x + w, y + h), (0, 0, 255), 2)
            roi = thresh[y:y + h, x:x + w]
            roismall = cv2.resize(roi, (10, 10))
            cv2.imshow('norm', im)
            key = cv2.waitKey(0)

            if key == 27:  # (escape to quit)
                sys.exit()
            elif key in keys:
                responses.append(int(chr(key)))
                sample = roismall.reshape((1, 100))
                samples = np.append(samples, sample, 0)

# Convert the collected labels once, after the loop has finished:
responses = np.array(responses, np.float32)
responses = responses.reshape((responses.size, 1))
print("training complete")

np.savetxt('generalsamples.data', samples)
np.savetxt('generalresponses.data', responses)

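I may need to change the if condition on the height, but more importantly I need an if condition that gives me the 4 largest contours in the image. Sadly, I have not yet figured out what I am supposed to filter on.

This is the kind of result I get; I am trying to avoid getting those inner contours on the digit "zero".

Unprocessed images, as requested: example 1, example 2

I just need to know what I should filter on; there is no need to write the code. Thank you, community.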

Comments:

Please post the raw, unprocessed images.

@stateMachine I added some raw, clean image examples as requested. I hope that helps.

Answer 1:

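You almost have it. You get multiple bounding rectangles for each digit because you are retrieving every contour (external and internal). You are calling cv2.findContours in RETR_LIST mode, which retrieves all contours but does not create any parent-child relationships. The parent-child relationship is what distinguishes inner (child) contours from outer (parent) contours; OpenCV calls this the "contour hierarchy". See the docs for an overview of all the hierarchy modes. Of particular interest is the RETR_EXTERNAL mode. This mode fetches only the external contours, so you do not get multiple contours and, by extension, multiple bounding boxes for each digit!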

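Also, it seems that your images have a red border. This introduces noise when thresholding the image, and the border may be recognized as the top-level outer contour. In that case, none of the other contours (the children of this parent contour) would be fetched in RETR_EXTERNAL mode. Fortunately, the border position seems to be constant, so we can eliminate it with a simple flood-fill, which essentially fills a blob of a target color with a substitute color.

Let's look at the modified code: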

# Imports:
import cv2
import numpy as np

# Set image path
path = "D://opencvImages//"
fileName = "rhWM3.png"

# Read Input image
inputImage = cv2.imread(path+fileName)

# Deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)

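The first step is to get a binary image containing all the target blobs/contours. This is the result so far:

Note that the border is white. We have to remove it; a simple flood-fill with black, seeded at position (x=0, y=0), is enough: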

# Flood-fill border, seed at (0,0) and use black (0) color:
cv2.floodFill(binaryImage, None, (0, 0), 0)

This is the filled image. The border is gone!

Now we can retrieve the outermost contours in RETR_EXTERNAL mode:

# Get each bounding box
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

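Note that you also get the hierarchy of the contours as the second return value. This is useful if you want to check whether the current contour is a parent or a child. Alright, let's loop through the contours and get their bounding boxes. If you want to ignore contours below a minimum area threshold, you can also implement an area filter: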

# Look for the outer bounding boxes (no children):
for c in contours:

    # Get the bounding rectangle of the current contour:
    boundRect = cv2.boundingRect(c)

    # Get the bounding rectangle data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]

    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight

    # Set a min area threshold
    minArea = 10

    # Filter blobs by area:
    if rectArea > minArea:

        # Draw bounding box:
        color = (0, 255, 0)
        cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
                      (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)
        cv2.imshow("Bounding Boxes", inputImageCopy)

        # Crop bounding box:
        currentCrop = inputImage[rectY:rectY+rectHeight,rectX:rectX+rectWidth]
        cv2.imshow("Current Crop", currentCrop)
        cv2.waitKey(0)

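The last three lines of the snippet above crop and show the current digit. This is the result of the detected bounding boxes for both images (the bounding boxes are drawn in green; the red border is part of the input images).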

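Comments:

Thank you so much!! You have no idea how much you just made my night. I have tested this on different batches of images and it works fine. Seriously, you are a lifesaver.

@MoudhafferBouallegui Glad I could help, man!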
