How to crop white patches in an image and make a passport size photo using OpenCV

Posted: 2021-05-10 14:48:23

Question:

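I am new to OpenCV and I have some images that I need to crop into perfect passport size photos. I have thousands of images that need to be cropped and straightened automatically like this. If an image is too blurry and cannot be cropped, I need to copy it to a rejected folder. I tried using a Haar cascade, but that approach only gives me the face. What I need is the face together with the background of the cropped photo. Can anyone tell me how to code this in OpenCV or anything else?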

            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            faceCascade = cv2.CascadeClassifier(
                cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
            faces = faceCascade.detectMultiScale(
                gray,
                scaleFactor=1.3,
                minNeighbors=3,
                minSize=(30, 30)
            )
            if(len(faces) == 1):
                for (x, y, w, h) in faces:
                    if(x-w < 100 and y-h < 100):
                        ystart = int(y-y*int(y1)/100)
                        xstart = int(x-x*int(x1)/100)
                        yend = int(h+h*int(y1)/100)
                        xend = int(w+w*int(y2)/100)
                        roi_color = img[ystart:y + yend, xstart:x + xend]
                        cv2.imwrite(path, roi_color)

                    else:
                        rejectedCount += 1
                        cv2.imwrite(path, img)

Before:

After:

Answer 1:

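I would approach your problem as follows:

 - First we need to grab the points we are interested in
 - Know the size of a normal passport photo in pixels

How to grab the points of interest.

There are several ways:

 - You can use the Windows Paint application
 - But to be more programmatic we can use cv2. I will show you how to do it with cv2.

Also note that this will not produce a high-resolution image; you will have to play with the code yourself.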

# imports
import numpy as np
import cv2

width = height = 600 # normal passport photo size in pixels

# clicked source points, filled in by the mouse callback
pt1 = []
# destination corners in the order top-left, top-right, bottom-left, bottom-right
pt2 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])

def mouseEvent(event, x, y, flags, param):
    # record each left click until four points have been collected
    if event == cv2.EVENT_LBUTTONDOWN and len(pt1) < 4:
        pt1.append([x, y])

# load the image and register the callback once, outside the loop
image = cv2.imread("img.jpg", cv2.IMREAD_UNCHANGED)
cv2.namedWindow("Original Image")
cv2.setMouseCallback("Original Image", mouseEvent)

while True:
    cv2.imshow("Original Image", image)
    # stop on 'q' or once four points have been clicked
    if cv2.waitKey(1) & 0xFF == ord('q') or len(pt1) == 4:
        break
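
On the 600-pixel size used above: that value would match, for instance, a 2 x 2 inch photo at 300 DPI, and other formats need other values. A minimal sketch of the conversion, assuming the print size and DPI are known. The helper name passport_size_px and the 35 x 45 mm example are illustrative assumptions, not part of the answer:

```python
def passport_size_px(width_mm, height_mm, dpi=300):
    """Convert a physical print size in millimetres to pixels at a given DPI."""
    mm_per_inch = 25.4
    return (round(width_mm / mm_per_inch * dpi),
            round(height_mm / mm_per_inch * dpi))

# 2 x 2 inch (50.8 mm) at 300 DPI, the square size used in the answer
square = passport_size_px(50.8, 50.8)
# hypothetical 35 x 45 mm format at the same DPI
portrait = passport_size_px(35, 45)
print(square, portrait)
```

The resulting pair can be used in place of width and height when building pt2 and when calling warpPerspective.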

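Then we use two cv2 functions, getPerspectiveTransform and warpPerspective. getPerspectiveTransform() takes two sets of points, our pt1 and pt2. We then call warpPerspective() with three positional arguments: the image, the matrix and the output size:

image = cv2.imread("img.jpg", 0)  # reload the image as grayscale
matrix = cv2.getPerspectiveTransform(np.float32(pt1), pt2)
# the third argument is the output size as (width, height), not the input image's shape
image = cv2.warpPerspective(image, matrix, (width, height))
cv2.imshow("Warp Perspective", image)
cv2.waitKey(0)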

I know this is not a great explanation, but you get the idea. The whole program looks like this:


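import numpy as np
import cv2

width = height = 600 # normal passport photo size in pixels

# clicked source points, filled in by the mouse callback
pt1 = []
# destination corners in the order top-left, top-right, bottom-left, bottom-right
pt2 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])

def mouseEvent(event, x, y, flags, param):
    # record each left click until four points have been collected
    if event == cv2.EVENT_LBUTTONDOWN and len(pt1) < 4:
        pt1.append([x, y])

# load the image and register the callback once, outside the loop
image = cv2.imread("img.jpg", cv2.IMREAD_UNCHANGED)
cv2.namedWindow("Original Image")
cv2.setMouseCallback("Original Image", mouseEvent)

while True:
    cv2.imshow("Original Image", image)
    # stop on 'q' or once four points have been clicked
    if cv2.waitKey(1) & 0xFF == ord('q') or len(pt1) == 4:
        break

image = cv2.imread("img.jpg", 0)  # reload the image as grayscale
matrix = cv2.getPerspectiveTransform(np.float32(pt1), pt2)
# the third argument is the output size as (width, height), not the input image's shape
image = cv2.warpPerspective(image, matrix, (width, height))
cv2.imshow("Warp Perspective", image)
cv2.waitKey(0)

When you run the code above, an image is displayed. To use this program you must click four points in order from a to d. For example, if this is your picture:
-------------------
| (a)          (b)|
|                 |
|                 |
|                 |
|                 |
|                 |
| (c)          (d)|
-------------------

where a, b, c and d are the points of interest for the crop.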

Demo

Click 1, then 2, then 3 and finally 4 to get the result above.
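
Because the result depends on the click order, the four points could also be sorted into the a, b, c, d order before the transform is computed. A minimal sketch, assuming the photo is rotated by well under 45 degrees; the helper name order_points is hypothetical and not part of the answer:

```python
import numpy as np

def order_points(pts):
    """Return 4 points ordered top-left, top-right, bottom-left, bottom-right
    (the a, b, c, d order that pt2 expects)."""
    pts = np.asarray(pts, dtype=np.float32)
    s = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]    # y - x: smallest at top-right, largest at bottom-left
    return np.float32([pts[np.argmin(s)], pts[np.argmin(d)],
                       pts[np.argmax(d)], pts[np.argmax(s)]])

# hypothetical clicks made in an arbitrary order
clicks = [[90, 10], [12, 95], [10, 8], [88, 99]]
ordered = order_points(clicks)
print(ordered)
```

With this helper, np.float32(pt1) in the program could be replaced by order_points(pt1).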

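Comments:

 - I have more than 10000 images to crop. I don't think doing it manually with the mouse is a feasible idea. Is there a way to detect the points automatically?
 - You can use AI, or cascadeClassifiers, which is the better way.
 - "Use AI" is a non-answer (equivalent to "use magic"), and cascade classifiers are entirely unsuitable for picking out those corner points because the corners are subject to some rotation. Cascade classifiers are known to fail when things are rotated.

Answer 2: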

Here is one way to extract the photo in Python/OpenCV by keying on the black lines surrounding the image.

Input:

 - Read the input
 - Pad the image with white so that the lines can be extended until intersection
 - Threshold on black to extract the lines
 - Apply morphology close to try to connect the lines somewhat
 - Get the contours and filter on area drawing the contours on a black background
 - Apply morphology close again to fill the line centers
 - Skeletonize to thin the lines
 - Get the Hough lines and draw them as white on a black background
 - Floodfill the center of the rectangle of lines to fill with mid-gray. Then convert that image to binary so that the gray becomes white and all else is black.
 - Get the coordinates of all non-black pixels and then from the coordinates get the rotated rectangle.
 - Use the angle and center of the rotated rectangle to unrotate both the padded image and this mask image via an affine warp
 - (Alternatively, get the four corners of the rotated rectangle from the mask and then project them to the padded input domain using the affine matrix)
 - Get the coordinates of all non-black pixels in the unrotated mask and compute its rotated rectangle.
 - Get the bounding box of the (un-)rotated rectangle 
 - Use those bounds to crop the padded image
 - Save the results

import cv2
import numpy as np
import math
from skimage.morphology import skeletonize

# read image
img = cv2.imread('passport.jpg')
ht, wd = img.shape[:2]

# pad image with white by 20% on all sides
padpct = 20
xpad = int(wd*padpct/100)
ypad = int(ht*padpct/100)
imgpad = cv2.copyMakeBorder(img, ypad, ypad, xpad, xpad, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
ht2, wd2 = imgpad.shape[:2]

# threshold on black
low = (0,0,0)
high = (20,20,20)

# threshold
thresh = cv2.inRange(imgpad, low, high)

# apply morphology to connect the white lines
kernel = np.ones((5,5), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# get contours
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# filter on area
mask = np.zeros((ht2,wd2), dtype=np.uint8)
for cntr in contours:
    area = cv2.contourArea(cntr)
    if area > 20:
        cv2.drawContours(mask, [cntr], 0, 255, 1)

# apply morphology to connect the white lines and divide by 255 to make image in range 0 to 1
kernel = np.ones((5,5), np.uint8)
bmask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)/255

# apply thinning (skeletonizing)
skeleton = skeletonize(bmask)
skeleton = (255*skeleton).clip(0,255).astype(np.uint8)

# get hough lines
line_img = np.zeros_like(imgpad, dtype=np.uint8)
lines= cv2.HoughLines(skeleton, 1, math.pi/180.0, 90, np.array([]), 0, 0)
a,b,c = lines.shape
for i in range(a):
    rho = lines[i][0][0]
    theta = lines[i][0][1]
    a = math.cos(theta)
    b = math.sin(theta)
    x0, y0 = a*rho, b*rho
    pt1 = ( int(x0+1000*(-b)), int(y0+1000*(a)) )
    pt2 = ( int(x0-1000*(-b)), int(y0-1000*(a)) )
    cv2.line(line_img, pt1, pt2, (255, 255, 255), 1)

# floodfill with mid-gray (128)
xcent = int(wd2/2)
ycent = int(ht2/2)
ffmask = np.zeros((ht2+2, wd2+2), np.uint8)
mask2 = line_img.copy()
mask2 = cv2.floodFill(mask2, ffmask, (xcent,ycent), (128,128,128))[1]

# convert mask2 to binary
mask2[mask2 != 128] = 0
mask2[mask2 == 128] = 255
mask2 = mask2[:,:,0]

# get coordinates of all non-zero pixels
# NOTE: must transpose since numpy coords are y,x and opencv uses x,y
coords = np.column_stack(np.where(mask2.transpose() > 0))

# get rotated rectangle from coords
rotrect = cv2.minAreaRect(coords)
(center), (width,height), angle = rotrect
# from https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
    angle = -(90 + angle)
 
# otherwise, just take the inverse of the angle to make
# it positive
else:
    angle = -angle

# compute correction rotation
rotation = -angle - 90

# compute rotation affine matrix
M = cv2.getRotationMatrix2D(center, rotation, scale=1.0)
    
# unrotate imgpad and mask2 using affine warp
rot_img = cv2.warpAffine(imgpad, M, (wd2, ht2), flags=cv2.INTER_CUBIC, borderValue=(0,0,0))
rot_mask2= cv2.warpAffine(mask2, M, (wd2, ht2), flags=cv2.INTER_CUBIC, borderValue=(0,0,0))

# get coordinates of all non-zero pixels
# NOTE: must transpose since numpy coords are y,x and opencv uses x,y
coords2 = np.column_stack(np.where(rot_mask2.transpose() > 0))

# get bounding box
x,y,w,h = cv2.boundingRect(coords2)
print(x,y,w,h)

# crop rot_img
result = rot_img[y:y+h, x:x+w]

# save resulting images
cv2.imwrite('passport_pad.jpg',imgpad)
cv2.imwrite('passport_thresh.jpg',thresh)
cv2.imwrite('passport_morph.jpg',morph)
cv2.imwrite('passport_mask.jpg',mask)
cv2.imwrite('passport_skeleton.jpg',skeleton)
cv2.imwrite('passport_line_img.jpg',line_img)
cv2.imwrite('passport_mask2.jpg',mask2)
cv2.imwrite('passport_rot_img.jpg',rot_img)
cv2.imwrite('passport_rot_mask2.jpg',rot_mask2)
cv2.imwrite('passport_result.jpg',result)

# show thresh and result    
cv2.imshow("imgpad", imgpad)
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("mask", mask)
cv2.imshow("skeleton", skeleton)
cv2.imshow("line_img", line_img)
cv2.imshow("mask2", mask2)
cv2.imshow("rot_img", rot_img)
cv2.imshow("rot_mask2", rot_mask2)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

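Padded image:

Thresholded image:

Morphology-cleaned image:

Mask1 image:

Skeleton image:

(Hough) line image:

Floodfilled line image - Mask2:

Unrotated padded image:

Unrotated Mask2 image:

Cropped image: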

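Comments:

 - Thanks for your answer. I tried this code, but it only works for this image. It does not work for other images. I have added two more images to the question, please check. Let me know what changes I need to make to the code.
 - It is hard to make it work for all images. In all of your images the black lines are not prominent or dark enough. You also have extraneous black lines in one image. Furthermore, the images are warped, so the black lines are not straight and the Hough transform detects multiple lines per side rather than a single one.

Answer 3: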

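If all the photos have a thin white-and-black border around them, you can

 - threshold the picture
 - get all contours and select those contours that
    - have the right gradient (orientation)
    - are large enough
    - reduce to 4 corners when passed through approxPolyDP
 - get an oriented bounding box
 - construct the affine transformation
 - apply the affine transformation

If those photos were not scanned but taken with a camera from an angle (not top-down), you need to use a perspective transformation calculated from the corner points themselves.

If the photos are not flat but warped, that is an entirely different problem.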

import numpy as np
import cv2 as cv

im = cv.imread("Zh8QV.jpg")
gray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)

gray = 255 - gray # invert so findContours' implicit black border doesn't bother us

height, width = gray.shape
minarea = (height * width) * 0.20

# (th_level, thresholded) = cv.threshold(gray, thresh=128, maxval=255, type=cv.THRESH_OTSU)

# threshold relative to estimated brightness of "white"
th_level = 255 - (255 - np.median(gray)) * 0.98
(th_level, thresholded) = cv.threshold(gray, thresh=th_level, maxval=255, type=cv.THRESH_BINARY)

(contours, hierarchy) = cv.findContours(thresholded, mode=cv.RETR_LIST, method=cv.CHAIN_APPROX_SIMPLE)

# black-to-white contours have negative area...
#areas = sorted([cv.contourArea(c, oriented=True) for c in contours])

large_areas = [ c for c in contours if cv.contourArea(c, oriented=True) <= -minarea ]

quads = [
    c for c in large_areas
    if len(cv.approxPolyDP(c, epsilon=0.02 * cv.arcLength(c, True), closed=True)) == 4
]

# if there is no quad, or multiple, that's an error (for this example)
assert len(quads) == 1, quads
[quad] = quads

bbox = cv.minAreaRect(quad)
(bcenter, bsize, bangle) = bbox
bcenter = np.array(bcenter)
bsize = np.array(bsize)

# keep orientation upright, fix up bbox size
(rot90, bangle) = divmod(bangle + 45, 90)
bangle -= 45
if rot90 % 2 != 0:
    bsize = bsize[::-1]

# construct affine transformation
M1 = np.eye(3)
M1[0:2,2] = -bcenter

R = np.eye(3)
R[0:2] = cv.getRotationMatrix2D(center=(0,0), angle=bangle, scale=1.0)

M2 = np.eye(3)
M2[0:2,2] = +bsize * 0.5

M = M2 @ R @ M1

bwidth, bheight = np.ceil(bsize)
dsize = (int(bwidth), int(bheight))

output = cv.warpAffine(im, M[0:2], dsize=dsize, flags=cv.INTER_CUBIC)

cv.imshow("output", output)
cv.waitKey(-1)
cv.destroyWindow("output")

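Comments:

 - Bro, the code is not showing the output image imgur.com/zzQ731c. When I put the whole code in a try-catch block it gave me an error. Can you fix the code?
 - Can you tell me which image you are using?
 - I am using the same image you used.
 - Thanks for pointing out the problem. It is fixed and the answer has been updated.

Answer 4: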

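What I would do is the following 3 steps (I won't write the code for you, sorry; if you need help with one of the stages I will be glad to help):

    Detect the 4 strongest lines in the picture using the Hough transform.

    Compute the 4 intersection points of the lines.

    Apply a perspective transform.

You should then have the desired cropped image.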
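
The second step, intersecting the lines, can be sketched with NumPy alone. This is a minimal sketch, assuming the lines are given as (rho, theta) pairs in the form returned by cv2.HoughLines and have already been split into two near-vertical and two near-horizontal ones; the helper name intersect and the line values are hypothetical:

```python
import numpy as np

def intersect(line1, line2):
    """Intersection of two lines in (rho, theta) form:
    x*cos(theta) + y*sin(theta) = rho."""
    (rho1, theta1), (rho2, theta2) = line1, line2
    A = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    b = np.array([rho1, rho2])
    if abs(np.linalg.det(A)) < 1e-6:
        return None                      # (nearly) parallel lines never meet
    x, y = np.linalg.solve(A, b)
    return float(x), float(y)

# hypothetical lines: two vertical (theta = 0) and two horizontal (theta = pi/2)
vertical = [(10.0, 0.0), (110.0, 0.0)]
horizontal = [(20.0, np.pi / 2), (170.0, np.pi / 2)]

# each vertical line crossed with each horizontal line gives one corner
corners = [intersect(v, h) for h in horizontal for v in vertical]
print(corners)
```

The four corners can then be passed to cv2.getPerspectiveTransform for the third step.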

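Answer 5:

Concept

    Process each image to enhance the edges of the photo.

    Get the 4 corners of the photo in each processed image by first finding the contour with the greatest area, getting its convex hull, and approximating the convex hull until only 4 points are left.

    Warp each image according to the 4 detected corners.

Code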

import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_blur = cv2.GaussianBlur(img_gray, (1, 1), 1)
    img_canny = cv2.Canny(img_blur, 350, 150)
    kernel = np.ones((3, 3))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
    return cv2.erode(img_dilate, kernel, iterations=1)

def get_pts(img):
    contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnt = max(contours, key=cv2.contourArea)
    peri = cv2.arcLength(cnt, True)
    return cv2.approxPolyDP(cv2.convexHull(cnt), 0.04 * peri, True)

files = ["1.jpg", "2.jpg", "3.jpg"]
width, height = 350, 450
pts2 = np.float32([[width, 0], [0, 0], [width, height], [0, height]])

for file in files:
    img = cv2.imread(file)
    pts1 = get_pts(process(img)).squeeze()
    pts1 = np.float32(pts1[np.lexsort(pts1.T)])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    out = cv2.warpPerspective(img, matrix, (width, height))[5:-5, 5:-5]
    cv2.imshow(file, out)

cv2.waitKey(0)
cv2.destroyAllWindows()

Output

I placed the outputs next to each other so that they fit into a single image:

Explanation

    Import the necessary libraries:
import cv2
import numpy as np
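    Define a function, process(), that takes in a BGR image array and returns the image processed with the Canny edge detector, so that the edges of each photo can be detected more accurately later. The values used in the function can be tweaked to better suit other images if needed: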
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_blur = cv2.GaussianBlur(img_gray, (1, 1), 1)
    img_canny = cv2.Canny(img_blur, 350, 150)
    kernel = np.ones((3, 3))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
    return cv2.erode(img_dilate, kernel, iterations=1)
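    Define a function, get_pts(), that takes in a processed image and returns 4 points of the convex hull of the contour with the greatest area. To get 4 points from the convex hull, we use the cv2.approxPolyDP() method: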
def get_pts(img):
    contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnt = max(contours, key=cv2.contourArea)
    peri = cv2.arcLength(cnt, True)
    return cv2.approxPolyDP(cv2.convexHull(cnt), 0.04 * peri, True)
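    Define a list, files, containing the name of each file you want to extract the photo from, and the dimensions you want the resulting images to have, width and height: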
files = ["1.jpg", "2.jpg", "3.jpg"]
width, height = 350, 450
    Using the dimensions defined above, define a matrix of the target positions that the 4 soon-to-be-detected coordinates will be mapped to:
pts2 = np.float32([[width, 0], [0, 0], [width, height], [0, height]])
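    Loop through each filename, read each file into a BGR image array, get the 4 points of the photo within the image, use the cv2.getPerspectiveTransform() method to get the solution matrix for the warp, and finally warp the photo portion of the image with that matrix using the cv2.warpPerspective() method: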
for file in files:
    img = cv2.imread(file)
    pts1 = get_pts(process(img)).squeeze()
    pts1 = np.float32(pts1[np.lexsort(pts1.T)])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    out = cv2.warpPerspective(img, matrix, (width, height))[5:-5, 5:-5]
    cv2.imshow(file, out)
    Finally, wait for a key press and then destroy all the windows:
cv2.waitKey(0)
cv2.destroyAllWindows()
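
A note on the np.lexsort(pts1.T) line: lexsort treats the last key as the primary one, so the points end up sorted by y, with x only breaking ties. The first sorted point is therefore whichever corner is highest in the image, and pts2 pairs it with [width, 0] (top-right). A small sketch with hypothetical corner values, in which the highest corner happens to be the top-left one, so that pairing would mirror the result and pts2 would need reordering for such an image:

```python
import numpy as np

# hypothetical corner points (x, y), as get_pts(...).squeeze() might return them
pts = np.array([[30, 150], [160, 50], [40, 30], [150, 170]])

# pts.T is [xs, ys]; lexsort uses the LAST key (ys) as the primary sort key
ordered = pts[np.lexsort(pts.T)]
print(ordered)
```

Which of the two cases applies depends on the direction in which each photo is tilted, so checking the order per image is safer than assuming it.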
