How to crop white patches in an image and make a passport size photo using OpenCV
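【Posted】: 2021-05-10 14:48:23
【Question】: I am new to OpenCV, and I have some images that need to be cropped into proper passport size photos. I have thousands of images that need to be cropped and straightened automatically like this. If an image is too blurry to crop, I need to copy it to a rejected folder. I tried a Haar cascade, but that approach only gives me the face. What I need is the face along with the cropped background of the photo. Can anyone tell me how to do this in OpenCV, or with any other code?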
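import cv2

# NOTE: img, path, rejectedCount and the percentage margins x1, y1, y2
# are not shown in the question; they come from the rest of the asker's script.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faceCascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = faceCascade.detectMultiScale(
    gray,
    scaleFactor=1.3,
    minNeighbors=3,
    minSize=(30, 30)
)
if len(faces) == 1:
    for (x, y, w, h) in faces:
        if x - w < 100 and y - h < 100:
            # grow the detected face box by the percentage margins
            ystart = int(y - y * int(y1) / 100)
            xstart = int(x - x * int(x1) / 100)
            yend = int(h + h * int(y1) / 100)
            xend = int(w + w * int(y2) / 100)
            roi_color = img[ystart:y + yend, xstart:x + xend]
            cv2.imwrite(path, roi_color)
else:
    # zero or several faces: count the image as rejected
    rejectedCount += 1
    cv2.imwrite(path, img)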
Before:
After:
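【Answer 1】: I would approach your problem as follows:
- First, we need to grab the points we are interested in.
- Then we need to know the size of a normal passport head shot in pixels.
How to grab the points of interest. There are several ways:
- You can use the Windows Paint application.
- To be more programmatic, we can use cv2. I will show you how to do it with cv2.
Also note that this will not produce a high-resolution image; you will have to play with the code yourself.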
# imports
import numpy as np
import cv2

width = height = 600  # normal passport photo size in pixels
# global variable that is updated with the points we click on the image
pt1 = []
pt2 = np.float32([[0, 0], [height, 0], [0, width], [height, width]])

def mouseEvent(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        global pt1
        if len(pt1) == 4:
            pt1 = []
        else:
            pt1.append([x, y])

while 1:
    image = cv2.imread("img.jpg", cv2.IMREAD_UNCHANGED)
    cv2.imshow("Original Image", image)
    cv2.setMouseCallback("Original Image", mouseEvent)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    if len(pt1) == 4:
        break
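The 600-pixel figure used for width and height would correspond, for example, to a 2 x 2 inch print at 300 DPI. A small sketch of how such a pixel size can be derived; the 300 DPI value, the 35 x 45 mm format and the helper names mm_to_px and inch_to_px are assumptions for illustration, not something stated in the answer:

```python
def mm_to_px(mm, dpi=300):
    """Convert a physical length in millimetres to pixels at the given DPI."""
    return round(mm / 25.4 * dpi)


def inch_to_px(inches, dpi=300):
    """Convert a physical length in inches to pixels at the given DPI."""
    return round(inches * dpi)


# Assumed formats: a 2 x 2 inch photo and a 35 x 45 mm photo, both at 300 DPI.
size_2x2_inch = (inch_to_px(2), inch_to_px(2))
size_35x45_mm = (mm_to_px(35), mm_to_px(45))
print(size_2x2_inch, size_35x45_mm)
```

Check the requirements of the authority you are producing the photos for before fixing width and height.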
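Then we use two cv2 functions, getPerspectiveTransform and warpPerspective. getPerspectiveTransform() takes two sets of points, our pt1 and pt2. We then call warpPerspective() and pass it three positional arguments: the image, the matrix, and the output image size: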
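image = cv2.imread("img.jpg", 0)  # 0 = read as grayscale
matrix = cv2.getPerspectiveTransform(np.float32(pt1), pt2)
# dsize is (width, height) of the output; image.shape would be (rows, cols)
image = cv2.warpPerspective(image, matrix, (width, height))
cv2.imshow("Warp Perspective", image)
cv2.waitKey(0)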
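To make it clearer what the matrix is, here is a minimal NumPy sketch that solves for a 3x3 homography from four point pairs and checks that it maps each source corner onto its target. This is an illustration of the underlying linear system, not OpenCV's own code; the function names and the source coordinates are made up for the example:

```python
import numpy as np


def perspective_matrix(src, dst):
    """Solve for the 3x3 homography H (with H[2, 2] = 1) mapping 4 src points to 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), and likewise for v
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, point):
    """Map one (x, y) point through H, including the perspective divide."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])


# Hypothetical clicked corners, in the same order as pt2: (a), (b), (c), (d).
src = [(120, 80), (520, 110), (90, 560), (540, 600)]
dst = [(0, 0), (600, 0), (0, 600), (600, 600)]

H = perspective_matrix(src, dst)
mapped = np.array([apply_homography(H, p) for p in src])
print(np.round(mapped, 3))
```

Because the four target points span a 600 x 600 square, the natural output size for the warp is (600, 600), given as (width, height).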
I know this is not a great explanation, but you get the idea. The whole program looks like this:
import numpy as np
import cv2

width = height = 600  # normal passport photo size in pixels
pt1 = []
pt2 = np.float32([[0, 0], [height, 0], [0, width], [height, width]])

def mouseEvent(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        global pt1
        if len(pt1) == 4:
            pt1 = []
        else:
            pt1.append([x, y])

while 1:
    image = cv2.imread("img.jpg", cv2.IMREAD_UNCHANGED)
    cv2.imshow("Original Image", image)
    cv2.setMouseCallback("Original Image", mouseEvent)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    if len(pt1) == 4:
        break

image = cv2.imread("img.jpg", 0)  # 0 = read as grayscale
matrix = cv2.getPerspectiveTransform(np.float32(pt1), pt2)
# dsize is (width, height) of the output; image.shape would be (rows, cols)
image = cv2.warpPerspective(image, matrix, (width, height))
cv2.imshow("Warp Perspective", image)
cv2.waitKey(0)
When you run this code, an image is displayed. To use the program, you must click four points in order, from A to D. For example, if this is your picture:
------------------
| (a) (b)|
| |
| |
| |
| |
| |
| (c) (d)|
-------------------
where a, b, c and d are the points of interest on the image that you want to crop.
Demo: click point 1, then 2, then 3, and finally 4 to get the result above.
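The click order matters because each clicked point is paired with the entry at the same position in pt2. As a hedge against clicking in the wrong order, the points can be sorted first. A minimal sketch, assuming the photo is rotated by well under 45 degrees; order_corners and the sample coordinates are hypothetical:

```python
import numpy as np


def order_corners(points):
    """Return four points ordered top-left, top-right, bottom-left, bottom-right."""
    pts = np.asarray(points, dtype=np.float32)
    s = pts.sum(axis=1)           # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]     # y - x: smallest at top-right, largest at bottom-left
    top_left = pts[np.argmin(s)]
    bottom_right = pts[np.argmax(s)]
    top_right = pts[np.argmin(d)]
    bottom_left = pts[np.argmax(d)]
    return np.array([top_left, top_right, bottom_left, bottom_right], dtype=np.float32)


# Four clicks made in an arbitrary order.
clicked = [(540, 600), (120, 80), (90, 560), (520, 110)]
ordered = order_corners(clicked)
print(ordered)
```

The returned order matches the layout of pt2 above (a, b, c, d), so the result could be passed to getPerspectiveTransform in place of np.float32(pt1).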
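【Comments】:
I have more than 10,000 images to crop. I don't think doing it manually with the mouse is a feasible idea. Is there a way to detect the points automatically?
You can use AI, or cascadeClassifiers, which is the better approach.
"Use AI" is a non-answer (equivalent to "use magic"), and cascade classifiers are entirely unsuitable for picking out those corner points, because the corners will be rotated to some degree. Cascade classifiers are well known to fail when things are rotated.
【Answer 2】: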
Here is one way in Python/OpenCV to extract the photo by keying on the black lines surrounding the image.
Input:
- Read the input
- Pad the image with white so that the lines can be extended until intersection
- Threshold on black to extract the lines
- Apply morphology close to try to connect the lines somewhat
- Get the contours and filter on area drawing the contours on a black background
- Apply morphology close again to fill the line centers
- Skeletonize to thin the lines
- Get the Hough lines and draw them as white on a black background
- Floodfill the center of the rectangle of lines to fill with mid-gray. Then convert that image to binary so that the gray becomes white and all else is black.
- Get the coordinates of all non-black pixels and then from the coordinates get the rotated rectangle.
- Use the angle and center of the rotated rectangle to unrotate both the padded image and this mask image via an Affine warp
- (Alternately, get the four corners of the rotated rectangle from the mask and then project that to the padded input domain using the affine matrix)
- Get the coordinates of all non-black pixels in the unrotated mask and compute its rotated rectangle.
- Get the bounding box of the (un-)rotated rectangle
- Use those bounds to crop the padded image
- Save the results
import cv2
import numpy as np
import math
from skimage.morphology import skeletonize
# read image
img = cv2.imread('passport.jpg')
ht, wd = img.shape[:2]
# pad image with white by 20% on all sides
padpct = 20
xpad = int(wd*padpct/100)
ypad = int(ht*padpct/100)
imgpad = cv2.copyMakeBorder(img, ypad, ypad, xpad, xpad, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
ht2, wd2 = imgpad.shape[:2]
# threshold on black
low = (0,0,0)
high = (20,20,20)
# threshold
thresh = cv2.inRange(imgpad, low, high)
# apply morphology to connect the white lines
kernel = np.ones((5,5), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# get contours
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
# filter on area
mask = np.zeros((ht2,wd2), dtype=np.uint8)
for cntr in contours:
    area = cv2.contourArea(cntr)
    if area > 20:
        cv2.drawContours(mask, [cntr], 0, 255, 1)
# apply morphology to connect the white lines and divide by 255 to make image in range 0 to 1
kernel = np.ones((5,5), np.uint8)
bmask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)/255
# apply thinning (skeletonizing)
skeleton = skeletonize(bmask)
skeleton = (255*skeleton).clip(0,255).astype(np.uint8)
# get hough lines
line_img = np.zeros_like(imgpad, dtype=np.uint8)
lines= cv2.HoughLines(skeleton, 1, math.pi/180.0, 90, np.array([]), 0, 0)
a,b,c = lines.shape
for i in range(a):
    rho = lines[i][0][0]
    theta = lines[i][0][1]
    a = math.cos(theta)
    b = math.sin(theta)
    x0, y0 = a*rho, b*rho
    pt1 = ( int(x0+1000*(-b)), int(y0+1000*(a)) )
    pt2 = ( int(x0-1000*(-b)), int(y0-1000*(a)) )
    cv2.line(line_img, pt1, pt2, (255, 255, 255), 1)
# floodfill with mid-gray (128)
xcent = int(wd2/2)
ycent = int(ht2/2)
ffmask = np.zeros((ht2+2, wd2+2), np.uint8)
mask2 = line_img.copy()
mask2 = cv2.floodFill(mask2, ffmask, (xcent,ycent), (128,128,128))[1]
# convert mask2 to binary
mask2[mask2 != 128] = 0
mask2[mask2 == 128] = 255
mask2 = mask2[:,:,0]
# get coordinates of all non-zero pixels
# NOTE: must transpose since numpy coords are y,x and opencv uses x,y
coords = np.column_stack(np.where(mask2.transpose() > 0))
# get rotated rectangle from coords
rotrect = cv2.minAreaRect(coords)
(center), (width,height), angle = rotrect
# from https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
    angle = -(90 + angle)
# otherwise, just take the inverse of the angle to make
# it positive
else:
    angle = -angle
# compute correction rotation
rotation = -angle - 90
# compute rotation affine matrix
M = cv2.getRotationMatrix2D(center, rotation, scale=1.0)
# unrotate imgpad and mask2 using affine warp
rot_img = cv2.warpAffine(imgpad, M, (wd2, ht2), flags=cv2.INTER_CUBIC, borderValue=(0,0,0))
rot_mask2= cv2.warpAffine(mask2, M, (wd2, ht2), flags=cv2.INTER_CUBIC, borderValue=(0,0,0))
# get coordinates of all non-zero pixels
# NOTE: must transpose since numpy coords are y,x and opencv uses x,y
coords2 = np.column_stack(np.where(rot_mask2.transpose() > 0))
# get bounding box
x,y,w,h = cv2.boundingRect(coords2)
print(x,y,w,h)
# crop rot_img
result = rot_img[y:y+h, x:x+w]
# save resulting images
cv2.imwrite('passport_pad.jpg',imgpad)
cv2.imwrite('passport_thresh.jpg',thresh)
cv2.imwrite('passport_morph.jpg',morph)
cv2.imwrite('passport_mask.jpg',mask)
cv2.imwrite('passport_skeleton.jpg',skeleton)
cv2.imwrite('passport_line_img.jpg',line_img)
cv2.imwrite('passport_mask2.jpg',mask2)
cv2.imwrite('passport_rot_img.jpg',rot_img)
cv2.imwrite('passport_rot_mask2.jpg',rot_mask2)
cv2.imwrite('passport_result.jpg',result)
# show thresh and result
cv2.imshow("imgpad", imgpad)
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("mask", mask)
cv2.imshow("skeleton", skeleton)
cv2.imshow("line_img", line_img)
cv2.imshow("mask2", mask2)
cv2.imshow("rot_img", rot_img)
cv2.imshow("rot_mask2", rot_mask2)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
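Padded image:
Thresholded image:
Morphology-cleaned image:
Mask 1 image:
Skeleton image:
(Hough) line image:
Floodfilled line image - Mask2:
Unrotated padded image:
Unrotated Mask2 image:
Cropped image: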
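【Comments】:
Thanks for your answer. I tried this code, but it only works for this image. It does not work for other images. I have added two more images to the question, please check. Let me know what changes I need to make in the code.
It is hard to make it work for all images. In all of your images the black lines are not prominent or dark enough. In one image you also have extra black lines. In addition, the images are distorted, so the black lines are not straight, and the Hough transform therefore detects several lines per side rather than a single one.
【Answer 3】: If all the photos have a thin white-and-black border around them, you can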
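- threshold the picture
- get all contours and
- select those contours that
  - have the right gradient
  - are large enough
  - reduce to 4 corners when passed through approxPolyDP
- get an oriented bounding box
- construct an affine transformation
- apply the affine transformation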
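The "right gradient" check relies on a signed contour area: the same outline gets a positive or negative area depending on the direction in which its points are listed. A plain-Python sketch of that idea using the shoelace formula; it does not reproduce OpenCV's sign convention, so which sign belongs to which kind of edge should be checked against cv.contourArea with oriented=True on a real image:

```python
def signed_area(polygon):
    """Shoelace formula; the sign flips when the vertex order is reversed."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
forward = signed_area(square)
backward = signed_area(square[::-1])
print(forward, backward)
```

Filtering on the sign as well as the magnitude keeps only outlines traced in one direction, which is how a border that goes from one brightness to the other can be told apart from its opposite.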
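If the photos were not scanned but taken with a camera at an angle (not top-down), you need to use a perspective transform computed from the corner points themselves.
If the photos are not flat but warped, that is an entirely different problem.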
import numpy as np
import cv2 as cv
im = cv.imread("Zh8QV.jpg")
gray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)
gray = 255 - gray # invert so findContours' implicit black border doesn't bother us
height, width = gray.shape
minarea = (height * width) * 0.20
# (th_level, thresholded) = cv.threshold(gray, thresh=128, maxval=255, type=cv.THRESH_OTSU)
# threshold relative to estimated brightness of "white"
th_level = 255 - (255 - np.median(gray)) * 0.98
(th_level, thresholded) = cv.threshold(gray, thresh=th_level, maxval=255, type=cv.THRESH_BINARY)
(contours, hierarchy) = cv.findContours(thresholded, mode=cv.RETR_LIST, method=cv.CHAIN_APPROX_SIMPLE)
# black-to-white contours have negative area...
#areas = sorted([cv.contourArea(c, oriented=True) for c in contours])
large_areas = [ c for c in contours if cv.contourArea(c, oriented=True) <= -minarea ]
quads = [
    c for c in large_areas
    if len(cv.approxPolyDP(c, epsilon=0.02 * cv.arcLength(c, True), closed=True)) == 4
]
# if there is no quad, or multiple, that's an error (for this example)
assert len(quads) == 1, quads
[quad] = quads
bbox = cv.minAreaRect(quad)
(bcenter, bsize, bangle) = bbox
bcenter = np.array(bcenter)
bsize = np.array(bsize)
# keep orientation upright, fix up bbox size
(rot90, bangle) = divmod(bangle + 45, 90)
bangle -= 45
if rot90 % 2 != 0:
    bsize = bsize[::-1]
# construct affine transformation
M1 = np.eye(3)
M1[0:2,2] = -bcenter
R = np.eye(3)
R[0:2] = cv.getRotationMatrix2D(center=(0,0), angle=bangle, scale=1.0)
M2 = np.eye(3)
M2[0:2,2] = +bsize * 0.5
M = M2 @ R @ M1
bwidth, bheight = np.ceil(bsize)
dsize = (int(bwidth), int(bheight))
output = cv.warpAffine(im, M[0:2], dsize=dsize, flags=cv.INTER_CUBIC)
cv.imshow("output", output)
cv.waitKey(-1)
cv.destroyWindow("output")
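The two least obvious steps in the code above are the angle fix-up and the matrix product M2 @ R @ M1. A NumPy-only sketch of both, under stated assumptions: the rotation block is written out by hand in the layout I understand cv.getRotationMatrix2D to use for center (0, 0) and scale 1, and the function names and sample box values are hypothetical:

```python
import numpy as np


def upright(angle, size):
    """Fold a box angle into [-45, 45) and swap width/height on odd quarter turns."""
    rot90, rest = divmod(angle + 45, 90)
    rest -= 45
    if rot90 % 2 != 0:
        size = size[::-1]
    return rest, size


def crop_matrix(center, size, angle_deg):
    """Compose: move box center to origin, rotate, move origin to the output center."""
    a = np.deg2rad(angle_deg)
    M1 = np.eye(3)
    M1[0:2, 2] = -np.asarray(center, dtype=float)
    R = np.eye(3)
    R[0:2, 0:2] = [[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]]
    M2 = np.eye(3)
    M2[0:2, 2] = np.asarray(size, dtype=float) * 0.5
    return M2 @ R @ M1


# Hypothetical box: nearly a quarter turn, so width and height get swapped.
angle, size = upright(88.0, (300.0, 400.0))
M = crop_matrix((250.0, 320.0), size, angle)
center_out = M @ np.array([250.0, 320.0, 1.0])
print(angle, size, center_out)
```

The matrices are applied right to left, so the box center always lands in the middle of the output, whatever the rotation angle is.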
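【Comments】:
Bro, the code doesn't show an output image: imgur.com/zzQ731c. When I put the whole code in a try-catch block, it gives me an error. Can you fix the code?
Can you tell me which image you used?
I used the same image that you used.
Thanks for pointing out the problem. It is fixed, and the answer has been updated.
【Answer 4】: What I would do is the following 3 steps (sorry, I won't write the code for you, but if you need help with one of the stages I'm happy to help):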
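- Use the Hough transform to detect the 4 strongest lines in the picture.
- Compute the 4 intersection points of the lines.
- Apply a perspective transform.
You should then have the desired cropped image.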
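For the second step, each Hough line comes as a (rho, theta) pair satisfying x*cos(theta) + y*sin(theta) = rho, so the crossing point of two lines is the solution of a 2x2 linear system. A minimal sketch, assuming lines in the form returned by cv2.HoughLines; the intersect helper and the sample lines are made up for illustration:

```python
import numpy as np


def intersect(line_a, line_b):
    """Intersection of two lines given as (rho, theta); None if they are parallel."""
    (rho1, theta1), (rho2, theta2) = line_a, line_b
    A = np.array([
        [np.cos(theta1), np.sin(theta1)],
        [np.cos(theta2), np.sin(theta2)],
    ])
    if abs(np.linalg.det(A)) < 1e-9:
        return None  # (nearly) parallel lines have no usable intersection
    x, y = np.linalg.solve(A, np.array([rho1, rho2]))
    return float(x), float(y)


vertical = (10.0, 0.0)            # the line x = 10
horizontal = (20.0, np.pi / 2)    # the line y = 20
corner = intersect(vertical, horizontal)
parallel = intersect(vertical, (50.0, 0.0))
print(corner, parallel)
```

Intersecting each of the two roughly vertical lines with each of the two roughly horizontal ones yields the 4 corners for the perspective transform.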
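【Answer 5】: The Concept
- Process each image to enhance the edges of the photo.
- Get the 4 corners of the photo in each processed image by first finding the contour with the largest area, taking its convex hull, and approximating the convex hull until only 4 points remain.
- Warp each image according to the 4 detected corners.
The Code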
import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_blur = cv2.GaussianBlur(img_gray, (1, 1), 1)
    img_canny = cv2.Canny(img_blur, 350, 150)
    kernel = np.ones((3, 3))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
    return cv2.erode(img_dilate, kernel, iterations=1)

def get_pts(img):
    contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnt = max(contours, key=cv2.contourArea)
    peri = cv2.arcLength(cnt, True)
    return cv2.approxPolyDP(cv2.convexHull(cnt), 0.04 * peri, True)

files = ["1.jpg", "2.jpg", "3.jpg"]
width, height = 350, 450
pts2 = np.float32([[width, 0], [0, 0], [width, height], [0, height]])

for file in files:
    img = cv2.imread(file)
    pts1 = get_pts(process(img)).squeeze()
    pts1 = np.float32(pts1[np.lexsort(pts1.T)])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    out = cv2.warpPerspective(img, matrix, (width, height))[5:-5, 5:-5]
    cv2.imshow(file, out)

cv2.waitKey(0)
cv2.destroyAllWindows()
The Output
I placed the outputs next to each other to fit them in one image:
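The code above handles a fixed list of three files, while the question involves thousands of images plus a rejected folder. A standard-library sketch of a batch driver that could wrap any of the cropping approaches in this thread; run_batch, the folder names and the fake_crop stand-in are all hypothetical, and a real crop function would raise (or otherwise signal) when no photo outline is found:

```python
import shutil
import tempfile
from pathlib import Path


def run_batch(src_dir, out_dir, rejected_dir, crop_one):
    """Call crop_one(src, dst) for every .jpg; copy inputs that fail into rejected_dir."""
    out_dir, rejected_dir = Path(out_dir), Path(rejected_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rejected_dir.mkdir(parents=True, exist_ok=True)
    rejected = []
    for src in sorted(Path(src_dir).glob("*.jpg")):
        try:
            crop_one(src, out_dir / src.name)
        except Exception:
            shutil.copy2(src, rejected_dir / src.name)
            rejected.append(src.name)
    return rejected


def fake_crop(src, dst):
    """Stand-in for a real crop: fails for any file whose name starts with 'bad'."""
    if src.name.startswith("bad"):
        raise ValueError("no single photo outline found")
    shutil.copy2(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "in").mkdir()
    for name in ("1.jpg", "2.jpg", "bad3.jpg"):
        (root / "in" / name).write_bytes(b"placeholder")
    rejected = run_batch(root / "in", root / "out", root / "rejected", fake_crop)
    accepted = sorted(p.name for p in (root / "out").iterdir())

print(accepted, rejected)
```

Catching every exception is a deliberate simplification here; in practice it is probably worth logging the reason for each rejection.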
The Explanation
- Import the necessary libraries:
import cv2
import numpy as np
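- Define a function, process(), that takes a BGR image array and returns the image processed with the Canny edge detector, so that the edges of each photo can be detected more accurately later. The values used in the function can be tweaked to suit other images if needed: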
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_blur = cv2.GaussianBlur(img_gray, (1, 1), 1)
    img_canny = cv2.Canny(img_blur, 350, 150)
    kernel = np.ones((3, 3))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
    return cv2.erode(img_dilate, kernel, iterations=1)
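- Define a function, get_pts(), that takes a processed image and returns 4 points of the convex hull of the contour with the largest area. To reduce the convex hull to 4 points, we use the cv2.approxPolyDP() method: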
def get_pts(img):
    contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnt = max(contours, key=cv2.contourArea)
    peri = cv2.arcLength(cnt, True)
    return cv2.approxPolyDP(cv2.convexHull(cnt), 0.04 * peri, True)
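- Define a list, files, containing the name of each file you want to extract the photo from, and the dimensions you want the resulting images to have, width and height: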
files = ["1.jpg", "2.jpg", "3.jpg"]
width, height = 350, 450
- Using the dimensions defined above, define a matrix of target points for each of the 4 soon-to-be-detected coordinates to map to:
pts2 = np.float32([[width, 0], [0, 0], [width, height], [0, height]])
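- Loop over each filename, read each file into a BGR image array, get the 4 points of the photo within the image, use the cv2.getPerspectiveTransform() method to get the solution matrix for the warp, and finally warp the photo portion of the image with that matrix using the cv2.warpPerspective() method: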
for file in files:
    img = cv2.imread(file)
    pts1 = get_pts(process(img)).squeeze()
    pts1 = np.float32(pts1[np.lexsort(pts1.T)])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    out = cv2.warpPerspective(img, matrix, (width, height))[5:-5, 5:-5]
    cv2.imshow(file, out)
- Finally, wait for a key press and then destroy all the windows:
cv2.waitKey(0)
cv2.destroyAllWindows()
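One detail worth checking when adapting this to other images is the np.lexsort(pts1.T) call, since it decides which detected corner is paired with which entry of pts2. A small sketch of what it does on made-up corner coordinates: the last key (the y row) is the primary sort key, and x only breaks ties.

```python
import numpy as np

# Hypothetical detected corners as (x, y): top-left, top-right, bottom-left, bottom-right.
pts = np.array([[10, 5], [100, 8], [12, 120], [98, 118]])

# pts.T has the x values in row 0 and the y values in row 1;
# lexsort treats the LAST row as the primary key, so this sorts by y, then by x.
order = np.lexsort(pts.T)
sorted_pts = pts[order]
print(order)
print(sorted_pts)
```

Because the result depends on which corner happens to have the smallest y, a slightly different tilt can change the pairing with pts2, so it is worth inspecting the sorted points on a few of your own images.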