OpenCV code snippet running slower inside Python multiprocessing process

Posted 2023-02-16


Asked: 2021-01-30 13:59:33

Question:

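I am running some tests with multiprocessing to parallelize face detection and recognition, and I ran into strange behavior: detectMultiScale() (which performs the face detection) runs slower inside a child process than in the parent process, where the function is simply called directly.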

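So I wrote the code below, which puts 10 images into a queue and then runs face detection on them sequentially using one of two approaches: calling the detection function directly, or running it inside a single new process. The execution time is printed for each detectMultiScale() call. Running this code gives an average of 0.22 seconds per call with the first approach and 0.54 seconds per call with the second. The total time to process the 10 images is also longer with the second approach.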

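I don't know why the same code snippet runs slower in a new process. I would understand it if only the total time were longer (given the overhead of setting up a new process), but I don't understand why each individual call is slower. For the record, I am running this on a Raspberry Pi 3B+.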

import cv2
import multiprocessing
from time import time, sleep

def detect(face_cascade, img_queue, bnd_queue):
    while True:
        image = img_queue.get()
        if image is not None:
            # Frames are converted to RGB before being queued, so convert from RGB here.
            gray_image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
            ti = time()
            ########################################
            faces = face_cascade.detectMultiScale(
                                gray_image,
                                scaleFactor=1.1,
                                minNeighbors=3,
                                minSize=(130, 130))
            ########################################
            tf = time()
            print('det time: ' + str(tf-ti))
                            
            if len(faces) > 0:
                max_bounds = (0,0,0,0)
                max_size = 0
                for (x,y,w,h) in faces:
                     if w*h > max_size:
                         max_size = w*h
                         max_bounds = (x,y,w,h)
            img_queue.task_done()
            bnd_queue.put('bound')
        else:
            img_queue.task_done()
            break


face_cascade = cv2.CascadeClassifier('../lbpcascade_frontalface_improved.xml')
cam = cv2.VideoCapture(0)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 2592)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 1944)
cam.set(cv2.CAP_PROP_BUFFERSIZE, 1)

img_queue = multiprocessing.JoinableQueue()

i = 0
while i < 10:
    is_there_frame, image = cam.read()
    if is_there_frame:
        image = image[0:1944, 864:1728]
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        img_queue.put(image)
        i += 1

bnd_queue = multiprocessing.JoinableQueue()
num_process = 1

ti = time()
# MULTIPROCESSING PROCESS APPROACH
for _ in range(num_process):
    p = multiprocessing.Process(target=detect, args=(face_cascade, img_queue, bnd_queue))
    p.start()

for _ in range(num_process):
    img_queue.put(None)
#     
# FUNCTION CALL APPROACH
#img_queue.put(None)
#while not img_queue.empty():
#    detect(face_cascade, img_queue, bnd_queue)

img_queue.join()
tf = time()

print('TOTAL TIME: ' + str(tf-ti))

while not bnd_queue.empty():
    bound = bnd_queue.get()
    if bound != 'bound':
        print('ERROR')
    bnd_queue.task_done()
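
To check whether the per-call slowdown comes from multiprocessing itself or is specific to the OpenCV call, the same comparison can be reproduced with a pure-Python workload. The following is a minimal sketch, not the asker's code: `busy_work`, `timed_calls`, and `compare` are hypothetical names, `busy_work` is only a CPU-bound stand-in for detectMultiScale(), and the "fork" start method is assumed because the question was run on Linux.

```python
import multiprocessing
from time import perf_counter


def busy_work(n):
    """Hypothetical CPU-bound stand-in for detectMultiScale()."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_calls(n, repeats):
    """Call busy_work() `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = perf_counter()
        busy_work(n)
        durations.append(perf_counter() - start)
    return durations


def _child(n, repeats, out_queue):
    # Runs inside the child process; only the timings travel back to the parent.
    out_queue.put(timed_calls(n, repeats))


def compare(n=200_000, repeats=5):
    """Return (parent_durations, child_durations) for the same workload."""
    # Assumption: the "fork" start method, as on the Linux system in the question
    # (it is not available on Windows).
    ctx = multiprocessing.get_context("fork")
    parent_times = timed_calls(n, repeats)
    out_queue = ctx.Queue()
    proc = ctx.Process(target=_child, args=(n, repeats, out_queue))
    proc.start()
    child_times = out_queue.get()  # read before join() so the child can flush its queue
    proc.join()
    return parent_times, child_times


if __name__ == "__main__":
    parent_times, child_times = compare()
    print("parent avg:", sum(parent_times) / len(parent_times))
    print("child avg: ", sum(child_times) / len(child_times))
```

If the stand-in takes about the same time in both processes while detectMultiScale() does not, the gap is more likely tied to how OpenCV behaves inside the child process than to process startup or queue handling.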


Answer 1:

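I ran into the same problem. I think the cause is that the task is, to some extent, I/O-bound, and that multiprocessing itself adds overhead. You can also read this article: https://www.pyimagesearch.com/2019/09/09/multiprocessing-with-opencv-and-python/ The problem you describe with the detectMultiScale() method is exactly the one I had. I also tried serialization, making the variable global, and making it a class-level variable, but nothing helped.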
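
One part of the multiprocessing overhead mentioned above can be estimated in isolation: objects placed on a multiprocessing queue are pickled by the sender and unpickled by the receiver. The sketch below is only an approximation under stated assumptions. It uses a plain `bytes` buffer with the same size as the cropped 1944 x 864 three-channel frame from the question instead of a real image, and `make_frame` and `time_round_trip` are hypothetical helper names.

```python
import pickle
from time import perf_counter

# Size of the cropped frame in the question: image[0:1944, 864:1728], 3 channels.
HEIGHT, WIDTH, CHANNELS = 1944, 864, 3


def make_frame():
    """Return a zero-filled buffer as large as one cropped 8-bit frame."""
    return bytes(HEIGHT * WIDTH * CHANNELS)


def time_round_trip(frame, repeats=10):
    """Pickle and unpickle `frame` `repeats` times; return (last copy, average seconds)."""
    start = perf_counter()
    for _ in range(repeats):
        restored = pickle.loads(pickle.dumps(frame, protocol=pickle.HIGHEST_PROTOCOL))
    elapsed = perf_counter() - start
    return restored, elapsed / repeats


if __name__ == "__main__":
    frame = make_frame()
    _, avg_seconds = time_round_trip(frame)
    print("frame size in bytes:", len(frame))
    print("average pickle round trip (s):", avg_seconds)
```

This cost applies to moving frames between processes, so it would affect the total time. It does not by itself account for the per-call time printed inside detect(), because that timer starts only after the image has already been taken from the queue.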
