Stacking Numpy arrays of different-sized images

Posted 2023-02-22


[Title]: Stacking Numpy arrays of different-sized images [Posted]: 2021-04-26 20:34:11 [Question]:

I'm using OpenCV to create an array of images for analysis in TensorFlow.

I've created the following function:

def files_to_img_array(path, files_list):
    '''
    Reads a list of image files and creates a Numpy array.
    '''
    # Instantiate arrays
    files = [path+file for file in files_list]
    img_array = np.zeros(72000000) # for flattened 4000x6000 images
    image_names = []

    for file in tqdm.tqdm(files_list):
        full_file = path+file
        image_names.append(file.split('.')[0])
        img = cv2.imread(full_file, 1)
        print(img.shape)
        img = img.flatten()
        
        img_array = np.vstack([img_array, img])
    img_array = img_array[1:] # remove instantiating zeroes
    return img_array

The problem is that the images are not all the same size:

 0%|                                     | 0/10 [00:00<?, ?it/s](4000, 6000, 3)
 10%|███████▊                    | 1/10 [00:00<00:03,  2.64it/s](4000, 6000, 3)
 20%|███████████████▌            | 2/10 [00:00<00:03,  2.51it/s](2848, 4288, 3)
 20%|███████████████▌            | 2/10 [00:00<00:03,  2.18it/s]
Traceback (most recent call last):
...
ValueError: all the input array dimensions for the concatenation axis
must match exactly, but along dimension 1, the array at index 0 has
size 72000000 and the array at index 1 has size 36636672

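I'm really not sure how to approach this, from either a programming or an image-processing perspective. Does anyone have suggestions on how to pad these different-sized images, or is there something in OpenCV that can handle this? (I'm also happy to use PIL; I'm not married to OpenCV.)

[Comments]: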

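I don't know of any simple solution. What I have done in the past is check all the sizes and find the maximum width and the maximum height. Then pad the images with black or transparency so that they all have the same dimensions. You can use cv2.copyMakeBorder to pad images.

cv2 should have a resize method. I would think resizing the images is better than padding, but your TensorFlow docs (or those of another ML library) should discuss this kind of issue.

@fmw42 I thought of doing that as well. The problem is that it would throw off histogram analysis, so I can only imagine how it would wreak havoc on a classifier...

@Yehuda OK. But you should have said that in your original question. From the machine learning papers I have read, people typically resize images to a common size before passing them to the ML algorithm.

[Answer 1]: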

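Here is how to stack images of any size vertically, using transparent padding, in Python/OpenCV.

Input images: lena.jpg, barn.jpg, monet2.jpg (not reproduced here).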

import cv2
import numpy as np

# load images
img1 = cv2.imread("lena.jpg")
w1 = img1.shape[1]

img2 = cv2.imread("barn.jpg")
w2 = img2.shape[1]

img3 = cv2.imread("monet2.jpg")
w3 = img3.shape[1]

# get maximum width
ww = max(w1, w2, w3)

# pad images with transparency in width
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2BGRA)
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2BGRA)
img3 = cv2.cvtColor(img3, cv2.COLOR_BGR2BGRA)
img1 = cv2.copyMakeBorder(img1, 0, 0, 0, ww-w1, borderType=cv2.BORDER_CONSTANT, value=(0,0,0,0))
img2 = cv2.copyMakeBorder(img2, 0, 0, 0, ww-w2, borderType=cv2.BORDER_CONSTANT, value=(0,0,0,0))
img3 = cv2.copyMakeBorder(img3, 0, 0, 0, ww-w3, borderType=cv2.BORDER_CONSTANT, value=(0,0,0,0))

# stack images vertically
result = cv2.vconcat([img1, img2, img3])

# write result to disk
cv2.imwrite("image_stack.png", result)

cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result: the stacked image is written to image_stack.png.

[Comments]:
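
The same padding idea extends to the Numpy arrays returned by cv2.imread() when both width and height differ. The following is a minimal sketch and not part of the original answer: it zero-pads each array (black) on the bottom and right up to the largest height and width, then stacks the results into one array of shape (N, H, W, C). The helper name pad_and_stack is hypothetical, and np.pad is used here in place of cv2.copyMakeBorder so that the sketch depends only on Numpy.

```python
import numpy as np

def pad_and_stack(images):
    """Zero-pad HxWxC arrays to a common size and stack them into (N, H, W, C)."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    padded = []
    for img in images:
        pad_h = max_h - img.shape[0]
        pad_w = max_w - img.shape[1]
        # Pad the bottom and right edges only; leave the channel axis untouched.
        padded.append(np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant"))
    return np.stack(padded)

# Synthetic stand-ins for cv2.imread() output (small sizes to keep memory low).
a = np.ones((40, 60, 3), dtype=np.uint8)
b = np.ones((28, 42, 3), dtype=np.uint8)
batch = pad_and_stack([a, b])
print(batch.shape)  # (2, 40, 60, 3)
```

As raised in the comments on the question, padding changes the pixel histogram of the smaller images, so whether this suits a classifier is a separate decision.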

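Thanks for the answer. However, I'm specifically trying to stack the Numpy array output of cv2.imread().

I don't see why that would be any different. You put the images in a list. You can loop over your list after reading them all, rather than trying to stack inline. Or you can stack each new image onto the previous stack after checking the widths of the new image and of the previous stack. But I know very little about TensorFlow and its requirements.

Cool! So basically, you make all the images the same size by padding?

[Answer 2]: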

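Thanks to @hpaulj for the comment on the question that led to my investigation and to this answer.

The following code relies on Keras and the underlying PIL: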

import numpy as np
import PIL
import tensorflow
from tensorflow.keras.preprocessing.image import load_img, img_to_array
import concurrent.futures

def keras_pipeline(file):
    TARGET_SIZE = (100,150)
    img = load_img(file, target_size=TARGET_SIZE)
    img_array = img_to_array(img)
    return img_array
 
def files_to_array(path, files_list):
    files = [path+file for file in files_list]
    with concurrent.futures.ProcessPoolExecutor() as executor:
        img_map = executor.map(keras_pipeline, files)
    return img_map

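keras_pipeline() creates a transformation pipeline for each image. files_to_array() maps that pipeline over every image and returns a generator. The images from that generator can then be combined into a single Numpy array using np.hstack():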

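img_map = files_to_array(path, files_list)
existing_array = next(img_map)  # start from the first image
for img in img_map:
    existing_array = np.hstack([existing_array, img])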
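
Note that np.hstack() joins the images side by side along the width axis rather than adding a batch axis. The sketch below is an addition and not part of the original answer. It uses synthetic arrays with the shape that img_to_array() returns for TARGET_SIZE = (100, 150) to compare np.hstack() with np.stack(). The assumption here is that the downstream TensorFlow model expects the (N, height, width, channels) layout that np.stack() produces.

```python
import numpy as np

# Three synthetic images shaped like img_to_array() output for a 100x150 RGB image.
imgs = [np.zeros((100, 150, 3), dtype=np.float32) for _ in range(3)]

side_by_side = np.hstack(imgs)  # concatenates along axis 1 (width)
batch = np.stack(imgs)          # adds a new leading batch axis

print(side_by_side.shape)  # (100, 450, 3)
print(batch.shape)         # (3, 100, 150, 3)
```

With the generator from files_to_array(), the equivalent call would be np.stack(list(img_map)).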

[Comments]:
