Python: make bigger gaps between words in an image

Posted 2023-04-17

Asked: 2021-11-20 09:31:18

Question:

I have the following image:

from PIL import Image
img = Image.open("without_space.png")
img.show()

I would like to increase the gaps between the words so that it looks like this:

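I thought about converting the image to a NumPy array:

import numpy as np
img = np.asarray(img)

and then enlarging the array along the x and y axes to make room for bigger gaps: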

def increase_padding(img):
    """Add a 20-pixel opaque black border around an RGBA image."""
    np_arr = np.asarray(img)

    # pad left and right: 20 extra columns with the alpha channel set to opaque
    y, _, colors = np_arr.shape
    zeros = np.zeros([y, 20, colors], dtype=np.uint8)
    zeros[:, :, 3] = 255
    np_arr = np.append(np_arr, zeros, axis=1)
    np_arr = np.append(zeros, np_arr, axis=1)

    # pad top and bottom, using the new (wider) shape
    _, x, colors = np_arr.shape
    zeros = np.zeros([20, x, colors], dtype=np.uint8)
    zeros[:, :, 3] = 255
    np_arr = np.append(np_arr, zeros, axis=0)
    np_arr = np.append(zeros, np_arr, axis=0)

    return np_arr

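This is the result:

# increase_padding returns a NumPy array, so convert it back before showing it
img = Image.fromarray(increase_padding(img))
img.show()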

The image now has more room for spreading the words apart, but this is where I am stuck. Any ideas?

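Comments:

You need some way to identify the words and then insert space between them, rather than at the left/right and top/bottom of the image. In general, I think finding words in an image is not an easy task, but in this example it looks like some simple rules might work (especially if the image is pure black and white, i.e. the color values are 0 or 255 and nothing else).

There is a NumPy function for padding: np.pad.

A solution to your problem has to identify the letters and words in the image. That is a complex program, not something that can be answered as a question here.

Detecting the bounding boxes of the text is actually easier than I thought.

Maybe you could say what your actual purpose is? There may be a better way. For example, do you know the text in the image?

I don't know the text in advance. But all the text is in the format I showed above: black on white.

Answer 1: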

Your padding approach is not ideal; here is my version:

import cv2
import numpy as np

ROI_number = 0
factor = 40
decrement = 20
margin = 3

#sorting code source
#https://gist.github.com/divyaprabha123/bfa1e44ebdfc6b578fd9715818f07aec
def sort_contours(cnts, method="left-to-right"):
    '''
    sort_contours : Function to sort contours
    argument:
        cnts (array): image contours
        method(string) : sorting direction
    output:
        cnts(list): sorted contours
        boundingBoxes(list): bounding boxes
    '''
    # initialize the reverse flag and sort index
    reverse = False
    i = 0

    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True

    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1

    # construct the list of bounding boxes and sort them along the
    # chosen axis (x for left/right, y for top/bottom)
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
        key=lambda b:b[1][i], reverse=reverse))

    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)

image = cv2.imread("test.png")

#use a black container of same shape to construct new image with gaps
container = np.zeros(image.shape, np.uint8)

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9, 9), 0)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 25)

# Dilate to combine adjacent text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
dilate = cv2.dilate(thresh, kernel, iterations=4)

# Find contours, highlight text areas, and extract ROIs
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

#sort so that order remain preserved
cnts = sort_contours(cnts)[0]

for c in cnts:
    ROI_number += 1
    area = cv2.contourArea(c)
    print(area)
    x, y, w, h = cv2.boundingRect(c)
    
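    # pad the box by `margin`, clamping at 0 so that a word touching the
    # top or left border does not produce a negative slice index
    x = max(x - margin, 0)
    y = max(y - margin, 0)
    w += margin
    h += margin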

    #extract region of interest e.g. the word
    roi = image[y : y + h, x : x + w].copy()
    factor -= decrement
        
    x = x - factor

    #copy the words from the original image to container image with gap factor
    container[y : y + h, x : x + w] = roi

cv2.imshow('image', container)
cv2.waitKey()

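The output is shown below. I assume that for other images you will have to tune this code so that it finds the best threshold automatically.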

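What I did is the following:

1. Extract the contours using a threshold.
2. Sort the contours from left to right to get the words in the correct order.
3. Create an empty container (a new image of the same size as the original).
4. Copy every word from the original image into the new container, with padding.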
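The last two steps (creating a container and copying the words into it with padding) can also be sketched with a container that is wider than the original, so that words shifted to the right are not cut off. This is only an illustration under assumptions: the bounding boxes are taken as already known `(x, y, w, h)` tuples sorted from left to right and lying inside the image (for instance from `cv2.boundingRect`), the background is white, and the helper name `spread_words` and its `gap` parameter are made up for this example.

```python
import numpy as np

def spread_words(image, boxes, gap, background=255):
    """Copy each word box onto a wider canvas, shifting word k right by k * gap.

    `boxes` holds (x, y, w, h) tuples sorted from left to right (assumed input).
    """
    height, width = image.shape[:2]
    # every gap between two neighbouring words grows by `gap` pixels
    extra = gap * max(len(boxes) - 1, 0)
    canvas = np.full((height, width + extra) + image.shape[2:], background,
                     dtype=image.dtype)
    for k, (x, y, w, h) in enumerate(boxes):
        shift = k * gap
        canvas[y:y + h, x + shift:x + shift + w] = image[y:y + h, x:x + w]
    return canvas

# Hypothetical 4x10 white image with two black "words".
image = np.full((4, 10), 255, dtype=np.uint8)
image[1:3, 1:3] = 0
image[1:3, 5:8] = 0
boxes = [(1, 1, 2, 2), (5, 1, 3, 2)]

out = spread_words(image, boxes, gap=4)
print(out.shape)
```

Using a white canvas instead of a black one keeps the result in the same black-on-white format as the input.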

Answer 2:

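To move words around in a bitmap, you need to determine the bounding boxes that correspond to those regions.

The horizontal extent of these bounding boxes can be identified from the horizontal spacing.

Your first step is to collapse the image onto the horizontal axis by taking the maximum of every column (this marks all the columns that contain at least one text pixel, provided that text pixels are nonzero on a zero background; for black text on white, invert the image first).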

horizontal = np_arr.max(axis=0)
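As a small illustration of this step, the sketch below assumes a grayscale image with black text (0) on a white background (255), so the array is inverted before the column maximum is taken. The tiny `page` array is made up for the example.

```python
import numpy as np

# Hypothetical 3x8 grayscale image: white background (255), black ink (0).
page = np.full((3, 8), 255, dtype=np.uint8)
page[1, 1:3] = 0     # first "word" occupies columns 1-2
page[0:2, 5:7] = 0   # second "word" occupies columns 5-6

# Invert so that ink is nonzero, then take the maximum of every column.
ink = 255 - page
horizontal = ink.max(axis=0)

# A column is marked when it contains at least one ink pixel.
has_ink = horizontal > 0
print(has_ink.astype(int))
```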

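Then you need to find the runs of 0s in that array that are at least a given length. These are the margins and the spaces between words. (The threshold has to be high enough to skip the gaps between letters.)

Finally, the parts between these 0-runs are the regions that contain the words.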
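Put together, the approach might look like the sketch below. It assumes a 2-D grayscale array with dark text on a pure white (255) background; `word_spans`, `widen_gaps`, `min_gap` and `extra` are hypothetical names introduced here, and `min_gap` has to be chosen by hand so that it is larger than the widest gap between letters.

```python
import numpy as np

def word_spans(has_ink, min_gap):
    """Return (start, stop) column ranges of words.

    Ink columns belong to the same word unless they are separated by a
    run of at least `min_gap` empty columns.
    """
    cols = np.flatnonzero(has_ink)
    if cols.size == 0:
        return []
    # a jump of more than `min_gap` between consecutive ink columns means
    # at least `min_gap` empty columns lie in between
    breaks = np.flatnonzero(np.diff(cols) > min_gap)
    starts = np.concatenate(([cols[0]], cols[breaks + 1]))
    stops = np.concatenate((cols[breaks], [cols[-1]])) + 1
    return [(int(a), int(b)) for a, b in zip(starts, stops)]

def widen_gaps(gray, min_gap, extra):
    """Insert `extra` white columns in front of every word except the first."""
    has_ink = (255 - gray).max(axis=0) > 0
    spans = word_spans(has_ink, min_gap)
    if len(spans) < 2:
        return gray.copy()
    filler = np.full((gray.shape[0], extra), 255, dtype=gray.dtype)
    pieces, prev = [], 0
    for start, _stop in spans[1:]:
        pieces.append(gray[:, prev:start])  # previous word plus its old gap
        pieces.append(filler)               # the additional spacing
        prev = start
    pieces.append(gray[:, prev:])
    return np.concatenate(pieces, axis=1)

# Hypothetical 2x12 image: three "words" separated by two empty columns;
# the single empty column inside the first word is a letter gap.
page = np.full((2, 12), 255, dtype=np.uint8)
page[:, [0, 2]] = 0
page[:, 5:7] = 0
page[:, 9:11] = 0

wide = widen_gaps(page, min_gap=2, extra=3)
print(wide.shape)
```

Because the image is cut and re-joined along whole columns, this only handles a single line of text; several lines would first have to be separated with the same projection along the other axis.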
