Reduce the image to the text contents using scikit image

Posted 2023-02-21


【Title】: Reduce the image to the text contents using scikit image 【Asked】: 2018-01-06 03:02:48 【Question】:

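This is the image I want to extract text from.

How can I remove the black borders and reduce the image to just the "50"?

The approach I took:

I tried using corner detectors (`corner_peaks` and `corner_harris`) and picked the first 2 coordinates from the left and the last 2 coordinates from the right. Using these 4 coordinates I cropped the image, then shrank it by a further 5 on every side.

This is certainly not an efficient way to do it. I also looked at some segmentation approaches but could not get them to work properly. I am using scikit-image to solve this problem.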


【Answer 1】:

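Using corner points may not work, because corners can also show up inside the characters.

Here is my attempt using Hough lines, as described below:

1) Erode the image first to minimize the gap between the lines and the characters

2) Detect the lines with the Hough line detection algorithm and delete them

3) Dilate the image to get clear characters

4) The characters and the lines are now separated, so what is left of the lines can be removed by finding connected components and filtering them by size.

Here is the same procedure implemented in Python with OpenCV: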

import cv2
import numpy as np

# Load as grayscale; the raw string keeps the Windows backslashes literal
img = cv2.imread(r'D:\Image\st1.png', 0)
# Inverted binary image: dark pixels (borders and text) become white foreground
ret, thresh = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)

# erode the foreground (step 1) so that the Hough lines are detected correctly
kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(thresh, kernel, iterations=1)

# find canny edge image
canny = cv2.Canny(erosion, 100, 200)

# only accept line segments at least a quarter of the image width long
minLineLength = img.shape[1] / 4
lines = cv2.HoughLinesP(image=canny, rho=0.02, theta=np.pi/500, threshold=10, lines=np.array([]), minLineLength=minLineLength, maxLineGap=10)

# delete the lines by painting over them in black
# (HoughLinesP returns None when no line is found)
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = map(int, line[0])
        cv2.line(erosion, (x1, y1), (x2, y2), 0, 3, cv2.LINE_AA)

# dilate the image (step 3) to restore the character strokes
erosion = cv2.dilate(erosion, kernel, iterations=1)

# find connected components
connectivity = 4
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(erosion, connectivity, cv2.CV_32S)
# the last stats column is the component area; label 0 is the background, so skip it
sizes = stats[1:, -1]
nb_components = nb_components - 1
min_size = 250  # minimum area (in pixels) for a component to be kept
img2 = np.zeros(output.shape, np.uint8)
for i in range(nb_components):
    if sizes[i] >= min_size:
        img2[output == i + 1] = 255  # keep large components; smaller leftovers are dropped

# invert back to dark text on a white background
img = cv2.bitwise_not(img2)

Output image:
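The cleaned image still has the original canvas size. To actually shrink it to the text contents, one option is to crop to the bounding box of the remaining dark pixels. The following is a minimal NumPy sketch and not part of the original answer; `crop_to_content`, its `background` and `margin` parameters, and the synthetic `demo` array are names assumed here for illustration.

```python
import numpy as np

def crop_to_content(img, background=255, margin=0):
    """Crop a grayscale image to the bounding box of its non-background pixels.

    Hypothetical helper: assumes a 2-D array whose background is one uniform value.
    """
    mask = img != background
    if not mask.any():
        return img  # only background present; nothing to crop
    rows = np.flatnonzero(mask.any(axis=1))  # rows that contain content
    cols = np.flatnonzero(mask.any(axis=0))  # columns that contain content
    # Expand the box by `margin` pixels, clamped to the image bounds
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + 1 + margin, img.shape[0])
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, img.shape[1])
    return img[top:bottom, left:right]

# Synthetic stand-in for the cleaned image: white canvas with a dark block of "text"
demo = np.full((40, 60), 255, np.uint8)
demo[10:20, 15:45] = 0
cropped = crop_to_content(demo, margin=2)
print(cropped.shape)
```

With the real result, the call would be `crop_to_content(img)` on the final inverted image. This only works well if the size filter has removed every fragment of the border lines, since any leftover dark pixel near the edge enlarges the bounding box.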
