How to draw bounding boxes around words and save the crops to a folder with OpenCV in Python

Posted: 2021-11-03 06:24:43

Question:

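I am following the https://github.com/mindee/doctr GitHub repository to detect text. I have converted the word coordinates to absolute coordinates in [xmin, ymin, xmax, ymax] form. I want to use these values to draw bounding boxes and save the cropped word images to a folder. How can I do that?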

import json
from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True)
# Load the image
doc = DocumentFile.from_images("/content/passbook_64_0.jpeg")
# Analyze
result = model(doc)
# Export results in json
with open("/content/preds.json", "w") as f:
    json.dump(result.export(), f)

export = result.export()
# Flatten the export
page_words = [[word for block in page['blocks'] for line in block['lines'] for word in line['words']] for page in export['pages']]
page_dims = [page['dimensions'] for page in export['pages']]
# Get the coords in [xmin, ymin, xmax, ymax]
words_abs_coords = [
    [[int(round(word['geometry'][0][0] * dims[0])), int(round(word['geometry'][0][1] * dims[1])), int(round(word['geometry'][1][0] * dims[0])), int(round(word['geometry'][1][1] * dims[1]))] for word in words]
    for words, dims in zip(page_words, page_dims)
]
print(words_abs_coords)

Absolute coordinates produced by the code above:

[[[33, 108, 57, 135], [54, 107, 81, 136], [189, 110, 205, 141], [205, 112, 221, 141], [222, 114, 230, 141], [230, 112, 247, 141], [11, 173, 39, 196], [41, 175, 68, 196], [71, 175, 87, 198], [90, 177, 116, 198], [215, 179, 256, 199], [26, 204, 35, 225], [10, 203, 25, 227], [89, 204, 131, 228], [214, 207, 256, 227], [54, 228, 57, 236], [11, 225, 38, 246], [41, 224, 53, 247], [90, 225, 129, 245], [11, 244, 42, 265], [45, 245, 64, 267], [82, 246, 102, 267], [67, 246, 79, 268], [104, 247, 127, 268], [13, 301, 87, 324], [90, 303, 113, 323], [12, 327, 60, 349], [63, 331, 69, 347], [84, 331, 125, 349], [70, 328, 80, 351], [214, 334, 259, 356], [61, 360, 108, 378], [41, 357, 59, 382], [130, 360, 160, 381], [111, 359, 128, 382], [214, 362, 282, 386], [41, 388, 62, 411], [63, 388, 84, 411], [85, 388, 106, 411], [108, 387, 131, 410], [213, 392, 237, 415], [239, 393, 276, 418], [11, 415, 34, 439], [213, 419, 230, 444], [231, 419, 241, 444], [244, 422, 286, 447], [11, 443, 34, 467], [208, 441, 252, 477], [259, 451, 287, 476], [11, 474, 34, 497], [52, 471, 80, 496], [38, 470, 51, 498], [215, 478, 274, 501], [10, 501, 30, 525], [207, 505, 267, 531], [49, 531, 123, 555], [11, 536, 27, 559], [29, 534, 46, 562], [204, 536, 233, 560], [234, 538, 259, 562]]]
import matplotlib.pyplot as plt
import cv2
image = cv2.imread("/content/passbook_82_0.jpeg")
im_height, im_width, _ = image.shape
xmin=words_abs_coords[0][0][0]
ymin=words_abs_coords[0][0][1]
xmax=words_abs_coords[0][0][2]
ymax=words_abs_coords[0][0][3]
image1 = cv2.rectangle(image, (xmin,ymin), (xmax,ymax), (0,255,0), 2)
plt.imshow(image1)

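Comments:

You already know how to draw rectangles, so you only need to find the question that asks how to crop a sub-region from an image. You should search for answers. Show some effort.

Does this answer your question? How to crop an image in OpenCV using Python

@ChristophRackwitz But in my case the coordinates are [xmin, ymin, xmax, ymax]. I drew the rectangle, but it gives no output.

"But"? You haven't shown how you drew anything. Are you waiting for someone else to do your work for you?

@ChristophRackwitz Please check the question now.

Answer 1: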

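For anyone who finds this thread, I believe the answer has already been given in a dedicated GitHub discussion: https://github.com/mindee/doctr/discussions/570

I think the only part of the snippet that needs to change is: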

words_abs_coords = [
    [[int(round(word['geometry'][0][0] * dims[0])), int(round(word['geometry'][0][1] * dims[1])), int(round(word['geometry'][1][0] * dims[0])), int(round(word['geometry'][1][1] * dims[1]))] for word in words]
    for words, dims in zip(page_words, page_dims)
]

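The page dimensions are used in the wrong order: as pointed out in the discussion, the x values must be scaled by dims[1] and the y values by dims[0]. Changing it to: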

words_abs_coords = [
    [[int(round(word['geometry'][0][0] * dims[1])), int(round(word['geometry'][0][1] * dims[0])), int(round(word['geometry'][1][0] * dims[1])), int(round(word['geometry'][1][1] * dims[0]))] for word in words]
    for words, dims in zip(page_words, page_dims)
]

should solve your problem :)
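To make the axis order explicit, the same conversion can be written as a small helper. This is a minimal sketch, assuming the exported `dimensions` entry is (height, width), as the fix above implies, and that each word's `geometry` is ((xmin, ymin), (xmax, ymax)) in relative coordinates. `to_abs_box` is a hypothetical name and not part of docTR, and the example values are made up.

```python
def to_abs_box(geometry, dims):
    """Convert a relative ((xmin, ymin), (xmax, ymax)) box to absolute pixels.

    dims is assumed to be (height, width).
    """
    height, width = dims
    (xmin, ymin), (xmax, ymax) = geometry
    # x values scale with the width, y values with the height.
    return [
        int(round(xmin * width)),
        int(round(ymin * height)),
        int(round(xmax * width)),
        int(round(ymax * height)),
    ]


# Hypothetical example values, not taken from the question's image.
box = to_abs_box(((0.1, 0.2), (0.5, 0.4)), dims=(600, 300))
print(box)
```

Unpacking `dims` into named variables avoids the index mix-up that caused the original problem.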

Cheers!
