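Tesseract, openCV, python: how to get bounding box for a sentence or same line of text?
【Posted】: 2021-12-05 09:27:10
【Question】: I want to run text recognition on an image. I can recognize the text and get the corresponding bounding boxes, but only word by word; I would like to do the same for a whole line of text. In the code below, I noticed that when I print the bounding-box coordinates, the value of b['top'] is similar for words that are on the same line. I don't know whether I can use that, but I would like one bounding box per line of text, together with the corresponding sentence.
Here is the code I wrote: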
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2
import pytesseract
from pytesseract import Output
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('./images/page_2.jpg') # load img
img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) #transform colored img to grayscale
plt.imshow(img)
boxes = pytesseract.image_to_data(img, output_type=Output.DICT) #transform image to dict
boxes = pd.DataFrame(boxes) #dict to dataframe
boxes['text'] = boxes['text'].replace('', np.nan) #replace empty values by NaN (assigned back; in-place replace on a column selection is unreliable in recent pandas)
boxes= boxes.dropna(subset = ['text']) #delete rows with NaN
print(boxes)
for index, b in boxes.iterrows():
    (x,y,w,h) = b['left'],b['top'],b['width'],b['height']
    print((x,y,w,h), b['text'])
    cv2.rectangle(img,(x,y),(w+x,h+y), (0,0,255),1)
cv2.imshow('result',img)
cv2.waitKey(0)
Output of print(boxes):
level page_num block_num par_num line_num word_num left top \
4 5 1 1 1 1 1 32 24
5 5 1 1 1 1 2 100 24
6 5 1 1 1 1 3 191 28
7 5 1 1 1 1 4 227 28
8 5 1 1 1 1 5 257 24
.. ... ... ... ... ... ... ... ...
154 5 1 1 11 1 7 261 457
155 5 1 1 11 1 8 320 461
156 5 1 1 11 1 9 351 457
157 5 1 1 11 1 10 376 457
158 5 1 1 11 1 11 468 457
width height conf text
4 60 17 93.283920 Maitre
5 82 19 93.204414 corbeau,
6 29 13 96.932060 sur
7 22 12 96.932060 un
8 50 17 93.306122 arbre
.. ... ... ... ...
154 51 21 79.999794 qu'on
155 23 13 90.411606 ne
156 18 21 21.623993 I'y
157 85 21 90.583260 prendrait
158 44 21 96.933327 plus.
Output of (x,y,w,h) and b['text'] (bounding boxes with their text):
(32, 24, 60, 17) Maitre
(100, 24, 82, 19) corbeau,
(191, 28, 29, 13) sur
(227, 28, 22, 12) un
(257, 24, 50, 17) arbre
(315, 24, 70, 21) perché,
(79, 49, 58, 17) Tenait
(144, 53, 23, 13) en
(174, 53, 34, 13) son
(216, 50, 33, 16) bec
(257, 53, 22, 13) un
(287, 49, 84, 22) fromage.
(32, 75, 60, 17) Maitre
(100, 75, 61, 17) renard
(169, 79, 31, 17) par
(206, 75, 64, 17) I'odeur
(277, 75, 68, 17) alléché
(353, 88, 3, 6) ,
(81, 101, 27, 16) Lui
(115, 101, 28, 16) tint
(151, 100, 11, 17) 4
(169, 104, 34, 17) peu
(211, 100, 42, 21) prés
(260, 104, 21, 13) ce
(289, 101, 76, 20) langage
(374, 105, 3, 12) :
(81, 126, 31, 16) «Et
(119, 126, 72, 21) bonjour
(199, 126, 88, 17) Monsieur
(294, 126, 22, 16) du
(324, 125, 87, 18) Corbeau.
(31, 151, 40, 17) Que
(78, 155, 46, 13) vous
(131, 151, 40, 17) 6tes
(177, 151, 32, 21) joli!
(217, 155, 35, 17) que
(260, 155, 44, 13) vous
(312, 155, 29, 13) me
(348, 151, 80, 17) semblez
(436, 151, 52, 17) beau!
(81, 176, 47, 18) Sans
(136, 177, 63, 19) mentir,
(207, 177, 15, 17) si
(229, 178, 48, 16) votre
(284, 181, 72, 17) ramage
(81, 202, 25, 17) Se
(114, 204, 79, 19) rapporte
(200, 202, 11, 17) a
(218, 204, 48, 15) votre
(273, 203, 87, 20) plumage,
(31, 228, 48, 17) Vous
(86, 227, 40, 18) étes
(134, 228, 15, 16) le
(157, 227, 63, 21) phénix
(227, 228, 34, 17) des
(269, 227, 51, 18) hétes
(327, 228, 23, 16) de
(358, 232, 33, 13) ces
(398, 228, 49, 17) bois»
(31, 253, 53, 17) Aces
(92, 255, 45, 15) mots
(145, 253, 15, 17) le
(167, 253, 78, 17) corbeau
(253, 257, 22, 13) ne
(283, 257, 22, 13) se
(312, 255, 40, 15) sent
(360, 257, 33, 17) pas
(400, 253, 23, 17) de
(429, 253, 40, 21) joie;
(81, 279, 19, 16) Et
(107, 283, 43, 16) pour
(157, 280, 74, 16) montrer
(238, 283, 22, 13) sa
(267, 279, 45, 16) belle
(319, 279, 43, 19) voix,
(33, 304, 8, 16) ll
(49, 308, 53, 13) ouvre
(110, 308, 22, 13) un
(140, 304, 47, 21) large
(195, 304, 33, 17) bec
(236, 304, 54, 17) laisse
(297, 305, 67, 16) tomber
(371, 308, 22, 13) sa
(400, 304, 53, 21) proie.
(32, 330, 23, 17) Le
(63, 330, 60, 16) renard
(131, 330, 38, 17) s'en
(177, 330, 48, 17) saisit
(232, 331, 17, 15) et
(256, 330, 28, 16) dit:
(291, 330, 49, 16) "Mon
(348, 330, 35, 16) bon
(391, 330, 92, 19) Monsieur,
(103, 355, 92, 21) Apprenez
(202, 359, 36, 17) que
(245, 356, 35, 16) tout
(287, 355, 67, 17) flatteur
(31, 381, 25, 16) Vit
(63, 385, 34, 12) aux
(104, 381, 71, 20) dépens
(181, 381, 24, 16) de
(212, 381, 43, 16) celui
(262, 381, 28, 20) qui
(298, 380, 79, 17) l'écoute:
(32, 406, 50, 17) Cette
(90, 406, 50, 21) lecon
(148, 407, 40, 16) vaut
(195, 406, 40, 17) bien
(243, 410, 22, 13) un
(273, 406, 79, 21) fromage
(359, 410, 45, 13) sans
(411, 406, 67, 17) doute."
(81, 432, 22, 16) Le
(110, 432, 77, 16) corbeau
(195, 432, 76, 16) honteux
(279, 433, 17, 15) et
(303, 432, 63, 16) confus
(31, 457, 42, 17) Jura
(81, 457, 44, 17) mais
(133, 461, 22, 13) un
(163, 461, 34, 17) peu
(205, 457, 36, 17) tard
(250, 470, 3, 6) ,
(261, 457, 51, 21) qu'on
(320, 461, 23, 13) ne
(351, 457, 18, 21) I'y
(376, 457, 85, 21) prendrait
(468, 457, 44, 21) plus.
Resulting image: [image "result" not included]
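【Comments】:
It would be better to post the text output as text rather than as an image.
Sorry, I don't understand your comment. Where is the code?
I don't mean the code, I mean the question post. Images of text (here the dict and the boxes with text) prevent people from copying the data to work on a solution. Even better than posting the text: post the output of boxes.to_dict().
Done, thanks!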
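【Answer 1】:
"I noticed that when I print the bounding-box coordinates, the value of b['top'] is similar for words that are on the same line. I don't know whether I can use that, but I would like one bounding box per line of text, together with the corresponding sentence."
You can absolutely use that. The following generator builds lines by merging boxes that overlap vertically: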
def lineup(boxes):
    linebox = None
    for _, box in boxes.iterrows():
        if linebox is None:
            linebox = box  # first line begins
        elif box.top <= linebox.top + linebox.height:  # box in same line
            # compute the bottom edge before top is changed
            bottom = max(linebox.top + linebox.height, box.top + box.height)
            linebox.top = min(linebox.top, box.top)
            linebox.width = box.left + box.width - linebox.left
            linebox.height = bottom - linebox.top
            linebox.text += ' ' + box.text
        else:  # box in new line
            yield linebox
            linebox = box  # new line begins
    if linebox is not None:
        yield linebox  # return last line
lineboxes = pd.DataFrame.from_records(lineup(boxes))
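As an alternative sketch (not part of the original answer): the image_to_data output shown in the question already has page_num, block_num, par_num and line_num columns, so the words can also be grouped on those columns instead of comparing top values. The function name line_boxes and the sample frame are made up for illustration; the sample rows are copied from the question's output (rows 4-8 and 154-158), with page_num assumed to be 1.

```python
import pandas as pd


def line_boxes(words: pd.DataFrame) -> pd.DataFrame:
    """Merge word-level rows (image_to_data layout) into one row per text line."""
    # Drop rows without recognized text (NaN, empty, or whitespace only).
    words = words.dropna(subset=["text"])
    words = words[words["text"].astype(str).str.strip() != ""].copy()

    # Right and bottom edges make the union of boxes a simple min/max.
    words["right"] = words["left"] + words["width"]
    words["bottom"] = words["top"] + words["height"]

    keys = ["page_num", "block_num", "par_num", "line_num"]
    lines = (
        words.groupby(keys, sort=True)
        .agg(
            left=("left", "min"),
            top=("top", "min"),
            right=("right", "max"),
            bottom=("bottom", "max"),
            text=("text", " ".join),  # words stay in their original row order
        )
        .reset_index()
    )
    lines["width"] = lines["right"] - lines["left"]
    lines["height"] = lines["bottom"] - lines["top"]
    return lines[keys + ["left", "top", "width", "height", "text"]]


# Sample rows taken from the question's output; page_num=1 is an assumption.
columns = ["page_num", "block_num", "par_num", "line_num",
           "left", "top", "width", "height", "text"]
sample = pd.DataFrame(
    [
        (1, 1, 1, 1, 32, 24, 60, 17, "Maitre"),
        (1, 1, 1, 1, 100, 24, 82, 19, "corbeau,"),
        (1, 1, 1, 1, 191, 28, 29, 13, "sur"),
        (1, 1, 1, 1, 227, 28, 22, 12, "un"),
        (1, 1, 1, 1, 257, 24, 50, 17, "arbre"),
        (1, 1, 11, 1, 261, 457, 51, 21, "qu'on"),
        (1, 1, 11, 1, 320, 461, 23, 13, "ne"),
        (1, 1, 11, 1, 351, 457, 18, 21, "I'y"),
        (1, 1, 11, 1, 376, 457, 85, 21, "prendrait"),
        (1, 1, 11, 1, 468, 457, 44, 21, "plus."),
    ],
    columns=columns,
)

lines = line_boxes(sample)
print(lines[["left", "top", "width", "height", "text"]])
```

With the real data, the call would be line_boxes(boxes) on the DataFrame built from image_to_data. This relies on Tesseract's own line segmentation, so it may differ from the overlap-based result when Tesseract splits or merges lines differently from what the eye sees.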