Incorrect bounding boxes cropped

Posted 2023-02-25


[Title] Incorrect bounding boxes cropped [Posted] 2021-05-30 13:19:30 [Question]:

This is my code for visualizing the bounding boxes on the image:

viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'][0].numpy(),
    (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=.5,
    agnostic_mode=False,
)

And this is the code that crops the bounding box after detection:

width=600
height=900

boxes = detections['detection_boxes']
ymin = int((boxes[0][0][0]*height))
xmin = int((boxes[0][0][1]*width))
ymax = int((boxes[0][0][2]*height))
xmax = int((boxes[0][0][3]*width))
print("xmin: {}".format(xmin), "ymin: {}".format(ymin), "xmax: {}".format(xmax), "ymax: {}".format(ymax))

from PIL import Image
img = Image.open(image_path)
img2 = img.crop((xmin,xmax,ymin,ymax))
img2.save("/content/gdrive/MyDrive/UrduDetection/Croped_images/img8.jpg")

The cropped image does not match the detected region.

How can I get a correctly cropped image of the detected bounding box?


[Answer 1]:

PIL's crop() function does not accept its arguments in the order you passed them. It expects a (left, top, right, bottom) tuple, which in your case is:

img2 = img.crop((xmin,ymin,xmax,ymax))
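A minimal sketch of the whole conversion, assuming the detection boxes are normalized and ordered (ymin, xmin, ymax, xmax), as the indexing in the question suggests. The helper name `crop_normalized_box` is hypothetical, the scale factors are read from the image itself rather than hard-coded, and a synthetic image stands in for the real file:

```python
from PIL import Image


def crop_normalized_box(img, box):
    """Crop a PIL image to a normalized (ymin, xmin, ymax, xmax) box."""
    width, height = img.size  # PIL reports size as (width, height)
    ymin, xmin, ymax, xmax = box
    left, upper = int(xmin * width), int(ymin * height)
    right, lower = int(xmax * width), int(ymax * height)
    # PIL expects the box as (left, upper, right, lower)
    return img.crop((left, upper, right, lower))


# Synthetic 600x900 image standing in for the real photo (assumption)
img = Image.new("RGB", (600, 900), "white")
cropped = crop_normalized_box(img, (0.5, 0.25, 0.75, 0.75))
print(cropped.size)
```

Reading the size from `img.size` avoids a mismatch between hard-coded `width`/`height` values and the file that is actually opened.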

Alternatively, you can open the image as a NumPy array and crop it by indexing.

import numpy
import PIL.Image
img  = numpy.asarray(PIL.Image.open('test.jpg'))
img2 = img[ymin:ymax, xmin:xmax, ...]
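A short sketch of the NumPy route end to end: a PIL image has to be converted with `numpy.asarray` before it can be sliced, and the slice has to be turned back into a PIL image with `Image.fromarray` before `save()` is available. The pixel coordinates and the synthetic image below are placeholders, not values from the question:

```python
import numpy as np
from PIL import Image

# Synthetic 600x900 image standing in for the real photo (assumption)
img = Image.new("RGB", (600, 900), "white")

arr = np.asarray(img)  # array shape is (height, width, channels)
xmin, ymin, xmax, ymax = 150, 450, 450, 675  # placeholder pixel coordinates

crop_arr = arr[ymin:ymax, xmin:xmax]  # rows are y, columns are x
crop_img = Image.fromarray(crop_arr)  # back to a PIL image; save() works on this
print(arr.shape, crop_arr.shape, crop_img.size)
```

Note the axis order: the array is indexed rows first (y), then columns (x), while `crop_img.size` is reported as (width, height).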

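Edit

I don't know what is inside your visualization function, so I can't say how it does or doesn't work. But just from the numbers on your plot, I can tell that xmin-xmax should be roughly 200-620 rather than 143-420, and that ymin-ymax should be roughly 510-560 rather than the 779-856 you are using. You are passing a y range that does not exist in the image, hence the black output.

You probably converted the coordinates the wrong way outside the visualization function. Your coordinates may be in xc, yc, w, h form while you are treating them as xmin, xmax, ymin, ymax.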
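If the coordinates really were in center format, a conversion along these lines would be needed before cropping. This is only a sketch: both helper names are hypothetical, pixel units are assumed, the 800x600 image size is made up for illustration, and the post does not confirm that the detector emits this format. The bounds check shows a quick way to catch a box that falls outside the image before cropping:

```python
def center_to_corners(xc, yc, w, h):
    """Convert a center-format box to (xmin, ymin, xmax, ymax)."""
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


def box_fits(box, width, height):
    """Return True if (xmin, ymin, xmax, ymax) lies inside the image."""
    xmin, ymin, xmax, ymax = box
    return 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height


# Hypothetical 800x600 image; the box values echo the ranges discussed above
box = center_to_corners(410, 535, 420, 50)
print(box)
print(box_fits(box, 800, 600))
print(box_fits((143, 779, 420, 856), 800, 600))
```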

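[Comments]:

img2 = img.crop((xmin,ymin,xmax,ymax)) gives a blank black image. I have a numpy array of the image, and when I try img2 = img[ymin:ymax, xmin:xmax]; img2.save("/content/gdrive/MyDrive/UrduDetection/Croped_images/img13.jpg") it says TypeError: 'JpegImageFile' object is not subscriptable.

@maryammehboob It says 'JpegImageFile' is not subscriptable because it is not a numpy array. Convert it to a numpy array first.

@maryammehboob If you get a blank black image, then your coordinates are wrong.

How can the coordinates be wrong if the bounding boxes drawn on the image are correct?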
