Tensorflow - Showing sorted predictions from left to right by tracking bounding boxes and outputting them

Posted: 2022-01-12 17:15:33

Question:

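I'm trying to run predictions with TF 2.0. I managed to train my model and display the output on the image by drawing the bounding boxes on it, but I'm struggling to get the predictions out in left-to-right order.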

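I thought it would be easier to work on the bounding boxes, so I put the xmin coordinates in a numpy array and tried to match the contents of xmin_arr with box[0] in a for loop:

  i=0
  for box in b:
    print ("This box with xmin", box[0], "is gonna get used, Detected class:", category_index[ output_dict['detection_classes'][i]])
    i+=1

But I don't think this is the best approach, and it isn't correct either.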

This is what I have done so far:

from operator import itemgetter

import numpy as np

def show_inference_and_prediction(model, image_np):
  # Print the image height and width. They are not used in this function,
  # but I used them before to get the xmin coords.
  height, width, _ = image_np.shape
  print ("IMG Height:", height, "IMG Width", width)
  # Actual detection.
  output_dict = run_inference_for_single_image(model, image_np)

  #get the detected class sorted by detection scores
  indexes = [i for i,k in enumerate(output_dict['detection_scores']) if (k > 0.8)]
  class_id = itemgetter(*indexes)(output_dict['detection_classes'])
  class_names = []
  for i in range(0, len(indexes)):
    class_names.append(category_index[class_id[i]]['name'])
  print("Detected classes:", class_names,"\n\n")
 
 
  boxes = output_dict['detection_boxes']
  # get all boxes from an array
  max_boxes_to_draw = boxes.shape[0]
  # get scores to get a threshold
  scores = output_dict['detection_scores']
  # threshold
  min_score_thresh=0.8
  xmin_arr=[]
  
  # iterate over all objects found
  for i in range(min(max_boxes_to_draw, boxes.shape[0])):
    # 
    if scores is None or scores[i] > min_score_thresh:
        xmin = output_dict['detection_boxes'][i][0]
        class_name = category_index[output_dict['detection_classes'][i]]['name']
        print ("This box is gonna get used", boxes[i][0], output_dict['detection_classes'][i])
        
        #print(ymin, xmin, ymax, xmax)
        xmin_arr.append(xmin)
      
  print("Non sorted xmin_arr",xmin_arr)
  xmin_arr.sort()
  print("Sorted xmin_arr", xmin_arr, "\n\n")

  boxes_test = np.squeeze(output_dict['detection_boxes'])
  scores_test = np.squeeze(output_dict['detection_scores'])
  bboxes = boxes[scores_test > min_score_thresh]
  print("Non sorted numpy array")
  print(bboxes,"\n\n")
  


  print("Sorted numpy array by xmin")
  ind=np.argsort(bboxes[:,0])
  b=bboxes[ind]
  print(b,"\n\n")
  #I know this isn't the best way for a for loop... i'm just new to python
  i=0
  for box in b:
    print ("This box with xmin", box[0], "is gonna get used, Detected class:", category_index[ output_dict['detection_classes'][i]])
    i+=1
          

This is the output:

IMG Height: 100 IMG Width 220
Detected classes: ['6', '0', '6', '5', '0', '+'] 


This box is gonna get used 0.15368861 6
This box is gonna get used 0.25094065 10
This box is gonna get used 0.5650149 6
This box is gonna get used 0.53073287 5
This box is gonna get used 0.21016338 10
This box is gonna get used 0.48348305 11
Non sorted xmin_arr [0.15368861, 0.25094065, 0.5650149, 0.53073287, 0.21016338, 0.48348305]
Sorted xmin_arr [0.15368861, 0.21016338, 0.25094065, 0.48348305, 0.53073287, 0.5650149] 


Non sorted numpy array
[[0.15368861 0.00103605 0.4914853  0.14996211]
 [0.25094065 0.24868643 0.6210675  0.4069612 ]
 [0.5650149  0.81631124 0.9563305  0.9875988 ]
 [0.53073287 0.6841933  0.9102581  0.82026345]
 [0.21016338 0.1524337  0.5577521  0.27355438]
 [0.48348305 0.46985003 0.7388715  0.5943037 ]] 


Sorted numpy array by xmin
[[0.15368861 0.00103605 0.4914853  0.14996211]
 [0.21016338 0.1524337  0.5577521  0.27355438]
 [0.25094065 0.24868643 0.6210675  0.4069612 ]
 [0.48348305 0.46985003 0.7388715  0.5943037 ]
 [0.53073287 0.6841933  0.9102581  0.82026345]
 [0.5650149  0.81631124 0.9563305  0.9875988 ]] 


This box with xmin 0.15368861 is gonna get used, Detected class: {'id': 6, 'name': '6'}
This box with xmin 0.21016338 is gonna get used, Detected class: {'id': 10, 'name': '0'}
This box with xmin 0.25094065 is gonna get used, Detected class: {'id': 6, 'name': '6'}
This box with xmin 0.48348305 is gonna get used, Detected class: {'id': 5, 'name': '5'}
This box with xmin 0.53073287 is gonna get used, Detected class: {'id': 10, 'name': '0'}
This box with xmin 0.5650149 is gonna get used, Detected class: {'id': 11, 'name': '+'}

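The problem is: the input image shows 600+56 (which is also what I want to get from the output). The predicted classes are correct, they are just not sorted. I think the error is in: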

print ("This box with xmin", box[0], "is gonna get used, Detected class:", category_index[ output_dict['detection_classes'][i]])

because i only follows the old (unsorted) index...
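To make the mismatch concrete, here is a small sketch using the values from the output above. Plain Python lists stand in for the NumPy arrays, and `names` is a stand-in for `category_index` reconstructed from the printed output; both are assumptions made for illustration only.

```python
# Values copied from the question's output, in the order they appear in
# output_dict (i.e. ordered by detection score, not by position).
first_col = [0.15368861, 0.25094065, 0.5650149, 0.53073287, 0.21016338, 0.48348305]
class_ids = [6, 10, 6, 5, 10, 11]

# Hypothetical stand-in for category_index, reconstructed from the output.
names = {5: '5', 6: '6', 10: '0', 11: '+'}

# Pure-Python equivalent of np.argsort: original row positions in sorted order.
order = sorted(range(len(first_col)), key=lambda i: first_col[i])
print(order)  # [0, 4, 1, 5, 3, 2]

# What the loop in the question does: the counter i runs 0, 1, 2, ... over the
# sorted boxes, but class_ids is still in the old order.
wrong = "".join(names[class_ids[i]] for i in range(len(order)))

# Looking the class up through the sort order keeps box and class together.
right = "".join(names[class_ids[j]] for j in order)

print(wrong)  # 60650+
print(right)  # 600+56
```

The boxes are reordered, but the class lookup is not, so the printed classes simply repeat the original detection order.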

I would like to get:

This box with xmin 0.15368861 is gonna get used, Detected class: {'id': 6, 'name': '6'}
This box with xmin 0.21016338 is gonna get used, Detected class: {'id': 10, 'name': '0'}
This box with xmin 0.25094065 is gonna get used, Detected class: {'id': 10, 'name': '0'}
This box with xmin 0.48348305 is gonna get used, Detected class: {'id': 11, 'name': '+'}
This box with xmin 0.53073287 is gonna get used, Detected class: {'id': 5, 'name': '5'}
This box with xmin 0.5650149 is gonna get used, Detected class: {'id': 6, 'name': '6'}

Or, alternatively: output: 600+56

I would be glad if someone could help me with this. Thanks in advance.

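Edit:

I figured out how to do this. Here is the solution I came up with: when sorting the numpy array I had already saved the sort indices in an array (ind), so I simply reused them to look up the classes.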

class_names_id_sorted = []
class_id_detect_box = itemgetter(*ind)(output_dict['detection_classes'])
for i in range(0, len(ind)):
    class_names_id_sorted.append(category_index[class_id_detect_box[i]]['name'])
print("Detected classes:", class_names_id_sorted,"\n\n") 
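One caveat with reusing `ind` directly: it indexes the filtered `bboxes` array, so it lines up with `output_dict['detection_classes']` only while the detections above the threshold are the first entries (which holds when the detections arrive sorted by score). The sketch below is one possible way to avoid relying on that, by mapping the sort order back to indices of the original arrays. The function name `classes_left_to_right` and the placeholder scores are made up for illustration, and it assumes boxes in the Object Detection API's usual `[ymin, xmin, ymax, xmax]` layout, so column 1 is xmin (the code in the question sorts on column 0).

```python
import numpy as np


def classes_left_to_right(output_dict, category_index, min_score=0.8, sort_col=1):
    """Return class names of detections above min_score, ordered by one box column.

    Assumption: boxes are [ymin, xmin, ymax, xmax], so sort_col=1 means xmin.
    """
    boxes = np.asarray(output_dict['detection_boxes'])
    scores = np.asarray(output_dict['detection_scores'])
    classes = np.asarray(output_dict['detection_classes'])

    # Indices (into the original arrays) of detections above the threshold.
    keep = np.flatnonzero(scores > min_score)
    # Sort those indices by the chosen coordinate; they still refer to the
    # original arrays, so boxes, scores and classes stay aligned.
    order = keep[np.argsort(boxes[keep, sort_col])]
    return [category_index[int(c)]['name'] for c in classes[order]]


# Boxes and class ids copied from the question; the scores are placeholders.
demo = {
    'detection_boxes': np.array([
        [0.15368861, 0.00103605, 0.4914853, 0.14996211],
        [0.25094065, 0.24868643, 0.6210675, 0.4069612],
        [0.5650149, 0.81631124, 0.9563305, 0.9875988],
        [0.53073287, 0.6841933, 0.9102581, 0.82026345],
        [0.21016338, 0.1524337, 0.5577521, 0.27355438],
        [0.48348305, 0.46985003, 0.7388715, 0.5943037],
    ]),
    'detection_scores': np.array([0.99] * 6),  # placeholder values
    'detection_classes': np.array([6, 10, 6, 5, 10, 11]),
}
category_index = {
    5: {'id': 5, 'name': '5'},
    6: {'id': 6, 'name': '6'},
    10: {'id': 10, 'name': '0'},
    11: {'id': 11, 'name': '+'},
}

print("".join(classes_left_to_right(demo, category_index)))  # 600+56
```

Because `order` always holds indices into the unfiltered arrays, the same lookup would also work for `detection_scores` or `detection_boxes` if those need to be printed alongside the class names.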


Answer 1:

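Disclaimer: I didn't try to read through that "question"; I came here from the OpenCV discord, where it was being discussed.

You want to sort from left to right? Why not sort them by their coordinates? X grows from left to right, Y grows from top to bottom. Ignore Y and only use X. Similar to what is done in DarkPlate: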

https://github.com/stephanecharette/DarkPlate/blob/master/src/main.cpp#L57-L68

// sort the results from left-to-right based on the mid-x point of each detected object
std::sort(results.begin(), results.end(),
    [](const DarkHelp::PredictionResult & lhs, const DarkHelp::PredictionResult & rhs)
    {
        // put the "license plate" class first so the characters are drawn overtop of this class
        if (lhs.best_class == class_plate)  return true;
        if (rhs.best_class == class_plate)  return false;

        // otherwise, sort by the horizontal coordinate
        // (this obviously only works with license plates that consist of a single row of characters)
        return lhs.original_point.x < rhs.original_point.x;
    } );

This only works if you have a single row of characters, and only if they are not stacked vertically. (In that case you would have to look at Y instead of X.)
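For readers working in Python rather than C++, the same idea might look roughly like the sketch below. It is not taken from DarkPlate: the helper name `reading_order` and the `(name, box)` pair format are invented for this example, and the boxes are assumed to be `[ymin, xmin, ymax, xmax]` as in the TensorFlow Object Detection API. Like the comparator above, it sorts on the mid point of each box, and the `axis` argument covers the vertically stacked case.

```python
def reading_order(detections, axis="x"):
    """Sort (name, (ymin, xmin, ymax, xmax)) pairs by box centre along one axis.

    axis="x" gives left-to-right order for a single row of characters;
    axis="y" gives top-to-bottom order for a single column.
    """
    def centre(det):
        ymin, xmin, ymax, xmax = det[1]
        return (xmin + xmax) / 2 if axis == "x" else (ymin + ymax) / 2

    return sorted(detections, key=centre)


# Class names and boxes taken from the question's output.
detections = [
    ('6', (0.15368861, 0.00103605, 0.4914853, 0.14996211)),
    ('0', (0.25094065, 0.24868643, 0.6210675, 0.4069612)),
    ('6', (0.5650149, 0.81631124, 0.9563305, 0.9875988)),
    ('5', (0.53073287, 0.6841933, 0.9102581, 0.82026345)),
    ('0', (0.21016338, 0.1524337, 0.5577521, 0.27355438)),
    ('+', (0.48348305, 0.46985003, 0.7388715, 0.5943037)),
]

text = "".join(name for name, _ in reading_order(detections))
print(text)  # 600+56
```

Keeping the class name and its box together in one pair means there is no separate index to keep in sync after sorting.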

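What led me to answer this way is the annotated example image the original author posted on discord, which seems to have been left out of this SO question.

Seen that way, this is exactly the same problem as sorting the characters on a license plate: https://github.com/stephanecharette/DarkPlate#darkplate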
