How to run an eval.py job for TensorFlow object detection models

Posted 2023-02-16

【Title】: How to run eval.py job for tensorflow object detection models 【Posted】: 2018-11-29 18:17:11 【Question】:

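I have trained an object detector on Google Colab using TensorFlow's Object Detection API. After researching on the internet for most of a day, I have not been able to find a tutorial on how to run an evaluation for my model so that I can get metrics such as mAP.

I figured out that I have to use eval.py from the models/research/object_detection folder, but I am not sure which parameters I should pass to the script.

In short, what I have done so far is generate labels for the test and train images and store them under the object_detection/images folder. I have also generated the train.record and test.record files and written the labelmap.pbtxt file. I am using the faster_rcnn_inception_v2_coco model from the TensorFlow model zoo, so I configured the faster_rcnn_inception_v2_coco.config file and stored it in the object_detection/training folder. Training ran fine, and all the checkpoints are also stored in the object_detection/training folder.

Now that I have to evaluate the model, I ran the eval.py script like this: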

!python eval.py --logtostderr --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config --checkpoint_dir=training/ --eval_dir=eval/

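Is this OK? It starts running fine, but when I open TensorBoard there are only two tabs, Images and Graphs, and no Scalars. I run TensorBoard with logdir=eval.

I am new to TensorFlow, so any kind of help is welcome. Thank you.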

【Answer 1】:

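The setup looks good. I had to wait a long time for the Scalars tab to load and show up alongside the other two, something like 10 minutes after the evaluation job had finished.

At the end of the evaluation job, though, it prints to the console all the scalar metrics that will be shown in the Scalars tab: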

Accumulating evaluation results...
DONE (t=1.57s).
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.434
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.693
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.470
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000

and so on.
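If you want those numbers without waiting for TensorBoard, the console summary can be scraped directly. Here is a minimal sketch, assuming the log lines keep exactly the layout shown above; `parse_coco_summary` is a hypothetical helper name, not part of the Object Detection API.

```python
import re

# Matches summary lines such as:
# Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.434
# Assumption: the layout is the one printed in the console output above.
_COCO_LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ "
    r"IoU=([\d.:]+)\s*\| area=\s*(\w+)\s*\| maxDets=\s*(\d+)\s*\] = (-?\d+\.\d+)"
)


def parse_coco_summary(log_text):
    """Return {(metric, iou, area, max_dets): value} for every summary line found."""
    results = {}
    for line in log_text.splitlines():
        match = _COCO_LINE.search(line)
        if match:
            metric, iou, area, max_dets, value = match.groups()
            results[(metric, iou, area, int(max_dets))] = float(value)
    return results


# The four lines quoted in the answer above.
sample = """\
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.434
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.693
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.470
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
"""

metrics = parse_coco_summary(sample)
for key, value in metrics.items():
    print(key, value)
```

The keys keep the IoU range as a string (for example "0.50:0.95") so that the averaged metric and the single-threshold ones stay distinguishable.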

If you want to use the new model_main.py script instead of legacy/eval.py, you can call it like this:

python model_main.py --alsologtostderr --run_once --checkpoint_dir=/dir/with/checkpoint/at/one/timestamp --model_dir=eval/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config

Note that this new API requires the optimizer field in train_config. It is probably already in your pipeline.config, since you are using the same file for both training and evaluation.

【Comments】:

Where do I get the detection rectangles from?

【Answer 2】:

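This is just for those who want to run the new model_main.py in evaluation mode. There is a flag among the parameters that you can set to do exactly that. The flag is checkpoint_dir: if you set it to a folder containing past training checkpoints, the model runs in evaluation mode only.

I hope this helps someone who missed it like I did! Cheers,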

【Answer 3】:

I will try to expand on and complement the previous answers.

If you want to evaluate your model on the validation data, you should use:

python models/research/object_detection/model_main.py --pipeline_config_path=/path/to/pipeline_file --model_dir=/path/to/output_results --checkpoint_dir=/path/to/directory_holding_checkpoint --run_once=True

If you want to evaluate your model on the training data, you should set 'eval_training_data' to True, that is:

python models/research/object_detection/model_main.py --pipeline_config_path=/path/to/pipeline_file --model_dir=/path/to/output_results --eval_training_data=True --checkpoint_dir=/path/to/directory_holding_checkpoint --run_once=True
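If you launch these jobs from a notebook or a script, the two commands above can be assembled in one place. This is only a sketch: `build_eval_command` is a hypothetical helper, it uses only the flags quoted in this answer, and the paths are the same placeholders as above.

```python
import shlex


def build_eval_command(pipeline_config_path, model_dir, checkpoint_dir,
                       eval_training_data=False,
                       script="models/research/object_detection/model_main.py"):
    """Build the argument list for a one-off evaluation run of model_main.py.

    Only the flags quoted in the answer are used; the helper itself is a
    hypothetical convenience and not part of the Object Detection API.
    """
    args = [
        "python", script,
        f"--pipeline_config_path={pipeline_config_path}",
        f"--model_dir={model_dir}",
    ]
    if eval_training_data:
        # Evaluate on the training data instead of the validation data.
        args.append("--eval_training_data=True")
    args += [
        f"--checkpoint_dir={checkpoint_dir}",
        "--run_once=True",
    ]
    return args


validation_cmd = build_eval_command(
    "/path/to/pipeline_file",
    "/path/to/output_results",
    "/path/to/directory_holding_checkpoint",
)
training_cmd = build_eval_command(
    "/path/to/pipeline_file",
    "/path/to/output_results",
    "/path/to/directory_holding_checkpoint",
    eval_training_data=True,
)

# shlex.join (Python 3.8+) renders the list as a copy-pasteable shell line.
print(shlex.join(validation_cmd))
print(shlex.join(training_cmd))
```

Returning a list rather than a single string means the result can be passed straight to `subprocess.run(cmd, check=True)` without worrying about shell quoting.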

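I have also added comments to clarify some of the options above:

--pipeline_config_path: path to the "pipeline.config" file that was used to train the detection model. This file should contain the paths to the TFRecords files (the training and testing files) that you want to evaluate, that is: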

    ...
    train_input_reader: {
        tf_record_input_reader {
            # path to the training TFRecord
            input_path: "/path/to/train.record"
        }
        # path to the label map
        label_map_path: "/path/to/label_map.pbtxt"
    }
    ...
    eval_input_reader: {
        tf_record_input_reader {
            # path to the testing TFRecord
            input_path: "/path/to/test.record"
        }
        # path to the label map
        label_map_path: "/path/to/label_map.pbtxt"
    }
    ...

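--model_dir: output directory where the resulting metrics will be written, in particular the "events.*" files that TensorBoard can read.

--checkpoint_dir: directory holding a checkpoint. This is the model directory where the checkpoint files ("model.ckpt.*") were written, either during training or after exporting with "export_inference_graph.py".

--run_once: set to True to run only a single round of evaluation.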
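Before and after a run, it can help to check what is actually sitting in the directories these flags point to. The sketch below rests on two assumptions: that training checkpoints are named "model.ckpt-<step>.index" (checkpoints with other names are ignored), and that the event files may end up in a subdirectory of the model directory, which is why the search is recursive. Both function names are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def list_checkpoint_steps(checkpoint_dir):
    """Return the sorted training steps that have a 'model.ckpt-<step>.index' file.

    Assumption: checkpoints follow the 'model.ckpt-<step>' naming pattern;
    anything named differently is skipped.
    """
    steps = []
    for path in Path(checkpoint_dir).glob("model.ckpt-*.index"):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", path.name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


def list_event_files(model_dir):
    """Return the 'events.*' files found anywhere under model_dir."""
    return sorted(p for p in Path(model_dir).rglob("events.*") if p.is_file())


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model.ckpt-100.index", "model.ckpt-2500.index",
                 "events.out.tfevents.123"):
        (Path(tmp) / name).touch()
    steps = list_checkpoint_steps(tmp)
    event_names = [p.name for p in list_event_files(tmp)]

print("checkpoint steps:", steps)
print("event files:", event_names)
```

An empty list from `list_event_files` after the job has finished would suggest that nothing was written for TensorBoard to read, which is a different situation from the Scalars tab merely being slow to appear.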
