Specifying Optimization Policy while using TensorRT Inference Server

Posted 2023-02-16


[Original title]: Specifying Optimization Policy while using TensorRT Inference Server [Posted]: 2019-04-09 15:21:17 [Question]:

I have successfully served a TensorFlow Object Detection API model with TensorRT Inference Server, using the following configuration file (config.pbtxt):

name: "first_model"
platform: "tensorflow_savedmodel"
max_batch_size: 1
input [
  {
    name: "inputs"
    data_type: TYPE_UINT8
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "detection_boxes"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  },
  {
    name: "detection_scores"
    data_type: TYPE_FP32
    dims: [ 100 ]
  },
  {
    name: "detection_classes"
    data_type: TYPE_FP32
    dims: [ 100 ]
  }
]

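I was looking through the documentation and found that config.pbtxt can also carry optimization settings for the model. However, the documentation does not say how to specify these settings. I tried adding the following lines to the configuration file: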

optimization_policy [
  {
    level:1
  }
]

and then tried to serve the model, but got the error: Can't parse /models/first_model/config.pbtxt as text proto. If I remove the optimization_policy lines, the model is served without any problem.

How do I specify the optimization policy/settings in the configuration file?


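[Answer 1]:

Answering my own question. I got the answer by raising an issue on the official GitHub repository.

You need to format your config.pbtxt in protobuf text format, following the schema here:

I believe what you want is: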

optimization { graph { level: 1 } }
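
For context, here is a sketch of how the full file might look with this block added. It assumes the rest of the question's config is unchanged, and the comment line is mine, not part of the original answer. As far as I can tell, the original attempt failed to parse for two reasons: the model configuration schema has no field named optimization_policy (the field is called optimization), and it is a single nested message written with braces, not a bracketed list.

```protobuf
name: "first_model"
platform: "tensorflow_savedmodel"
max_batch_size: 1
# input [ ... ] and output [ ... ] stay exactly as in the question
optimization {
  graph {
    level: 1
  }
}
```

What each graph level does is defined by the schema and depends on the platform, so it is worth checking there before picking a value.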
