triton-inference-server reports Error details: model expected the shape of dimension 0 to be between

Posted by 修炼之路


Error description

When running a performance benchmark on the model with perf_client, the following error was reported:

./perf_client -m resnet152 -u 127.0.0.1:8001 -i grpc system --concurrency-range 4
*** Measurement Settings ***
  Batch size: 1
  Measurement window: 5000 msec
  Using synchronous calls for inference
  Stabilizing using average latency

Request concurrency: 4
Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
Thread [0] had error: request specifies invalid shape for input 'input' for resnet152_0_2_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 2
Thread [1] had error: request specifies invalid shape for input 'input' for resnet152_0_1_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 1
Thread [2] had error: request specifies invalid shape for input 'input' for resnet152_0_0_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 1
Thread [3] had error: request specifies invalid shape for input 'input' for resnet152_0_2_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 2

Cause analysis

From the error message it is easy to see that the model expects the batch dimension to be between 8 and 8, i.e. exactly 8, while the requests carried a batch size of 1 or 2, which is what triggers the error.
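
The same failure can be reproduced outside of perf_client with a single request from the Triton Python client. The snippet below is only a minimal sketch assuming the tritonclient package is installed; it sends one image (batch size 1) to the same gRPC endpoint, and the server rejects it with the shape error above.

import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="127.0.0.1:8001")

# one image, i.e. batch size 1, smaller than the batch the engine was built for
data = np.zeros((1, 3, 224, 224), dtype=np.float32)
inp = grpcclient.InferInput("input", list(data.shape), "FP32")
inp.set_data_from_numpy(data)
out = grpcclient.InferRequestedOutput("features")

try:
    client.infer(model_name="resnet152", inputs=[inp], outputs=[out])
except Exception as e:
    # ... model expected the shape of dimension 0 to be between 8 and 8 but received 1
    print(e)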

  • Inspect the model's configuration file
    Locate the config.pbtxt file under the model directory; the directory layout is as follows:

models
└── resnet152
    ├── 1
    │   └── model.plan
    └── config.pbtxt

The contents of config.pbtxt are as follows:


  platform: "tensorrt_plan"
  max_batch_size: 8
  input [
    {
      name: "input"
      data_type: TYPE_FP32
      dims: [3,224,224]
    }
  ]
  output [
    {
      name: "features"
      data_type: TYPE_FP32
      dims: [ 2048,1,1 ]
    }
  ]
  dynamic_batching {
    preferred_batch_size: [ 4, 8 ]
    max_queue_delay_microseconds: 100
  }
  instance_group [
    {
      count: 8
      kind: KIND_GPU
    }
  ]

From the configuration above it is clear that dynamic_batching is already enabled, so Triton Inference Server itself should be able to accept a variable batch size. The only remaining possibility is that the model, i.e. the TensorRT engine, does not support a variable batch size.
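
One way to confirm this is to load the existing model.plan and look at the shape of its input binding. The snippet below is only a sketch, assuming the TensorRT 8.x Python API (the binding-based calls are deprecated in later releases): an engine built with a fixed batch prints 8 in dimension 0, while a properly dynamic engine prints -1 there.

import tensorrt as trt

# load the serialized engine and print the shape of each binding
logger = trt.Logger(trt.Logger.WARNING)
with open("models/resnet152/1/model.plan", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    for i in range(engine.num_bindings):
        # the fixed-batch engine prints something like: input (8, 3, 224, 224)
        print(engine.get_binding_name(i), tuple(engine.get_binding_shape(i)))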

Solution

My pipeline converts the PyTorch model to ONNX first and then uses TensorRT to convert the ONNX model into model.plan, so a variable batch size has to be configured during these conversions.

  • Convert PyTorch to ONNX (a quick check of the exported model follows after these steps)
import torchvision
import torch,onnx
from torch import nn
from torch.autograd import Variable
from collections import OrderedDict

def export_onnx(model,image_shape,onnx_path, batch_size=8,dynamic_onnx=True):
    input_name = ['input']
    output_name = ['features']
    x,y=image_shape
    img = torch.zeros((batch_size, 3, x, y)).cuda()
    if dynamic_onnx:
        # to make image_height and image_width dynamic instead, use:
        # dynamic_axes = {'input': {2: 'image_height', 3: 'image_width'},
        #                 'features': {2: 'image_height', 3: 'image_width'}}
        # make the batch size dynamic
        dynamic_axes = {'input': {0: 'batch_size'},
                        'features': {0: 'batch_size'}}
        torch.onnx.export(model, (img), onnx_path,
                          input_names=input_name, output_names=output_name, verbose=True,
                          dynamic_axes=dynamic_axes)
    else:
        torch.onnx.export(model, (img), onnx_path,
                          input_names=input_name, output_names=output_name, verbose=True)

def torch_to_onnx():
    input_name = ['input']
    output_name = ['features']
    input = Variable(torch.randn(8, 3, 224, 224)).cuda()
    model = torchvision.models.resnet152()
    # here I remove the last fully-connected layer
    model_layers = list(model.children())[:-1]
    new_model = torch.nn.Sequential(*model_layers)

    def copyStateDict(state_dict):
        if list(state_dict.keys())[0].startswith('module'):
            start_idx = 1
        else:
            start_idx = 0
        new_state_dict = OrderedDict()
        for k, v in state_dict.items():
            name = '.'.join(k.split('.')[start_idx:])  # drop the leading 'module.' prefix if present
            new_state_dict[name] = v
        return new_state_dict

    state_dict = torch.load("models/resnet152-b121ed2d.pth")
    new_state_dict = copyStateDict(state_dict)
    keys = []
    for k, _ in new_state_dict.items():
        if k.startswith('fc'):
            continue
        keys.append(k)

    new_dict = {k: new_state_dict[k] for k in keys}
    # load the pretrained weights (minus the fc layer) into the original model;
    # new_model shares the same child modules, so it picks up the weights as well
    model.load_state_dict(new_dict, strict=False)
    new_model.cuda()
    export_onnx(new_model, (224, 224), "models/resnet152.onnx")
  • Convert ONNX to model.plan
trtexec --onnx=resnet152.onnx --saveEngine=model.plan --explicitBatch --minShapes=input:1x3x224x224 --optShapes=input:8x3x224x224 --maxShapes=input:8x3x224x224
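
Before rebuilding the engine it is worth double-checking that the exported ONNX file really carries a symbolic batch dimension. The snippet below is a small sanity check assuming the onnx package; dimension 0 of the input should show the symbolic name batch_size rather than a fixed 8.

import onnx

# print the input shape of the exported model
m = onnx.load("models/resnet152.onnx")
for inp in m.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    # expected: input ['batch_size', 3, 224, 224]
    print(inp.name, dims)

With the dynamic batch axis in the ONNX export and the engine rebuilt with --minShapes=input:1x3x224x224 and --maxShapes=input:8x3x224x224, the resulting model.plan accepts any batch size from 1 to 8, so the batch-1 and batch-2 requests generated by perf_client no longer violate the shape constraint.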
