triton-inference-server reports "Error details: model expected the shape of dimension 0 to be between"
Posted by 修炼之路
Error description

While benchmarking the model's performance with perf_client, the following error was reported:
./perf_client -m resnet152 -u 127.0.0.1:8001 -i grpc --concurrency-range 4
*** Measurement Settings ***
Batch size: 1
Measurement window: 5000 msec
Using synchronous calls for inference
Stabilizing using average latency
Request concurrency: 4
Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
Thread [0] had error: request specifies invalid shape for input 'input' for resnet152_0_2_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 2
Thread [1] had error: request specifies invalid shape for input 'input' for resnet152_0_1_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 1
Thread [2] had error: request specifies invalid shape for input 'input' for resnet152_0_0_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 1
Thread [3] had error: request specifies invalid shape for input 'input' for resnet152_0_2_gpu0. Error details: model expected the shape of dimension 0 to be between 8 and 8 but received 2
Root-cause analysis

From the error message it is easy to see that the model expects dimension 0 (the batch dimension) to be between 8 and 8, i.e. exactly 8, while the requests arrived with batch sizes of 1 or 2, which is what triggers the error.
- Inspect the model's configuration file
Find the config.pbtxt file in the model directory. The directory layout is:
models/
└── resnet152/
    ├── 1/
    │   └── model.plan
    └── config.pbtxt
The contents of config.pbtxt are as follows:
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "features"
    data_type: TYPE_FP32
    dims: [ 2048, 1, 1 ]
  }
]
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
instance_group [
  {
    count: 8
    kind: KIND_GPU
  }
]
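As a cross-check, a running Triton server will also report the configuration it actually loaded through its model-configuration endpoint (this assumes the default HTTP port 8000; adjust if yours differs):
curl localhost:8000/v2/models/resnet152/config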
From the configuration above we can see that dynamic_batching is already enabled, so Triton Inference Server itself should accept variable batch sizes. The only remaining explanation is that the model, i.e. the TensorRT engine, was built with a fixed batch size and cannot accept any other.
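To confirm this, the engine's input shape can be inspected directly. Below is a minimal sketch using the TensorRT Python API (the binding-based calls are the pre-8.5 API; newer releases expose get_tensor_name/get_tensor_shape instead). An engine built with a fixed batch prints a shape like (8, 3, 224, 224), while a dynamic engine prints (-1, 3, 224, 224):
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(TRT_LOGGER)

# Deserialize the serialized engine and print each binding's name and shape
with open('model.plan', 'rb') as f:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    print(engine.get_binding_name(i), engine.get_binding_shape(i))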
Solution

I convert the PyTorch model to ONNX first, and then convert the ONNX model to model.plan with TensorRT; a variable batch size has to be requested at both conversion steps.
- Convert the PyTorch model to ONNX
import torchvision
import torch
from collections import OrderedDict


def export_onnx(model, image_shape, onnx_path, batch_size=8, dynamic_onnx=True):
    input_name = ['input']
    output_name = ['features']
    x, y = image_shape
    img = torch.zeros((batch_size, 3, x, y)).cuda()
    if dynamic_onnx:
        # To make the image height and width variable instead, use:
        # dynamic_axes = {'input': {2: 'image_height', 3: 'image_width'},
        #                 'features': {2: 'image_height', 3: 'image_width'}}
        # Mark dimension 0 (the batch dimension) as variable
        dynamic_axes = {'input': {0: 'batch_size'},
                        'features': {0: 'batch_size'}}
        torch.onnx.export(model, img, onnx_path,
                          input_names=input_name, output_names=output_name,
                          verbose=True, dynamic_axes=dynamic_axes)
    else:
        torch.onnx.export(model, img, onnx_path,
                          input_names=input_name, output_names=output_name,
                          verbose=True)


def torch_to_onnx():
    def copyStateDict(state_dict):
        # Strip the 'module.' prefix that DataParallel checkpoints carry
        if list(state_dict.keys())[0].startswith('module'):
            start_idx = 1
        else:
            start_idx = 0
        new_state_dict = OrderedDict()
        for k, v in state_dict.items():
            name = '.'.join(k.split('.')[start_idx:])
            new_state_dict[name] = v
        return new_state_dict

    # Load the pretrained weights into the full model first, so the
    # checkpoint keys (conv1..., layer1..., fc...) match the module names
    model = torchvision.models.resnet152()
    state_dict = torch.load("models/resnet152-b121ed2d.pth")
    model.load_state_dict(copyStateDict(state_dict))
    # I remove the last fully connected layer here; the truncated network
    # outputs the 2048x1x1 feature map declared in config.pbtxt
    model_layers = list(model.children())[:-1]
    new_model = torch.nn.Sequential(*model_layers)
    new_model.cuda()
    new_model.eval()
    export_onnx(new_model, (224, 224), "models/resnet152.onnx")
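After the export it is worth checking that dimension 0 really came out symbolic. Here is a small verification sketch with the onnx package, using the output path from the script above; a dynamic export prints something like input ['batch_size', 3, 224, 224]:
import onnx

# Print every graph input with its dimensions; symbolic dimensions show
# their name (e.g. 'batch_size'), fixed dimensions show an integer size
model = onnx.load('models/resnet152.onnx')
for inp in model.graph.input:
    dims = [d.dim_param if d.dim_param else d.dim_value
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)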
- Convert the ONNX model to model.plan
trtexec --onnx=resnet152.onnx --saveEngine=model.plan --explicitBatch \
    --minShapes=input:1x3x224x224 \
    --optShapes=input:8x3x224x224 \
    --maxShapes=input:8x3x224x224
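Copy the new model.plan into models/resnet152/1/ and let Triton reload the model. Because --minShapes now allows a batch of 1, perf_client's default batch size of 1 falls inside the engine's accepted range, and the original benchmark command should run without the shape error:
./perf_client -m resnet152 -u 127.0.0.1:8001 -i grpc --concurrency-range 4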