Triton server reports "The engine plan file is generated on an incompatible device"
Error message

When starting the Triton Inference Server, the following errors are reported:
I0701 02:42:42.028366 1 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 67108864
I0701 02:42:42.031240 1 model_repository_manager.cc:1065] loading: resnet152:1
E0701 02:43:00.935893 1 logging.cc:43] INVALID_CONFIG: The engine plan file is generated on an incompatible device, expecting compute 7.5 got compute 8.6, please rebuild.
E0701 02:43:00.935952 1 logging.cc:43] engine.cpp (1646) - Serialization Error in deserialize: 0 (Core engine deserialization failure)
E0701 02:43:00.993150 1 logging.cc:43] INVALID_STATE: std::exception
E0701 02:43:00.993215 1 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0701 02:43:01.002146 1 model_repository_manager.cc:1242] failed to load 'resnet152' version 1: Internal: unable to create TensorRT engine
I0701 02:43:01.002473 1 server.cc:570]
+-----------+---------+---------------------------------------------------------+
| Model | Version | Status |
+-----------+---------+---------------------------------------------------------+
| resnet152 | 1 | UNAVAILABLE: Internal: unable to create TensorRT engine |
+-----------+---------+---------------------------------------------------------+
I0701 02:43:01.002665 1 server.cc:233] Waiting for in-flight requests to complete.
I0701 02:43:01.002678 1 server.cc:248] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Solution
As the message "The engine plan file is generated on an incompatible device" states, the failure is caused by a device mismatch: a serialized TensorRT plan is tied to the compute capability of the GPU it was built on. Here the server's GPU expects compute 7.5 (Turing, e.g. an RTX 2070) while the plan was built for compute 8.6 (Ampere, e.g. an RTX 3090). Check whether the GPU used when converting the onnx model to model.plan is the same model (strictly, the same compute capability) as the GPU used to start the server; converting on an RTX 3090 and then serving on an RTX 2070 produces exactly this error. You can confirm each machine's compute capability as in the sketch below.
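A minimal way to check, assuming a reasonably recent NVIDIA driver (the compute_cap query field is not available in very old nvidia-smi versions):

nvidia-smi --query-gpu=name,compute_cap --format=csv
# Hypothetical output on the serving machine:
# name, compute_cap
# NVIDIA GeForce RTX 2070, 7.5

Run this on both the machine that built the plan and the machine that serves it; the two compute_cap values must match for the plan to deserialize.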
The fix is to regenerate model.plan with trtexec on the target GPU, i.e. the one the Triton server will actually run on, for example as sketched below.
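A hedged example, assuming the source model file is named model.onnx (the file names here are placeholders for your own paths):

trtexec --onnx=model.onnx --saveEngine=model.plan

Copy the resulting model.plan into the Triton model repository's standard <model-name>/<version>/ layout, e.g. model_repository/resnet152/1/model.plan, then restart the server on the same GPU; the compute-capability check above should now match and the engine should load cleanly.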