Error Code 1: Cuda Runtime (invalid resource handle)
Posted by AI浩
Problem description
Loading several TensorRT models at the same time produces the following error:
[12/06/2022-14:28:23] [TRT] [I] Loaded engine size: 5 MiB
[12/06/2022-14:28:23] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +3, now: CPU 0, GPU 3 (MiB)
[12/06/2022-14:28:23] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +5, now: CPU 0, GPU 8 (MiB)
[12/06/2022-14:28:23] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[12/06/2022-14:28:25] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[12/06/2022-14:28:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[12/06/2022-14:28:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[12/06/2022-14:28:27] [TRT] [E] 1: [reformat.cpp::genericReformat::executeCutensor::388] Error Code 1: CuTensor (Internal cuTensor permutate execute failed)
[12/06/2022-14:28:27] [TRT] [E] 1: [checkMacros.cpp::nvinfer1::catchCudaError::202] Error Code 1: Cuda Runtime (invalid resource handle)
Cause analysis
This error typically occurs when TensorRT is used across multiple threads, for example when the TensorRT engine is created in the main thread and inference is then run with that engine from a callback (worker) thread.
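As a minimal illustration (this sketch is not from the original post; the device index and allocation size are arbitrary assumptions), a pycuda context created in the main thread is not current in a worker thread, so CUDA calls issued there fail until the context is pushed onto that thread's context stack:

import threading
import pycuda.driver as cuda

cuda.init()
ctx = cuda.Device(0).make_context()   # current only on the main thread

def worker():
    # No context is current on this thread yet, so a plain CUDA call fails
    # with a context-related error.
    try:
        cuda.mem_alloc(1024)
    except cuda.Error as e:
        print("without pushing the context:", e)
    # Making the shared context current on this thread fixes it.
    ctx.push()
    buf = cuda.mem_alloc(1024)
    buf.free()
    ctx.pop()
    print("with the context pushed: allocation succeeded")

t = threading.Thread(target=worker)
t.start()
t.join()
ctx.pop()   # release the context on the main thread

TensorRT's execution context holds CUDA resources that belong to one CUDA context, so the same rule applies to execute_v2() calls issued from another thread.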
Solution
Import the pycuda driver package and initialize it:
import pycuda.driver as cuda0
cuda0.init()
In the class's __init__ method, add:
self.cfx = cuda0.Device(0).make_context()
In the inference code, call self.cfx.push() before running inference and self.cfx.pop() after it finishes:
self.cfx.push()
# inference code
self.context.execute_v2(list(self.binding_addrs.values()))
self.cfx.pop()
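Putting the steps together, below is a minimal consolidated sketch, not the author's full class; it assumes pycuda and tensorrt are installed, that engine_path points to an existing serialized engine, and that the caller fills binding_addrs with device pointers for the engine's bindings:

import pycuda.driver as cuda0
import tensorrt as trt

cuda0.init()

class TRTInfer:
    def __init__(self, engine_path, device_id=0):
        # Each model instance owns a dedicated CUDA context.
        self.cfx = cuda0.Device(device_id).make_context()
        logger = trt.Logger(trt.Logger.INFO)
        with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
            self.engine = runtime.deserialize_cuda_engine(f.read())
        self.context = self.engine.create_execution_context()
        self.binding_addrs = {}  # binding name -> device pointer, filled by the caller

    def infer(self):
        # Make this instance's context current on the calling thread,
        # run inference, then restore the previous context stack.
        self.cfx.push()
        try:
            self.context.execute_v2(list(self.binding_addrs.values()))
        finally:
            self.cfx.pop()

    def destroy(self):
        # Release the context when the model is no longer needed
        # (call from the thread that created the instance).
        self.cfx.pop()

Because every instance gets its own context, several models can be loaded side by side and infer() can be called from callback or worker threads; note that each make_context() call does carry some per-context GPU overhead.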