Error Code 1: Cuda Runtime (invalid resource handle)

Posted by AI浩


Problem Description

When multiple TensorRT models are loaded at the same time, the following error appears:

[12/06/2022-14:28:23] [TRT] [I] Loaded engine size: 5 MiB
[12/06/2022-14:28:23] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +3, now: CPU 0, GPU 3 (MiB)
[12/06/2022-14:28:23] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +5, now: CPU 0, GPU 8 (MiB)
[12/06/2022-14:28:23] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[12/06/2022-14:28:25] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[12/06/2022-14:28:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[12/06/2022-14:28:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[12/06/2022-14:28:27] [TRT] [E] 1: [reformat.cpp::genericReformat::executeCutensor::388] Error Code 1: CuTensor (Internal cuTensor permutate execute failed)
[12/06/2022-14:28:27] [TRT] [E] 1: [checkMacros.cpp::nvinfer1::catchCudaError::202] Error Code 1: Cuda Runtime (invalid resource handle)

Cause Analysis

This problem usually occurs when TensorRT is used from multiple threads, for example when the TensorRT engine is created in the main thread and inference is then run from a callback thread. CUDA contexts are bound to the thread they are current on, so if the engine's resources were created under the main thread's context and execute_v2 is later called from a thread with no (or a different) current context, the runtime cannot resolve the handles and reports this error.
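A minimal sketch of the failing pattern (the helper load_trt_engine and the bindings variable are hypothetical placeholders for illustration, not code from the original post):

import threading

# Hypothetical setup: load_trt_engine and bindings are placeholders only.
engine, context, bindings = load_trt_engine("model.engine")  # runs in the main thread

def worker():
    # This thread is not bound to the CUDA context in which the engine's
    # resources were created, so execute_v2 fails with
    # "Error Code 1: Cuda Runtime (invalid resource handle)".
    context.execute_v2(bindings)

threading.Thread(target=worker).start()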

Solution

Import the pycuda driver package and initialize it:

import pycuda.driver as cuda0
cuda0.init()

In the class's initializer, add:

self.cfx = cuda0.Device(0).make_context()

In the inference code, call self.cfx.push() before running inference and self.cfx.pop() after inference completes:

self.cfx.push()
# inference code
self.context.execute_v2(list(self.binding_addrs.values()))
self.cfx.pop()
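Putting the steps together, here is a minimal sketch of a wrapper class; the class name TrtInfer, the engine_path argument, and the empty binding_addrs dict are illustrative assumptions, and device-buffer allocation is omitted:

import pycuda.driver as cuda0
import tensorrt as trt

cuda0.init()

class TrtInfer:
    def __init__(self, engine_path):
        # Create a dedicated CUDA context on GPU 0; make_context() also
        # makes it current on this thread.
        self.cfx = cuda0.Device(0).make_context()
        logger = trt.Logger(trt.Logger.INFO)
        with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
            self.engine = runtime.deserialize_cuda_engine(f.read())
        self.context = self.engine.create_execution_context()
        # binding_addrs should map binding names to device pointers;
        # buffer allocation is omitted in this sketch.
        self.binding_addrs = {}
        # Pop so the context is only active while we explicitly push it.
        self.cfx.pop()

    def infer(self):
        self.cfx.push()  # bind this instance's context to the calling thread
        try:
            self.context.execute_v2(list(self.binding_addrs.values()))
        finally:
            self.cfx.pop()  # always unbind, even if inference raises

    def __del__(self):
        # Release the CUDA context when the object goes away.
        self.cfx.detach()

With this pattern, infer() can be called safely from a callback or worker thread, because the instance's own context is pushed onto that thread for the duration of the call.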
