Error in building OpenCV 2.4xx with Cuda 10.2

Posted: 2021-04-14 03:04:42

[Question]:

I am trying to build OpenCV 2.4 with CUDA 10.2, which is installed on a Jetson AGX Xavier. I followed this blog post to modify the files so that OpenCV can find all the CUDA libraries.

I run the following command to generate the CMake cache:

cmake -DCMAKE_INSTALL_PREFIX=~/lib/opencv_2.4/installed -DCMAKE_BUILD_TYPE="Release" -DWITH_CUDA=ON -DCUDA_GENERATION=Volta -D OPENCV_DNN_CUDA=ON -DCUDA_ARCH_BIN=7.5 -DCUDA_HOST_COMPILER=/usr/bin/gcc-8 -DCMAKE_C_COMPILER=gcc-8 -DCMAKE_CXX_COMPILER=g++-8 ..

I get the following errors when running make or make -j8:

[ 56%] Linking CXX executable ../../bin/opencv_perf_photo
[ 56%] Built target opencv_perf_photo
[ 56%] Built target opencv_gpu_pch_dephelp
[ 57%] Built target pch_Generate_opencv_gpu
[ 58%] Building NVCC (Device) object modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_bf_knnmatch.cu.o
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
In file included from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/functional.hpp:50:0,
                 from /home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/vec_distance.hpp:47,
                 from /home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu:49:
/usr/local/cuda-10.2/include/device_functions.h:54:2: warning: #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "device_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(60): error: identifier "__shfl" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(71): error: identifier "__shfl" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(92): error: identifier "__shfl_down" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(103): error: identifier "__shfl_down" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(124): error: identifier "__shfl_up" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(135): error: identifier "__shfl_up" is undefined

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(84): error: identifier "__shfl_down" is undefined
          detected during:
            instantiation of "T cv::gpu::device::shfl_down(T, unsigned int, int) [with T=float]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(75): here
            instantiation of "void cv::gpu::device::bf_knnmatch::findBestMatch<BLOCK_SIZE>(float &, float &, int &, int &, float *, int *) [with BLOCK_SIZE=16]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(401): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(cv::gpu::PtrStepSz<T>, cv::gpu::PtrStepSz<T>, Mask, int2 *, float2 *) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(420): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSz<int2> &, const cv::gpu::PtrStepSz<float2> &, cudaStream_t) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(852): here
            instantiation of "void cv::gpu::device::bf_knnmatch::match2Dispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1149): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchDispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, int, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1166): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchL1_gpu<T>(const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, int, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with T=cv::gpu::device::uchar]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1172): here

/home/nvidia/opencv-2.4/modules/gpu/include/opencv2/gpu/device/detail/../warp_shuffle.hpp(84): error: identifier "__shfl_down" is undefined
          detected during:
            instantiation of "T cv::gpu::device::shfl_down(T, unsigned int, int) [with T=int]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(77): here
            instantiation of "void cv::gpu::device::bf_knnmatch::findBestMatch<BLOCK_SIZE>(float &, float &, int &, int &, float *, int *) [with BLOCK_SIZE=16]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(401): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(cv::gpu::PtrStepSz<T>, cv::gpu::PtrStepSz<T>, Mask, int2 *, float2 *) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(420): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchUnrolledCached<BLOCK_SIZE,MAX_DESC_LEN,Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSz<int2> &, const cv::gpu::PtrStepSz<float2> &, cudaStream_t) [with BLOCK_SIZE=16, MAX_DESC_LEN=64, Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(852): here
            instantiation of "void cv::gpu::device::bf_knnmatch::match2Dispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1149): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchDispatcher<Dist,T,Mask>(const cv::gpu::PtrStepSz<T> &, const cv::gpu::PtrStepSz<T> &, int, const Mask &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with Dist=cv::gpu::device::L1Dist<cv::gpu::device::uchar>, T=unsigned char, Mask=cv::gpu::device::SingleMask]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1166): here
            instantiation of "void cv::gpu::device::bf_knnmatch::matchL1_gpu<T>(const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, int, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzb &, const cv::gpu::PtrStepSzf &, cudaStream_t) [with T=cv::gpu::device::uchar]" 
/home/nvidia/opencv-2.4/modules/gpu/src/cuda/bf_knnmatch.cu(1172): here

8 errors detected in the compilation of "/tmp/tmpxft_000021a9_00000000-7_bf_knnmatch.compute_72.cpp1.ii".
CMake Error at cuda_compile_generated_bf_knnmatch.cu.o.cmake:264 (message):
  Error generating file
  /home/nvidia/opencv-2.4/build/modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/./cuda_compile_generated_bf_knnmatch.cu.o


modules/gpu/CMakeFiles/opencv_gpu.dir/build.make:503: recipe for target 'modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_bf_knnmatch.cu.o' failed
make[2]: *** [modules/gpu/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_bf_knnmatch.cu.o] Error 1
CMakeFiles/Makefile2:4741: recipe for target 'modules/gpu/CMakeFiles/opencv_gpu.dir/all' failed
make[1]: *** [modules/gpu/CMakeFiles/opencv_gpu.dir/all] Error 2
Makefile:162: recipe for target 'all' failed
make: *** [all] Error 2 

I also tried -DCUDA_ARCH_BIN=7.2 and got the same error. How can I fix this?

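[Comments]:

The version of OpenCV you are trying to build does not support the version of the CUDA toolkit you have. You need to use either an older version of the CUDA toolkit or a newer version of the OpenCV source code.

[Answer 1]: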

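OpenCV 2.4 does not work with CUDA Toolkit 10.2. It was originally designed for versions 4.1 and 4.2. If you can downgrade to CUDA 9, you may be able to compile and run the code. Otherwise, you will have to rewrite these kernels to remove the deprecated instructions, which are no longer supported and will not work on your Volta GPU.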

Reference: OpenCv Compiling with Cuda
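The undefined identifiers in the log (`__shfl`, `__shfl_up`, `__shfl_down`) are the legacy warp-shuffle intrinsics. CUDA 9 introduced `_sync` variants that take an explicit lane mask, and the legacy forms are not available when compiling for compute capability 7.x, which is why the `compute_72` pass fails. As a rough illustration of the kind of kernel rewrite mentioned above, the sketch below wraps the shuffle in a version-guarded helper. The names `shfl_down_compat` and `warpSum` are hypothetical and chosen for this example only; this is not OpenCV's code, and the full mask `0xFFFFFFFF` assumes all 32 lanes of the warp take part in the shuffle.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch only (not part of OpenCV).
// Picks the synchronizing intrinsic on CUDA 9+ and the legacy one otherwise.
template <typename T>
__device__ __forceinline__ T shfl_down_compat(T val, unsigned int delta, int width = warpSize)
{
#if defined(__CUDACC_VER_MAJOR__) && __CUDACC_VER_MAJOR__ >= 9
    // Assumption: every lane of the warp participates, hence the full mask.
    return __shfl_down_sync(0xFFFFFFFFu, val, delta, width);
#else
    return __shfl_down(val, delta, width);
#endif
}

// Sums the 32 values held by one warp using a shuffle-based tree reduction.
__global__ void warpSum(const float* in, float* out)
{
    float v = in[threadIdx.x];
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        v += shfl_down_compat(v, offset);
    if (threadIdx.x == 0)
        *out = v;  // lane 0 ends up holding the total
}

int main()
{
    float h_in[32];
    for (int i = 0; i < 32; ++i)
        h_in[i] = 1.0f;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    warpSum<<<1, 32>>>(d_in, d_out);  // a single full warp

    float h_out = 0.0f;
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("warp sum = %f\n", h_out);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Applying the same idea to OpenCV 2.4 would mean editing the wrappers in modules/gpu/include/opencv2/gpu/device/warp_shuffle.hpp at the lines the compiler reports. Whether a full mask is correct at every call site has to be checked kernel by kernel, and other parts of the 2.4 gpu module may still fail to build against CUDA 10.2.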

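[Discussion]:

Note that he is working on a very new Jetson ARM embedded system. CUDA 4 (or any version before about CUDA 9) cannot work on his system.

@talonmies Thanks for the note. After further research, OpenCV 2.4 can be made to work with CUDA 9 with some hacking. I will update my answer soon.

Wow. Thanks @talonmies!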
