OpenCL is not able to detect my AMD GPU using OpenCV
I am using an AMD Radeon R9 M375. I tried following this answer, https://stackoverflow.com/a/34250412/8731839, but it did not work for me.
Here is my output from clinfo.exe:
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon (TM) R9 M375
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 10
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1015Mhz
Address bits: 32
Max memory allocation: 3019898880
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Max pipe arguments: 0
Max pipe active reservations: 0
Max pipe packet size: 0
Max global variable size: 0
Max global variable preferred total size: 0
Max read/write image args: 0
Max on device events: 0
Queue on device max size: 0
Max on device queues: 0
Queue on device preferred size: 0
SVM capabilities:
Coarse grain buffer: No
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: No
Profiling : No
Platform ID: 00007FFF209D0188
Name: Capeverde
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.2
Driver version: 2348.3
Profile: FULL_PROFILE
Version: OpenCL 1.2 AMD-APP (2348.3)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing
cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing
cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event cl_amd_liquid_flash
Device Type: CL_DEVICE_TYPE_CPU
Vendor ID: 1002h
Board name:
Max compute units: 4
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 1024
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 8
Preferred vector width double: 4
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 8
Native vector width double: 4
Max clock frequency: 2200Mhz
Address bits: 64
Max memory allocation: 2147483648
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 64
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 4096
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 32768
Global memory size: 8499593216
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Global
Local memory size: 32768
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 2147483648
Max global variable size: 1879048192
Max global variable preferred total size: 1879048192
Max read/write image args: 64
Max on device events: 0
Queue on device max size: 0
Max on device queues: 0
Queue on device preferred size: 0
SVM capabilities:
Coarse grain buffer: No
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 1
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 465
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: No
Profiling : No
Platform ID: 00007FFF209D0188
Name: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Vendor: GenuineIntel
Device OpenCL C version: OpenCL C 1.2
Driver version: 2348.3 (sse2,avx)
Profile: FULL_PROFILE
Version: OpenCL 1.2 AMD-APP (2348.3)
What works:
#include <iostream>
#include <vector>
#include <opencv2/core/ocl.hpp>

int main()
{
    std::vector<cv::ocl::PlatformInfo> platforms;
    // Note: "getPlatfomsInfo" is the actual (misspelled) name in the OpenCV API.
    cv::ocl::getPlatfomsInfo(platforms);

    // OpenCL platforms
    for (size_t i = 0; i < platforms.size(); i++)
    {
        // Access the platform
        const cv::ocl::PlatformInfo* platform = &platforms[i];

        // Platform name
        std::cout << "Platform Name: " << platform->name().c_str() << "\n";

        // Access the devices within the platform
        cv::ocl::Device current_device;
        for (int j = 0; j < platform->deviceNumber(); j++)
        {
            // Access the device
            platform->getDevice(current_device, j);

            // Device type (a bitmask; printed as an integer)
            int deviceType = current_device.type();

            // Prints the number of devices on the platform, not the index j
            std::cout << "Device Number: " << platform->deviceNumber() << std::endl;
            std::cout << "Device Type: " << deviceType << std::endl;
        }
    }
    return 0;
}
The code above prints:
Platform Name: Intel(R) OpenCL
Device Number: 2
Device Type: 2
Device Number: 2
Device Type: 4
Platform Name: AMD Accelerated Parallel Processing
Device Number: 2
Device Type: 4
Device Number: 2
Device Type: 2
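The numeric Device Type values printed above are OpenCL device-type bit flags: 2 is CL_DEVICE_TYPE_CPU and 4 is CL_DEVICE_TYPE_GPU. The AMD platform therefore lists its GPU first and the CPU second, which agrees with the clinfo output. A minimal sketch of decoding these values follows; describeDeviceType is a hypothetical helper introduced here for illustration, not an OpenCV function.

```cpp
#include <string>

// Decode an OpenCL device-type bitmask into a readable label.
// The bit positions follow the OpenCL specification's cl_device_type values
// (DEFAULT = 1, CPU = 2, GPU = 4, ACCELERATOR = 8).
// describeDeviceType is a hypothetical helper, not part of OpenCV.
std::string describeDeviceType(int type)
{
    std::string out;
    // Join multiple set flags with '|'.
    auto append = [&out](const char* label) {
        if (!out.empty()) out += "|";
        out += label;
    };
    if (type & (1 << 0)) append("DEFAULT");
    if (type & (1 << 1)) append("CPU");
    if (type & (1 << 2)) append("GPU");
    if (type & (1 << 3)) append("ACCELERATOR");
    return out.empty() ? "UNKNOWN" : out;
}
```

With such a helper, describeDeviceType(current_device.type()) could replace the raw integer printed in the loop above.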
How do I create a context that uses the AMD device as my GPU? The linked post says to use the method initializeContextFromHandler, but the OpenCV documentation for it is insufficient.
Answer
The problem is solved. I don't know exactly what I did, but the AMD GPU is working now.
Current setup (on Windows):
- Environment variable:
  Name: OPENCV_OPENCL_DEVICE Value: AMD:GPU:Capeverde
- Use setUseOpenCL(bool foo) from ocl.hpp to choose whether to use the GPU or the CPU.
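The value of OPENCV_OPENCL_DEVICE is a colon-separated triple: platform, device type, then device name or index. The sketch below only illustrates that layout by splitting the string used in this answer. DeviceSelector and parseDeviceSelector are hypothetical names introduced for illustration; OpenCV does its own parsing internally.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the three fields of an OPENCV_OPENCL_DEVICE value.
struct DeviceSelector
{
    std::string platform;  // e.g. "AMD"
    std::string type;      // e.g. "GPU"
    std::string device;    // e.g. "Capeverde"
};

// Split "<platform>:<type>:<device>" on ':'; missing fields stay empty.
// This is an illustration of the format, not OpenCV's own parser.
DeviceSelector parseDeviceSelector(const std::string& value)
{
    std::vector<std::string> parts;
    std::stringstream ss(value);
    std::string field;
    while (std::getline(ss, field, ':'))
        parts.push_back(field);

    DeviceSelector sel;
    if (parts.size() > 0) sel.platform = parts[0];
    if (parts.size() > 1) sel.type = parts[1];
    if (parts.size() > 2) sel.device = parts[2];
    return sel;
}
```

For the value in this answer, the fields are "AMD", "GPU" and "Capeverde". "Capeverde" is the device Name that clinfo reports for the Radeon R9 M375, and "AMD" presumably matches the "AMD Accelerated Parallel Processing" platform. The variable most likely has to be set before the program starts, since OpenCV reads it when it initializes OpenCL.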
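Most likely cause: my actual code was not performing any computation. As soon as I wrote a simple program that subtracts two matrices, the AMD GPU started working.
Code: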
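#include <cstdio>
#include <iostream>
#include <opencv2/core/ocl.hpp>
#include <opencv2/opencv.hpp>

int main() {
    // UMat operations can be dispatched to OpenCL when it is enabled
    cv::UMat mat1 = cv::UMat::ones(10, 10, CV_32F);
    cv::UMat mat2 = cv::UMat::zeros(10, 10, CV_32F);
    cv::UMat output = cv::UMat(10, 10, CV_32F);

    // An actual computation; this is what made the AMD GPU start working
    cv::subtract(mat1, mat2, output);
    std::cout << output << "\n";

    // Keep the console window open until a key is pressed
    std::getchar();
    return 0;
}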