Training on the GPU with TensorFlow 2.5.0 under WSL on Windows 10

Posted 2022-07-12 lyz_fish

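Install WSL on Windows

First, open Control Panel -> Turn Windows features on or off, then check these two options: Windows Subsystem for Linux and Virtual Machine Platform.

Download the Ubuntu 18.04 WSL distribution from the Microsoft Store and upgrade it to WSL 2

GPU access from the subsystem is only supported on WSL 2, so check the WSL version first: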

wsl -l -v

The output looks like this; the distribution is named Ubuntu-18.04.

PS C:\Users\Administrator> wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         1

Upgrade the distribution to WSL 2:

wsl --set-version Ubuntu-18.04 2

Restart, after which the output should be:

PS C:\Users\Administrator> wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         2
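The version check above can also be scripted. The following is a minimal sketch, not part of the original steps: `parse_wsl_list` and `read_wsl_list` are hypothetical helper names, and the UTF-16LE decoding reflects how `wsl.exe` usually encodes its output on the Windows host, which is an assumption worth verifying on your machine.

```python
import subprocess


def parse_wsl_list(text):
    """Parse the table printed by `wsl -l -v` into {distribution: version}."""
    versions = {}
    for line in text.splitlines():
        # Drop the '*' that marks the default distribution, then split columns.
        parts = line.replace("*", " ").split()
        # Data rows end in a numeric VERSION column; the header row does not.
        if len(parts) >= 3 and parts[-1].isdigit():
            versions[parts[0]] = int(parts[-1])
    return versions


def read_wsl_list():
    """Run wsl.exe on the Windows host (assumed to emit UTF-16LE text)."""
    raw = subprocess.run(["wsl", "-l", "-v"], capture_output=True, check=True).stdout
    return raw.decode("utf-16-le")


sample = (
    "  NAME            STATE           VERSION\n"
    "* Ubuntu-18.04    Running         1\n"
)
print(parse_wsl_list(sample))  # {'Ubuntu-18.04': 1}
```

Any distribution that maps to 1 still needs `wsl --set-version` before the GPU can be used.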

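On the Windows host, install the NVIDIA GPU driver:

Manually select the driver for your discrete GPU (e.g. GTX 1080) on the official site: https://www.nvidia.com/Download/index.aspx?lang=en-us

Check that the installation succeeded: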

nvidia-smi
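If you want the same check from a script, `nvidia-smi` has a CSV query mode. This is a rough sketch: `parse_gpu_csv` and `query_gpus` are hypothetical helper names, and returning an empty list when the tool is missing is just one possible design choice.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Turn 'name, driver_version' CSV rows into (name, driver) tuples."""
    rows = []
    for line in text.strip().splitlines():
        # Split on the last comma so commas in a GPU name would survive.
        name, _, driver = line.rpartition(",")
        if name:
            rows.append((name.strip(), driver.strip()))
    return rows


def query_gpus():
    """Return [] when nvidia-smi is not on PATH instead of raising."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)
```

Calling `query_gpus()` returns one tuple per GPU; an empty list suggests the driver is not installed or not visible.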

Install gcc and other dependencies

sudo apt-get update
sudo apt-get upgrade
sudo apt install gcc
gcc -v
ls /usr/bin/gcc*
sudo apt-get install make
make -v
sudo apt install vim

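Manually install the CUDA version that matches your TensorFlow version:

With tensorflow-gpu==2.5.0, the matching CUDA version is 11.2.
Download the corresponding installer: https://developer.nvidia.com/cuda-11.2.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=WSLUbuntu&target_version=20&target_type=runfilelocal
Run the following. If the installer offers to install a driver, deselect it; under WSL 2 the driver installed on the Windows host is the one in use.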

sudo wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
sudo sh cuda_11.2.0_460.27.04_linux.run
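The TensorFlow-to-CUDA pairing can be kept as a small lookup so a script can warn on a mismatch. This sketch copies a few rows from the tested build configurations in TensorFlow's install documentation; `TESTED_BUILDS` and `required_toolkit` are hypothetical names, and the table is deliberately incomplete.

```python
# A few rows from TensorFlow's "tested build configurations" table (GPU, Linux).
# Deliberately incomplete; extend it for the versions you care about.
TESTED_BUILDS = {
    "2.4.0": {"cuda": "11.0", "cudnn": "8.0"},
    "2.5.0": {"cuda": "11.2", "cudnn": "8.1"},
    "2.6.0": {"cuda": "11.2", "cudnn": "8.1"},
}


def required_toolkit(tf_version):
    """Return the tested CUDA/cuDNN pair, or None if the version is not listed."""
    return TESTED_BUILDS.get(tf_version)


print(required_toolkit("2.5.0"))  # {'cuda': '11.2', 'cudnn': '8.1'}
```

With TensorFlow already installed, `tf.sysconfig.get_build_info()` reports the versions the binary was built against, which is a more direct check than a hand-maintained table. Note that this article installs cuDNN 8.0.4 and its log shows libcudnn.so.8 loading, so a nearby 8.x release evidently worked here even though 8.1 is the listed one.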

Configure the environment

vim ~/.bashrc

Append the following at the end:

export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
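These two exports appear to follow the `${VAR:+:${VAR}}` pattern from NVIDIA's post-install instructions, where the colon and the old value are appended only if the variable is already set and non-empty. The sketch below mimics that rule in Python purely to illustrate it; `prepend_path` is a hypothetical helper, not something the shell or CUDA provides.

```python
def prepend_path(new_dir, current):
    """Mimic  new_dir${VAR:+:${VAR}}  from the shell.

    ${VAR:+word} expands to word only when VAR is set and non-empty,
    so no trailing ':' is produced when the variable was empty.
    """
    return new_dir + (":" + current if current else "")


print(prepend_path("/usr/local/cuda-11.2/bin", "/usr/bin:/bin"))
print(prepend_path("/usr/local/cuda-11.2/lib64", ""))
```

The second call shows why the pattern matters for LD_LIBRARY_PATH, which is often unset: a bare trailing colon would add an empty entry, which the dynamic loader treats as the current directory.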

Apply the changes

source ~/.bashrc
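To confirm the new entries are in effect, you can inspect the environment from Python. This is a small sketch assuming the install prefix /usr/local/cuda-11.2 used in this article; `CUDA_HOME` and `missing_entries` are hypothetical names.

```python
import os

CUDA_HOME = "/usr/local/cuda-11.2"  # install prefix used in this article


def missing_entries(env):
    """Return the CUDA directories absent from PATH / LD_LIBRARY_PATH."""
    wanted = {
        "PATH": CUDA_HOME + "/bin",
        "LD_LIBRARY_PATH": CUDA_HOME + "/lib64",
    }
    missing = {}
    for var, directory in wanted.items():
        # Ignore empty entries and trailing slashes when comparing.
        entries = [e.rstrip("/") for e in env.get(var, "").split(":") if e]
        if directory not in entries:
            missing[var] = directory
    return missing


print(missing_entries(os.environ))
```

An empty dictionary means both variables contain the CUDA directories in the current shell session.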

Install cuDNN

Download and extract:

sudo wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.0.4/11.1_20200923/cudnn-11.1-linux-x64-v8.0.4.30.tgz
sudo tar zxvf ./cudnn-11.1-linux-x64-v8.0.4.30.tgz -C ./

Copy the files into the CUDA installation directory:

cd cuda
sudo cp ./include/* /usr/local/cuda-11.2/include/
sudo cp -P ./lib64/* /usr/local/cuda-11.2/lib64/
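One way to check that the headers landed in the right place is to read the version macros from cudnn_version.h, which cuDNN 8.x ships. This is a sketch under that assumption; `parse_cudnn_version` and `installed_cudnn` are hypothetical helper names.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + key + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn(cuda_home="/usr/local/cuda-11.2"):
    """Return the cuDNN version found under cuda_home, or None if absent."""
    header = Path(cuda_home) / "include" / "cudnn_version.h"
    if not header.is_file():
        return None
    return parse_cudnn_version(header.read_text())


print(installed_cudnn())
```

A result of None suggests the header copy did not succeed or went to a different prefix.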

Test:

import tensorflow as tf
print(tf.test.is_gpu_available())

WARNING:tensorflow:From <ipython-input-3-fbc346170d61>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2022-07-10 22:05:12.272233: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-10 22:05:12.292230: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-07-10 22:05:12.595858: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.595919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2022-07-10 22:05:12.595943: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-10 22:05:12.599380: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-07-10 22:05:12.599510: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-07-10 22:05:12.600670: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-07-10 22:05:12.601375: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-07-10 22:05:12.605063: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-07-10 22:05:12.606022: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-07-10 22:05:12.606149: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-07-10 22:05:12.606835: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.607424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.607467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2022-07-10 22:05:12.607501: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-10 22:07:05.377901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-07-10 22:07:05.377952: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2022-07-10 22:07:05.377962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2022-07-10 22:07:05.379881: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.380483: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.380519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1501] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2022-07-10 22:07:05.381023: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.381084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 6585 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[3]: True
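As the deprecation warning in the log suggests, `tf.config.list_physical_devices('GPU')` is the current way to run this check. A minimal sketch follows; `list_gpus` is a hypothetical wrapper that returns an empty list when TensorFlow is not installed, so it can be run safely in any environment.

```python
import importlib.util


def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or [] without TensorFlow."""
    if importlib.util.find_spec("tensorflow") is None:
        return []
    import tensorflow as tf  # imported lazily because it is slow to load
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print(list_gpus())
```

A non-empty list corresponds to the `True` result above; an empty list means TensorFlow fell back to the CPU.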

Reference[^1]: https://www.jianshu.com/p/be669d9359e2