Training on the GPU with TensorFlow 2.5.0 under WSL on Windows 10

Posted by lyz_fish

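Installing WSL on Windows

First, open Control Panel –> Turn Windows features on or off, then enable these two options: Windows Subsystem for Linux and Virtual Machine Platform.

Download the Ubuntu 18.04 distribution from the Microsoft Store and upgrade it to WSL 2

GPU access from the subsystem is only supported on WSL 2, so check the WSL version first: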

wsl -l -v
  • The output looks like this; the distribution is named Ubuntu-18.04:
PS C:\Users\Administrator> wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         1
  • Upgrade the distribution to WSL 2:
wsl --set-version Ubuntu-18.04 2
  • Restart; the output should then be:
PS C:\Users\Administrator> wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         2
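
The version check above can also be scripted. The following is a minimal sketch, not part of the original post: `parse_wsl_list` is a hypothetical helper that reads text in the `wsl -l -v` layout shown above and returns the WSL version of each distribution. It is fed a pasted sample here; if you capture the output of `wsl.exe` from a script instead, you may need to decode it as UTF-16 first.

```python
def parse_wsl_list(output: str) -> dict:
    """Map each distribution name to its WSL version (1 or 2)."""
    versions = {}
    for line in output.splitlines():
        # Drop the "*" that marks the default distribution.
        fields = line.replace("*", " ").split()
        # Data rows are NAME STATE VERSION; the header's last field is not a digit.
        if len(fields) >= 3 and fields[-1].isdigit():
            versions[fields[0]] = int(fields[-1])
    return versions


sample = """\
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         1
"""

versions = parse_wsl_list(sample)
print(versions)
for name, version in versions.items():
    if version != 2:
        print(f"{name} is on WSL {version}; run: wsl --set-version {name} 2")
```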

On the Windows host, download and install the NVIDIA CUDA driver:

  • Check that the installation succeeded:
nvidia-smi
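
If you would rather run this check from a script, a small wrapper around `nvidia-smi` is one option. This is a sketch under assumptions: `query_gpus` is a hypothetical helper name, and it relies on `nvidia-smi` being on the `PATH` and on its `--query-gpu` / `--format` options. It returns `None` when the tool is missing or fails, rather than raising.

```python
import shutil
import subprocess


def query_gpus():
    """Return one 'name, driver_version' string per GPU, or None if nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = query_gpus()
if gpus is None:
    print("nvidia-smi not found or failed; check the driver installation")
else:
    print(gpus)
```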

Installing gcc and related dependencies

sudo apt-get update
sudo apt-get upgrade
sudo apt install gcc
gcc -v
ls /usr/bin/gcc*
sudo apt-get install make
make -v
sudo apt install vim 

Manually install the CUDA version that matches your TensorFlow version (TensorFlow 2.5.0 is built against CUDA 11.2):

sudo wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
sudo sh cuda_11.2.0_460.27.04_linux.run
  • Configure the environment:
sudo vim ~/.bashrc
  • Append the following lines at the end:
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
  • Apply the changes:
source ~/.bashrc
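
The two export lines are meant to put the CUDA directories in front of whatever the variables already contain. In shell, the idiom `${VAR:+:${VAR}}` expands to a colon plus the old value only when the variable is set and non-empty, so no stray empty entry is left behind when `LD_LIBRARY_PATH` starts out unset (the dynamic loader treats an empty entry as the current directory). The sketch below imitates that rule in Python purely for illustration; `prepend_dir` is a hypothetical helper, not something the shell or CUDA provides.

```python
def prepend_dir(new_dir: str, current: str) -> str:
    """Mimic NEW${VAR:+:${VAR}}: append ':' + current only when current is non-empty."""
    return new_dir + (":" + current if current else "")


# Existing PATH: the CUDA bin directory goes in front, separated by a colon.
print(prepend_dir("/usr/local/cuda-11.2/bin", "/usr/bin:/bin"))
# → /usr/local/cuda-11.2/bin:/usr/bin:/bin

# Unset or empty LD_LIBRARY_PATH: no trailing colon is added.
print(prepend_dir("/usr/local/cuda-11.2/lib64", ""))
# → /usr/local/cuda-11.2/lib64
```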

Installing cuDNN

  • Download and extract:
sudo wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.0.4/11.1_20200923/cudnn-11.1-linux-x64-v8.0.4.30.tgz
sudo tar zxvf ./cudnn-11.1-linux-x64-v8.0.4.30.tgz -C ./
  • Copy the files into the matching CUDA installation directory:
cd cuda
sudo cp ./include/* /usr/local/cuda-11.2/include/
sudo cp -P ./lib64/* /usr/local/cuda-11.2/lib64/
  • Test:
import tensorflow as tf
print(tf.test.is_gpu_available())
WARNING:tensorflow:From <ipython-input-3-fbc346170d61>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2022-07-10 22:05:12.272233: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-10 22:05:12.292230: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-07-10 22:05:12.595858: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.595919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2022-07-10 22:05:12.595943: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-10 22:05:12.599380: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-07-10 22:05:12.599510: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-07-10 22:05:12.600670: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-07-10 22:05:12.601375: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-07-10 22:05:12.605063: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-07-10 22:05:12.606022: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-07-10 22:05:12.606149: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-07-10 22:05:12.606835: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.607424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.607467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2022-07-10 22:05:12.607501: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-10 22:07:05.377901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-07-10 22:07:05.377952: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2022-07-10 22:07:05.377962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2022-07-10 22:07:05.379881: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.380483: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.380519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1501] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2022-07-10 22:07:05.381023: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.381084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 6585 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[3]: True
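
As the deprecation warning in the log says, `tf.config.list_physical_devices('GPU')` is the recommended replacement for `tf.test.is_gpu_available`. A minimal sketch of the same check with the newer call follows; `list_gpus` is a hypothetical wrapper added here so the snippet also runs, returning an empty list, on a machine where TensorFlow is not installed.

```python
def list_gpus():
    """Return the GPUs TensorFlow can see, or an empty list if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return tf.config.list_physical_devices("GPU")


gpus = list_gpus()
print(f"{len(gpus)} GPU(s) visible:", gpus)
```

An empty list means TensorFlow will fall back to the CPU; in that case recheck the driver, CUDA, and cuDNN steps above.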

Reference[^1]: https://www.jianshu.com/p/be669d9359e2
