Training on the GPU with TensorFlow 2.5.0 under WSL on Windows 10
Posted by lyz_fish
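Install WSL on Windows
First, open Control Panel -> "Turn Windows features on or off" and tick two options: "Windows Subsystem for Linux" and "Virtual Machine Platform".
Download Ubuntu 18.04 from the Microsoft Store and upgrade it to WSL 2
GPU access from the subsystem is only supported on WSL 2, so first check the WSL version: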
wsl -l -v
- The output looks like this; the distribution is named Ubuntu-18.04:
PS C:\Users\Administrator> wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         1
- Upgrade the distribution to WSL 2:
wsl --set-version Ubuntu-18.04 2
- Restart; the output should then be:
PS C:\Users\Administrator> wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         2
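The version check can also be scripted. The following is a minimal Python sketch, not part of the original steps: the helper name `parse_wsl_list` is hypothetical, and it assumes the three-column `wsl -l -v` layout shown above, with a leading `*` marking the default distribution. When the output of `wsl.exe` is captured from a script it typically arrives as UTF-16LE and has to be decoded first; that step is left out here.

```python
def parse_wsl_list(text):
    """Map each distribution name in `wsl -l -v` output to its WSL version."""
    versions = {}
    for line in text.splitlines():
        # Drop the "*" that marks the default distribution.
        fields = line.replace("*", " ").split()
        # Data rows have exactly three columns and a numeric version;
        # this skips the header row and blank lines.
        if len(fields) == 3 and fields[2].isdigit():
            versions[fields[0]] = int(fields[2])
    return versions


sample = """\
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         1
"""
print(parse_wsl_list(sample))
```

For the sample text this reports version 1 for Ubuntu-18.04, which means the distribution still has to be converted with `wsl --set-version`.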
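On the Windows host, download the NVIDIA driver:
- On the official site, manually select the driver for your discrete GPU (e.g. GTX 1080): https://www.nvidia.com/Download/index.aspx?lang=en-us
- Check that the installation succeeded: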
nvidia-smi
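The same check can be done from a script. This is a sketch under assumptions: the function name `query_gpus` is made up for illustration, and it relies on the `--query-gpu` and `--format` options of `nvidia-smi` to print one line per GPU. It returns an empty list when `nvidia-smi` is missing or fails.

```python
import shutil
import subprocess


def query_gpus():
    """Return one 'name, driver_version' string per GPU, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools are not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # nvidia-smi is present but could not talk to the driver
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(query_gpus() or "no NVIDIA GPU visible")
```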
Install gcc and the other build dependencies
sudo apt-get update
sudo apt-get upgrade
sudo apt install gcc
gcc -v
ls /usr/bin/gcc*
sudo apt-get install make
make -v
sudo apt install vim
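To confirm that the tools installed above are actually on the `PATH`, a small check like the following may help. It is only a sketch; the helper name `missing_tools` and its default tool list are chosen here for illustration.

```python
import shutil


def missing_tools(tools=("gcc", "make", "vim")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# An empty list means every tool was found.
print(missing_tools())
```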
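Manually install the CUDA version that matches your TensorFlow version:
- With tf-gpu == 2.5.0, the matching CUDA version to download is 11.2
- Download the matching CUDA Toolkit installer: https://developer.nvidia.com/cuda-11.2.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=WSLUbuntu&target_version=20&target_type=runfilelocal
- Run the following (under WSL 2 the GPU driver comes from the Windows host, so deselect the bundled driver when the installer asks which components to install):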
sudo wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
sudo sh cuda_11.2.0_460.27.04_linux.run
- Configure the environment:
vim ~/.bashrc
- Append at the end:
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
- Apply the changes:
source ~/.bashrc
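Once the environment variables are in place, `nvcc --version` should be found on the `PATH` and report release 11.2. The sketch below automates that comparison. The names `parse_nvcc_release`, `installed_cuda_release`, and `REQUIRED_CUDA` are hypothetical, and the parser assumes the usual `release X.Y` wording in the last line of the `nvcc` output.

```python
import re
import shutil
import subprocess

REQUIRED_CUDA = "11.2"  # from the step above: tf-gpu 2.5.0 pairs with CUDA 11.2


def parse_nvcc_release(output):
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc if it is on PATH and return its release, else None."""
    if shutil.which("nvcc") is None:
        return None  # PATH was not updated, or the toolkit is not installed
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


release = installed_cuda_release()
print(f"found CUDA {release}, expected {REQUIRED_CUDA}")
```

If this reports `None`, the most likely causes are that `~/.bashrc` has not been re-sourced in the current shell or that the toolkit was installed under a different prefix.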
Install cuDNN
- Download and extract:
sudo wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.0.4/11.1_20200923/cudnn-11.1-linux-x64-v8.0.4.30.tgz
sudo tar zxvf ./cudnn-11.1-linux-x64-v8.0.4.30.tgz -C ./
- Copy the files into the matching CUDA installation directory:
cd cuda
sudo cp ./include/* /usr/local/cuda-11.2/include/
sudo cp ./lib64/* /usr/local/cuda-11.2/lib64/
- Test:
import tensorflow as tf
print(tf.test.is_gpu_available())
WARNING:tensorflow:From <ipython-input-3-fbc346170d61>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2022-07-10 22:05:12.272233: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-10 22:05:12.292230: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-07-10 22:05:12.595858: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.595919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2022-07-10 22:05:12.595943: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-10 22:05:12.599380: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-07-10 22:05:12.599510: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-07-10 22:05:12.600670: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-07-10 22:05:12.601375: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-07-10 22:05:12.605063: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-07-10 22:05:12.606022: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-07-10 22:05:12.606149: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-07-10 22:05:12.606835: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.607424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:05:12.607467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2022-07-10 22:05:12.607501: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-07-10 22:07:05.377901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-07-10 22:07:05.377952: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2022-07-10 22:07:05.377962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2022-07-10 22:07:05.379881: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.380483: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.380519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1501] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-07-10 22:07:05.381023: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-07-10 22:07:05.381084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 6585 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Out[3]: True
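The deprecation warning in the log recommends `tf.config.list_physical_devices('GPU')` instead of `tf.test.is_gpu_available`. A minimal sketch of that replacement follows. The wrapper `list_gpus` is a hypothetical helper, and it returns `None` when TensorFlow cannot be imported so that the snippet still runs in an environment without it.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    # Recommended replacement for the deprecated tf.test.is_gpu_available().
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(list_gpus())
```

An empty list means TensorFlow loaded but found no usable GPU, which usually points to a CUDA or cuDNN library that could not be opened.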
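Because this setup reported errors, cuDNN had to be reinstalled with the build that matches CUDA 11.2 (cuDNN 8.1.0.77): https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.0.77/11.2_20210127/cudnn-11.2-linux-x64-v8.1.0.77.tgz
After extracting it as before, copy the files again: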
cd cuda
sudo cp ./include/* /usr/local/cuda-11.2/include/
sudo cp ./lib64/* /usr/local/cuda-11.2/lib64/
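After copying, it can be worth confirming which cuDNN version actually ended up in the CUDA directory. cuDNN 8 records its version in `cudnn_version.h`, and TensorFlow's tested build configurations list cuDNN 8.1 for version 2.5.0. The sketch below reads that header; the function name `parse_cudnn_version` is hypothetical, and the header path is an assumption based on the copy commands above.

```python
import re
from pathlib import Path

# Assumed install location, taken from the copy commands above.
CUDNN_HEADER = Path("/usr/local/cuda-11.2/include/cudnn_version.h")


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn_version.h, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{key}\s+(\d+)", header_text)
        if match is None:
            return None  # not a cuDNN version header
        parts.append(int(match.group(1)))
    return tuple(parts)


if CUDNN_HEADER.exists():
    print(parse_cudnn_version(CUDNN_HEADER.read_text()))
else:
    print(f"{CUDNN_HEADER} not found")
```

A result whose first two numbers are 8 and 1 matches the cuDNN 8.1.0.77 archive used in this step.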
Reference[^1]: https://www.jianshu.com/p/be669d9359e2