Ubuntu 16.04 + CUDA 8.0 + cuDNN v5.1 + TensorFlow + GT 840M: Installation Notes
I recently reinstalled my system and set up a TensorFlow environment. These notes summarize the process.
References
http://blog.csdn.net/ZWX2445205419/article/details/69429518
http://blog.csdn.net/u013294888/article/details/56666023
http://www.2cto.com/kf/201612/578337.html
http://blog.csdn.net/10km/article/details/61915535
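NVIDIA driver installation guide
https://wiki.ubuntu.com.cn/NVIDIA
Look up the NVIDIA driver version
http://www.nvidia.com/Download/index.aspx?lang=en-us
Check whether the GPU supports CUDA
https://developer.nvidia.com/cuda-gpus
GeForce 840M: compute capability 5.0
Nutstore (Jianguoyun) cloud drive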
https://www.jianguoyun.com/s/downloads/linux
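Step 1: Install the NVIDIA driver
0 Disable Secure Boot. This is the most critical step; without it, none of the later steps will install.
1 NVIDIA graphics driver information
(1) Check the graphics card model
The graphics driver is installed first. Start by identifying your graphics card: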
lspci | grep -i vga
lspci | grep -i nvidia
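Then check which driver module is loaded:
lsmod | grep -i nvidia
# Show your system information
uname -m && cat /etc/*release
# Show the kernel version
uname -r
# Install the kernel headers and development packages for the running kernel
sudo apt-get install linux-headers-$(uname -r)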
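2 Blacklist nouveau
The nouveau driver that ships with Ubuntu interferes with the CUDA installation, and handling it incorrectly can lead to a black screen or a login loop.
Run in a terminal:
lsmod | grep nouveau
Any output means nouveau is currently loaded.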
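As a small sketch of the check above, the lsmod test can be wrapped in a helper that reads lsmod-style text and reports whether a given module is listed. The name module_listed is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper (not a standard command): reads lsmod-style
# output on stdin and succeeds if the module named in $1 is listed.
module_listed() {
    # lsmod prints the module name in the first column; compare it exactly
    # so that e.g. "nvidia_drm" does not count as "nvidia".
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

A possible use is: lsmod | module_listed nouveau && echo "nouveau is still loaded"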
Method 1
2.1 Create /etc/modprobe.d/blacklist-nouveau.conf containing:
blacklist nouveau
options nouveau modeset=0
Method 2
2.2 Alternatively, blacklist the open-source drivers that may cause problems.
Edit /etc/modprobe.d/blacklist.conf:
sudo gedit /etc/modprobe.d/blacklist.conf
Add the following:
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
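Method 1 can also be scripted. The sketch below writes the two blacklist lines to whatever path it is given, so it can be tried on a scratch file first; write_nouveau_blacklist is a hypothetical name. On Ubuntu the initramfs usually has to be regenerated (sudo update-initramfs -u) and the machine rebooted before the blacklist takes effect. That step is an addition here and is not covered by the original notes.

```shell
# Hypothetical helper: write the Method 1 blacklist entries to the file
# given as $1 (overwrites the file if it already exists).
write_nouveau_blacklist() {
    target="$1"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}
```

One way to use it is to write to a scratch file such as /tmp/blacklist-nouveau.conf, inspect the result, and then copy it into place with sudo cp before regenerating the initramfs.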
3 Uninstall any previously installed NVIDIA driver
sudo apt-get remove --purge 'nvidia-*'
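4 Install the NVIDIA driver
On Ubuntu 16.04 switching drivers is straightforward: open System Settings -> Software & Updates -> Additional Drivers, select the latest NVIDIA driver, click Apply Changes, and reboot.
After the reboot, run: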
nvidia-smi
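If your GPU appears in the list, the driver was installed successfully.
You can also check by running:
nvidia-settings
If the NVIDIA settings window opens, the driver installation is complete.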
Step 2: Install CUDA 8.0
1 Install from the .run file on the command line
sudo sh cuda_8.0.61_375.26_linux.run
The installer may print messages like the following:
Installing the CUDA Toolkit in /usr/local/cuda-8.0 …
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
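Install the missing libraries with:
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
The installer then asks the following questions. The answers used here are shown; the bundled driver is declined because a driver was already installed above.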
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: n
Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/maddock ]:
Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Installing the CUDA Samples in /home/maddock ...
Copying samples to /home/maddock/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/maddock
Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_20707.log
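The warning above says CUDA 8.0 needs a driver of version 361.00 or newer. A quick way to compare version strings is sketched below. version_ge is a hypothetical helper, and it relies on the -V (version sort) option of GNU sort, which is available on Ubuntu.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal
# to version $2. Assumes GNU sort (for the -V version-sort option).
version_ge() {
    # After a version sort, the first line is the lower of the two versions.
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example, version_ge 375.26 361.00 succeeds. The installed driver version can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader and passed in as the first argument.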
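2 Set the environment variables
Edit ~/.bashrc (sudo is not needed for a file in your own home directory):
vim ~/.bashrc
Add the following lines:
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Another commonly used variant, based on the /usr/local/cuda symlink:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
Then reload the configuration:
source ~/.bashrc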
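The ${PATH:+:${PATH}} form in the export lines appends a colon and the old value only when the variable is already set and non-empty. This avoids a stray leading or trailing colon, which would add an empty entry that is treated as the current directory. A minimal sketch of the same expansion, with a hypothetical function name:

```shell
# Hypothetical helper: print directory $1 placed in front of the
# colon-separated list $2, without a trailing colon when $2 is empty.
prepend_dir() {
    existing="$2"
    # ${existing:+:${existing}} expands to ":<list>" only if the list is non-empty.
    printf '%s\n' "$1${existing:+:${existing}}"
}
```

With an empty list the result is just the directory; with an existing list the directory is joined to it by a single colon.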
To test the CUDA samples, run the following commands:
cd /usr/local/cuda-8.0/samples
sudo make all
cd ./1_Utilities/deviceQuery
sudo make
./deviceQuery
During the build, the following error appeared:
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
Makefile:381: recipe for target 'cudaDecodeGL' failed
https://askubuntu.com/questions/891003/failure-in-running-cuda-sample-after-cuda-8-0-installation
http://www.caffecn.cn/?/question/1109
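First find which files in the samples directory reference the driver package name:
grep "nvidia-367" -r ./
Change UBUNTU_PKG_NAME = "nvidia-367" to UBUNTU_PKG_NAME = "nvidia-375" (use the version of your installed driver):
sudo sed -i "s/nvidia-367/nvidia-375/g" `grep nvidia-367 -rl .`
Then rebuild:
sudo make
When everything has compiled, go to samples/bin/x86_64/linux/release and run deviceQuery with sudo:
sudo ./deviceQuery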
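Check the CUDA version:
nvcc -V
3 Install cuDNN
The download is a .tgz archive; extract it with tar:
tar -xvf cudnn-8.0-linux-x64-v6.0.tgz
Installing cuDNN is simple: after extracting, copy the files into the corresponding CUDA directories.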
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
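When upgrading or replacing an existing cuDNN installation (here with v5.1):
tar zxvf cudnn-8.0-linux-x64-v5.1.tgz # extract
cd cuda/include # enter the include directory
sudo cp cudnn.h /usr/local/cuda/include/ # copy the header
cd ../lib64 # enter the lib64 directory
sudo cp lib* /usr/local/cuda/lib64/ # copy the shared libraries
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5 # remove the existing links
# When creating the two symlinks below, check the actual name of the
# libcudnn.so.5.x.y file in /usr/local/cuda/lib64/ on your machine
# (for example libcudnn.so.5.1.5, libcudnn.so.5.0.5 or libcudnn.so.5.1.10)
# and use that name. Run only the pair that matches your file.
# If the file is libcudnn.so.5.1.5:
sudo ln -s libcudnn.so.5.1.5 libcudnn.so.5 # create the symlink
sudo ln -s libcudnn.so.5 libcudnn.so # create the symlink
# If the file is libcudnn.so.5.1.10:
sudo ln -s libcudnn.so.5.1.10 libcudnn.so.5 # create the symlink
sudo ln -s libcudnn.so.5 libcudnn.so # create the symlink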
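Since the exact file name differs between cuDNN releases, the link step can look the file up instead of hard-coding it. This is a sketch only: link_cudnn is a hypothetical helper, it assumes exactly one fully versioned libcudnn.so.MAJOR.* file in the directory, and it needs write permission there (for /usr/local/cuda/lib64 that means running it from a root shell).

```shell
# Hypothetical helper: in directory $1, point libcudnn.so.$2 and
# libcudnn.so at the single fully versioned libcudnn.so.$2.* file.
link_cudnn() {
    dir="$1"
    major="$2"
    real=""
    count=0
    for f in "$dir"/libcudnn.so."$major".*; do
        [ -e "$f" ] || continue          # the glob matched nothing
        real=$(basename "$f")
        count=$((count + 1))
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one libcudnn.so.$major.* in $dir, found $count" >&2
        return 1
    fi
    # Relative link targets, as in the ln -s commands above;
    # -f replaces any existing link of the same name.
    ln -sf "$real" "$dir/libcudnn.so.$major"
    ln -sf "libcudnn.so.$major" "$dir/libcudnn.so"
}
```

Refusing to guess when several versioned files are present is a deliberate choice here, since picking the wrong one would silently link against a different cuDNN build.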
Step 3: Install TensorFlow
Installation guides (Jikexueyuan wiki and others):
http://wiki.jikexueyuan.com/project/tensorflow-zh/get_started/os_setup.html
https://morvanzhou.github.io/tutorials/machine-learning/tensorflow/1-2-install
http://blog.csdn.net/u014516389/article/details/72818155/
1 Install pip
TensorFlow can be installed directly with pip or pip3.
First install the prerequisites:
$ sudo apt-get install python-pip python-dev # for Python 2.7
$ sudo apt-get install python3-pip python3-dev # for Python 3.n
Check the pip and Python versions:
$ pip -V && python -V
pip 8.1.1 from /usr/lib/python2.7/dist-packages (python 2.7)
Python 2.7.12
2 Install TensorFlow
pip install tensorflow-gpu
Downloading tensorflow_gpu-1.2.1-cp27-cp27mu-manylinux1_x86_64.whl (89.2MB)
Successfully built markdown html5lib
Installing collected packages: six, funcsigs, pbr, mock, numpy, html5lib, bleach, markdown, wheel, setuptools, protobuf, backports.weakref, werkzeug, tensorflow-gpu
Successfully installed backports.weakref bleach funcsigs html5lib markdown mock numpy pbr protobuf setuptools-20.7.0 six tensorflow-gpu werkzeug wheel-0.29.0
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
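For reference, the older guide linked above installs a specific wheel by URL. Note that these commands are for TensorFlow 0.8.0, which targets CUDA 7.5 and cuDNN v4 rather than the CUDA 8.0 setup described here:
# Ubuntu/Linux 64-bit, CPU only, Python 2.7: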
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
Latest TensorFlow releases:
https://pypi.python.org/pypi/tensorflow-gpu
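Upgrading TensorFlow
1. I installed the latest version available at the time. To move to a newer version later:
$ pip install --upgrade tensorflow-gpu
2. Alternatively, check https://storage.googleapis.com/tensorflow/ for a newer build and change the wheel URL accordingly. The old version must be uninstalled first:
$ pip uninstall tensorflow-gpu
With that, the TensorFlow environment is installed.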
Q1
>>> import tensorflow as tf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/maddock/.local/lib/python2.7/site-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/home/maddock/.local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/home/maddock/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/home/maddock/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/maddock/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/maddock/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/install_sources#common_installation_problems
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
Solution
Found the solution:
I reinstalled nvidia-381, CUDA-8.0 (using the runfile) and cuDNN 6.0. Then I added the following in my .bashrc:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64/
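This kind of ImportError usually means the dynamic loader cannot find the library in any directory it searches. The sketch below walks LD_LIBRARY_PATH and prints where a given file is found. find_in_ld_path is a hypothetical helper, and it ignores the other places the loader checks, such as the ldconfig cache.

```shell
# Hypothetical helper: print the first path under the directories in
# LD_LIBRARY_PATH that contains the file named in $1; fail if none does.
find_in_ld_path() {
    lib="$1"
    old_ifs=$IFS
    IFS=:                       # split LD_LIBRARY_PATH on colons
    for d in $LD_LIBRARY_PATH; do
        if [ -e "$d/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$d/$lib"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

Running find_in_ld_path libcusolver.so.8.0 after sourcing ~/.bashrc gives a quick indication of whether the export above has taken effect.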
Q2
ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory
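Solution
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5 # remove the existing links
# When creating the two symlinks below, check the actual name of the
# libcudnn.so.5.x.y file in /usr/local/cuda/lib64/ on your machine
# (for example libcudnn.so.5.1.5 or libcudnn.so.5.0.5) and use that name.
sudo ln -s libcudnn.so.5.1.10 libcudnn.so.5 # create the symlink
sudo ln -s libcudnn.so.5 libcudnn.so # create the symlink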
After the fix, a test script (tf.py) runs and finds the GPU:
$ python tf.py
2017-07-24 21:55:02.591533: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 21:55:02.591566: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 21:55:02.591573: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 21:55:02.591578: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 21:55:02.591585: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 21:55:02.897205: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-24 21:55:02.897628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce 840M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:01:00.0
Total memory: 1.96GiB
Free memory: 1.71GiB
2017-07-24 21:55:02.897653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-07-24 21:55:02.897662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-07-24 21:55:02.897680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 840M, pci bus id: 0000:01:00.0)
Hello, TensorFlow!