# Build tensorflow on OSX with NVIDIA CUDA support (GPU acceleration)
These instructions are based on [Mistobaan](https://github.com/mistobaan)'s
[gist](https://gist.github.com/Mistobaan/dd32287eeb6859c6668d#file-tensorflow_cuda_osx-md)
but expanded and updated to work with the
[latest tensorflow OSX CUDA PR](https://github.com/tensorflow/tensorflow/pull/664).
## Requirements
### OS X 10.10 (Yosemite) or newer
I tested these instructions on OS X v10.10.5. They will probably work on
OS X v10.11 (El Capitan), too.
### Xcode Command-Line Tools
These instructions assume you have Xcode installed and your machine is already set up
to compile C/C++ code.
If not, simply type `gcc` into a terminal and it will prompt you to download and
install the Xcode Command-Line Tools.
### homebrew
To compile tensorflow on OS X, you need several dependent libraries. The easiest way to
get them is to install them with the [homebrew package manager](http://brew.sh/).
If you don't already have `brew` installed, you can install it like this:
```bash
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
```
If you don't want to blindly run a ruby script loaded from the internet, they have
[alternate install options](https://github.com/Homebrew/homebrew/blob/master/share/doc/homebrew/Installation.md#installation).
### coreutils, swig, bazel
First, *make sure* you have `brew` up to date with the latest available packages:
```bash
brew update
brew upgrade
```
Then install these tools:
```bash
brew install coreutils
brew install swig
brew install bazel
```
Check the version to make sure you installed bazel 0.1.4 or greater.
bazel 0.1.3 or below *will fail* when building tensorflow.
```bash
$ bazel version
Build label: 0.1.4-homebrew
```
### NVIDIA's CUDA libraries
Also installed from `brew`:
```bash
brew cask install cuda
```
Check the version to make sure you installed CUDA 7.5. Older versions *will fail*.
```bash
$ brew cask info cuda
cuda: 7.5.20
Nvidia CUDA
```
### NVIDIA's cuDNN library
NVIDIA requires you to sign up and be approved before you can download this.
First, go sign up here:
https://developer.nvidia.com/accelerated-computing-developer
When you sign up, make sure you provide accurate information. A human at NVIDIA will
review your application. If it's a business day, hopefully you'll get approved quickly.
Then go here to download cuDNN:
https://developer.nvidia.com/cudnn
Click 'Download' to fill out their survey and agree to their Terms.
Finally, you'll see the download options.
However, you'll only see download options for cuDNN v4 and cuDNN v3. You'll want to
scroll to the very bottom and click "Archived cuDNN Releases".
This will take you to this page where you can download cuDNN v2:
https://developer.nvidia.com/rdp/cudnn-archive
On that page, download "[cuDNN v2 Library for OSX](https://developer.nvidia.com/rdp/assets/cudnn-65-osx-v2-asset)".
Next, you need to install it manually by copying some files into place:
```bash
tar zxvf ~/Downloads/cudnn-6.5-osx-v2.tar.gz
sudo cp ./cudnn-6.5-osx-v2/cudnn.h /usr/local/cuda/include/
sudo cp ./cudnn-6.5-osx-v2/libcudnn* /usr/local/cuda/lib/
```
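As a quick sanity check (not part of the original steps), you can confirm the header and libraries landed in the paths used by the copy commands above:

```bash
# Sanity check: confirm the cuDNN header and libraries were copied.
# Prints a hint instead of failing if they are missing.
if ls /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib/libcudnn* >/dev/null 2>&1; then
  echo "cuDNN files found"
else
  echo "cuDNN files missing - re-run the copy commands above"
fi
```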
Finally, you need to make sure the library is in your library load path.
Edit your `~/.bash_profile` file and add this line at the bottom:
```bash
export DYLD_LIBRARY_PATH="/usr/local/cuda/lib":$DYLD_LIBRARY_PATH
```
After that, close and reopen your terminal window to apply the change.
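To confirm the new terminal session picked up the change, you can run a quick check like this (my addition; it only inspects the environment variable set above):

```bash
# Check that the CUDA lib directory is on the dynamic library path.
case ":$DYLD_LIBRARY_PATH:" in
  *":/usr/local/cuda/lib:"*) echo "library path OK" ;;
  *) echo "library path not set - check ~/.bash_profile" ;;
esac
```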
## Checkout tensorflow
Since OS X CUDA support is still an unmerged pull request
([#664](https://github.com/tensorflow/tensorflow/pull/664)), you need to check
out that specific branch:
```bash
git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
git fetch origin pull/664/head:cuda_osx
git checkout cuda_osx
```
## Look up your NVIDIA card's Compute Capability on the CUDA website
Before you start, open up System Report in OSX:
```
Apple Menu > About this Mac > System Report...
```
In System Report, click on "Graphics/Displays" and find out the exact model
NVIDIA card you have:
```
NVIDIA GeForce GT 650M:
Chipset Model: NVIDIA GeForce GT 650M
```
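If you prefer the terminal, the same information is available from `system_profiler` (macOS only; the availability guard here is my addition):

```bash
# Print the GPU chipset model from the command line (macOS only).
if command -v system_profiler >/dev/null 2>&1; then
  system_profiler SPDisplaysDataType | grep "Chipset Model"
else
  echo "system_profiler not available (not macOS)"
fi
```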
Then go to https://developer.nvidia.com/cuda-gpus and find that exact model
name in the list:
```
CUDA-Enabled GeForce Products > GeForce GT 650M
```
There it will list the Compute Capability for your card. For the GeForce GT 650M
used in mid-2012 Retina MacBook Pros, it is `3.0`. Write this number down; you'll
need it in the next step.
## Configure and Build tensorflow
You will first need to configure the tensorflow build options:
```bash
TF_UNOFFICIAL_SETTING=1 ./configure
```
During the config process, it will ask you a bunch of questions. You can use
the answers below except make sure to use the Compute Capability for your NVIDIA card
you looked up in the previous step:
```
WARNING: You are configuring unofficial settings in TensorFlow. Because some external libraries are not backward compatible, these settings are largely untested and unsupported.
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify the Cuda SDK version you want to use. [Default is 7.0]: 7.5
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Default is 6.5]:
Please specify the location where cuDNN 6.5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
Setting up Cuda include
Setting up Cuda lib
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished
```
Now you can actually build and install tensorflow!
```bash
bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.6.0-py2-none-any.whl
```
## Verify Installation
You need to exit the tensorflow build folder to test your installation.
```bash
cd ~
```
Now, run `python` and paste in this test script:
```python
import tensorflow as tf
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```
You should get output that looks something like this:
```
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.6.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.7.5.dylib locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] OS X does not support NUMA - returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GT 650M
major: 3 minor: 0 memoryClockRate (GHz) 0.9
pciBusID 0000:01:00.0
Total memory: 1023.69MiB
Free memory: 452.21MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:705] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.00MiB
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0
I tensorflow/core/common_runtime/direct_session.cc:142] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0
b: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] MatMul: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:73] Allocating 252.21MiB bytes.
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:83] GPU 0 memory begins at 0x700a80000 extends to 0x7106b6000
[[ 22. 28.]
[ 49. 64.]]
```
Yay! Now you can train your models using a GPU!
If you are using a Retina MacBook Pro with only a 1GB GeForce GT 650M, you
will probably run into out-of-memory errors with medium to large models. But at
least it will make small-scale experimentation faster.