Tensorflow GPU 1.8 with MacOS 10.13.6
Hi guys, after some days of trial and error I was finally able to properly install the GPU version of Tensorflow 1.8 and make it work with an Nvidia GTX 1070 housed in an Aorus Gaming Box.
These are the required steps.
Note: follow this guide at your own risk.
Note 2: a large part of this guide is taken from another guide.
PREREQUISITE. Having an Nvidia GPU or EGPU (already working)
1. Install Homebrew:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install wget
2. Install Nvidia Web Drivers:
3. Install Nvidia Cuda Drivers:
4. Download Xcode 8.2.xip and Xcode 9.4.xip, extract both .app bundles, rename them to Xcode8.2.app and Xcode9.4.app respectively, and move them to the Applications folder:
You need to search for them there; the downloads are about 4.2GB and 5.2GB. Version 9.4 is needed to install OpenMP, whose installer suggests that version. I don't know whether the latest Xcode works in place of 9.4; if you already have the latest, you could try it. Version 8.2 is essential in any case.
5. Set Xcode8.2 as default:
sudo xcode-select -s /Applications/Xcode8.2.app
6. Install bazel:
brew install bazel
7. Install cuda 9.1.128:
8. Download and install nccl 1.3.4:
Unarchive it, open a terminal window in the folder that contains the extracted directory, and move its contents into /usr/local/nccl (the commands assume the directory is named nccl_2.1.15-1+cuda9.1_x86_64; adjust the name to match your download):
sudo mkdir -p /usr/local/nccl
cd nccl_2.1.15-1+cuda9.1_x86_64
sudo mv * /usr/local/nccl
sudo mkdir -p /usr/local/include/third_party/nccl
sudo ln -s /usr/local/nccl/include/nccl.h /usr/local/include/third_party/nccl
9. Edit ~/.bash_profile:
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$DYLD_LIBRARY_PATH:$PATH:/Developer/NVIDIA/CUDA-9.1/bin
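Once a new terminal has picked up ~/.bash_profile, a small Python check along these lines can confirm that the variables from step 9 are set and point to real directories. This is only a sketch: the function name `missing_cuda_settings` is made up for illustration, and it checks nothing beyond the existence of the variables and of the directories they list.

```python
import os

# Variables that step 9 adds to ~/.bash_profile.
EXPECTED_VARS = ("CUDA_HOME", "DYLD_LIBRARY_PATH", "LD_LIBRARY_PATH")


def missing_cuda_settings(environ=None):
    """Return a list of problems found in the CUDA-related environment."""
    environ = os.environ if environ is None else environ
    problems = []
    for name in EXPECTED_VARS:
        value = environ.get(name, "")
        if not value:
            problems.append(name + " is not set")
            continue
        # Library path variables may hold several ':'-separated directories.
        for directory in value.split(":"):
            if directory and not os.path.isdir(directory):
                problems.append(name + " points to a missing directory: " + directory)
    return problems


if __name__ == "__main__":
    for line in missing_cuda_settings() or ["CUDA environment looks consistent"]:
        print(line)
```

An empty list means the three variables are present and every listed directory exists; it does not prove that the libraries inside them are the right versions.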
10. Compile CUDA samples to test if GPU is working correctly:
cd /Developer/NVIDIA/CUDA-9.1/samples
chown -R $(whoami) *
make -C 1_Utilities/deviceQuery
./bin/x86_64/darwin/release/deviceQuery
You should get this result at the bottom of the terminal:
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS
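If you want to check that result from a script rather than by eye, the two summary lines can be parsed roughly as follows. This is a sketch that assumes the summary has the shape shown above (a "deviceQuery," line of comma-separated "key = value" pairs, then a "Result = ..." line); `parse_device_query_summary` is a hypothetical helper name, not part of the CUDA samples.

```python
import re


def parse_device_query_summary(output):
    """Collect the 'key = value' pairs of the summary line and the final result."""
    summary = {}
    for line in output.splitlines():
        if line.startswith("deviceQuery,"):
            # Skip the leading "deviceQuery" token, then split each "key = value".
            for part in line.split(",")[1:]:
                key, _, value = part.partition("=")
                summary[key.strip()] = value.strip()
        match = re.match(r"\s*Result\s*=\s*(\w+)", line)
        if match:
            summary["Result"] = match.group(1)
    return summary


if __name__ == "__main__":
    sample = (
        "deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, "
        "CUDA Runtime Version = 9.1, NumDevs = 1\n"
        "Result = PASS"
    )
    info = parse_device_query_summary(sample)
    print("GPU test passed:", info.get("Result") == "PASS")
```

In practice you would feed it the captured output of ./bin/x86_64/darwin/release/deviceQuery instead of the sample string.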
11. Register here and download cuDNN 7.0.5, then extract it and copy the required files into the CUDA install folder:
tar -xzvf cudnn-9.1-osx-x64-v7-ga.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib/libcudnn*
12. Download and install Python 3.6.4:
This is where I stopped following the other guide.
13. Install Tensorflow 1.8 (other versions HERE):
pip3 install https://storage.googleapis.com/74thopen/tensorflow_osx/tensorflow-1.8.0-cp36-cp36m-macosx_10_13_x86_64.whl
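The wheel name above carries the tags cp36-cp36m, which means it is built for CPython 3.6, so pip3 has to belong to a Python 3.6 interpreter. A rough way to compare the wheel's Python tag with the interpreter you are running is sketched below; the helper names are made up for illustration, and `interpreter_tag` assumes you are on CPython.

```python
import sys

WHEEL = "tensorflow-1.8.0-cp36-cp36m-macosx_10_13_x86_64.whl"


def wheel_python_tag(filename):
    """Return the Python tag (e.g. 'cp36') from a wheel file name."""
    # Wheel names end with -<python tag>-<abi tag>-<platform tag>.whl
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return stem.split("-")[-3]


def interpreter_tag(version_info=None):
    """Build the CPython-style tag of an interpreter version, e.g. 'cp36'."""
    version_info = sys.version_info if version_info is None else version_info
    return "cp{}{}".format(version_info[0], version_info[1])


if __name__ == "__main__":
    wanted = wheel_python_tag(WHEEL)
    running = interpreter_tag()
    if wanted == running:
        print("interpreter matches the wheel:", running)
    else:
        print("wheel targets", wanted, "but this interpreter is", running)
```

Run it with the same interpreter you intend to use for Tensorflow, for example python3 rather than the system python.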
14. Set Xcode9.4 as default:
sudo xcode-select -s /Applications/Xcode9.4.app
15. Install OpenMP:
brew install cliutils/apple/libomp
16. Finally, test installation:
Start python3 in a terminal and run:
>>> import tensorflow as tf
>>> tf.Session()
You should get some messages about your GPU, its memory and so on (### I will insert the exact returned message ###).
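As an alternative to reading the session log, the check can be put in a small script that lists the GPU devices Tensorflow reports. This is a sketch, not an official test: it relies on `device_lib.list_local_devices()` from the internal `tensorflow.python.client` module, which has commonly been used with TF 1.x but is not a documented public API, and `gpu_device_names` is a name chosen here for illustration.

```python
def gpu_device_names():
    """Return the names of GPU devices TensorFlow can see, or None if it cannot be imported."""
    try:
        # Internal module path; assumed to be present in the installed build.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


if __name__ == "__main__":
    names = gpu_device_names()
    if names is None:
        print("tensorflow could not be imported by this interpreter")
    elif names:
        print("GPU devices:", ", ".join(names))
    else:
        print("tensorflow imported, but no GPU device is visible")
```

An empty list means Tensorflow loaded but only sees the CPU, which usually points back to the driver, CUDA or cuDNN steps.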
17. If you get the -ncclAllReduce issue:
1. Download the file here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/nccl/kernels/nccl_ops.cc
2. gcc -c -fPIC nccl_ops.cc -o hello_world.o
3. gcc hello_world.o -shared -o _nccl_ops.so
4. Replace the file "_nccl_ops.so" with the generated one at Path:
To find where TF is installed:
pip3 show tensorflow
you will get:
Summary: TensorFlow helps the tensors flow
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Requires: grpcio, tensorboard, wheel, astor, gast, protobuf, termcolor, numpy, six, absl-py
Then repeat step 16. If everything works, congratulations: you have Tensorflow 1.8 with GPU support installed!
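For the last part of step 17, the install location can also be found from Python itself instead of reading the pip3 show output. The sketch below looks up the package directory without importing Tensorflow and then searches it for _nccl_ops.so; `package_dir` and `find_files` are hypothetical helper names, and where the library sits inside the package is not assumed, only searched for.

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory a top-level package is installed in, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]


def find_files(root, filename):
    """Walk root and return every path whose base name equals filename."""
    matches = []
    for current, _dirs, files in os.walk(root):
        if filename in files:
            matches.append(os.path.join(current, filename))
    return matches


if __name__ == "__main__":
    root = package_dir("tensorflow")
    if root is None:
        print("tensorflow is not installed for this interpreter")
    else:
        print("tensorflow lives in", root)
        for path in find_files(root, "_nccl_ops.so"):
            print("found", path)
```

Back up any file it finds before replacing it with the one you compiled.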
Nice work. I've also built up a detailed build workflow --
My TF 1.8 isn't working as it should. Have you tried running some test programs to make sure it runs correctly? For me, version 1.5 has been the only stable version. I am waiting to see what the next release of TF is and whether or not it will support CUDA 10.
What do you mean by "isn't working as it should"?
I tried to run a Linear Regression test example but got some certificate errors, so the MNIST data wouldn't download (I didn't look into it much). Then I tried some university code for an NLP task.
Without even optimizing it to run on the GPU (e.g. increasing the batch size), I got a 7x speedup (40 minutes on GPU vs 5 hours on CPU). So it definitely worked.
I would have to look at the details, but some of these tests did not run properly.
I was also unable to get the GPU RAM allocated reliably. It seemed to "hang" and not unload properly between runs.
Sorry for being so late, it was a busy week.
I tried the test, that's what I got:
2018-10-07 00:54:40.734086: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
2018-10-07 00:54:40.734379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.721
totalMemory: 8.00GiB freeMemory: 6.95GiB
2018-10-07 00:54:40.734399: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-10-07 00:54:41.084371: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-07 00:54:41.084409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-10-07 00:54:41.084414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-10-07 00:54:41.084497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2457 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:c2:00.0, compute capability: 6.1)
2018-10-07 00:54:41.127457: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
Ran 2 tests in 0.581s
So apparently everything is okay, except for the "Not found: TF GPU device with id 0 was not registered" error. I'm not sure what this is exactly.
I've done this and when I go into python to import tensorflow I get this:
Python 2.7.10 (default, Oct 6 2017, 22:29:07)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
>>>
Any ideas?
Hey I've been trying to follow your guide and upon installing Tensorflow I get stuck.
It fails to load the native Tensorflow runtime, with the following error:
ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so, 6): Library not loaded: @rpath/libomp.dylib
Referenced from: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
Reason: image not found
Do you have any idea as to how I could fix this?
Edit: I managed to fix this issue by resetting the default Xcode to 8.2 and then back to 9.4 before installing OpenMP. It looks like it is working, but I haven't had much time to test it properly yet.
I followed all the steps and I think I'm "close". This is what I get when I run python and try to import tensorflow:
Python 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/lr/anaconda3/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/Users/lr/anaconda3/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
import numpy as np
File "/Users/lr/anaconda3/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
from . import core
File "/Users/lr/anaconda3/lib/python3.6/site-packages/numpy/core/__init__.py", line 59, in <module>
from . import numeric
File "/Users/lr/anaconda3/lib/python3.6/site-packages/numpy/core/numeric.py", line 3093, in <module>
from . import fromnumeric
File "/Users/lr/anaconda3/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 17, in <module>
from . import _methods
File "/Users/lr/anaconda3/lib/python3.6/site-packages/numpy/core/_methods.py", line 158, in <module>
_NDARRAY_ARRAY_FUNCTION = mu.ndarray.__array_function__
AttributeError: type object 'numpy.ndarray' has no attribute '__array_function__'