Is CUDA on MAC really stable for now?
 


(@mo_sha)
New Member
Joined: 2 years ago
 

Hi bros,

I am running into a freezing issue with CUDA workloads. My GPU is a GTX 1080 Ti, my OS version is macOS 10.13.3, and both the GPU and CUDA drivers are up to date. The GPU works well in both OpenGL and OpenCL stress tests, and it also handles World of Warcraft on the Mac just fine (it is pretty cool!).

However, the system freezes when CUDA applications such as TensorFlow or PyTorch are running. The OS locks up (static display and looping sound) at a random point after the CUDA workload starts, typically after several hours.
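One way to pin down when such a freeze happens is to write a heartbeat file alongside the CUDA job. The following is a minimal sketch in Python, not something taken from the setup described above: the function names `start_heartbeat` and `last_heartbeat` and the log file name are hypothetical, and the result is only accurate to within one heartbeat interval.

```python
import os
import threading
import time
from datetime import datetime


def start_heartbeat(path, interval_s=5.0):
    """Append a timestamp to `path` every `interval_s` seconds in a daemon thread.

    Returns a threading.Event; set it to stop the heartbeat.
    (Hypothetical helper, not part of any CUDA or framework API.)
    """
    stop = threading.Event()

    def beat():
        with open(path, "a") as f:
            while True:
                f.write(datetime.now().isoformat() + "\n")
                f.flush()
                os.fsync(f.fileno())  # ask the OS to push the line to disk
                if stop.wait(interval_s):  # returns True once stop is set
                    break

    threading.Thread(target=beat, daemon=True).start()
    return stop


def last_heartbeat(path):
    """Return the last timestamp in the log as a datetime, or None if it is empty."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return datetime.fromisoformat(lines[-1]) if lines else None
```

Calling `start_heartbeat("cuda_heartbeat.log")` before the workload starts and `last_heartbeat("cuda_heartbeat.log")` after the forced reboot gives a rough time-to-freeze that can be compared across driver and OS versions.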

I have been looking into this issue for quite a while. I noticed that a lot of people run into similar failures when they use CUDA on Mac, across a wide range of applications such as image/video editing (Adobe suite or others) and deep learning tasks. However, I can also find many posts whose authors say they can use CUDA on Mac without problems.

 
I was wondering whether anyone here can confirm that CUDA runs completely stably for them under a heavy workload. Is any deep learning person here using TensorFlow with CUDA on a Mac, either on an eGPU or even on a Hackintosh?
 
Thanks!

Chippy McChipset
(@chippy-mcchipset)
Reputable Member
Joined: 2 years ago
 

Literally no NVIDIA setup on Apple eGPU is stable. Each new set of web drivers is an "adventure," where they might be stable (no crashing) but very slow, or perform well but crash often. Or you could have a driver that's working reasonably well and is stable, and as soon as you change the OS version, it stops working. That goes for both the general-use web driver and the CUDA driver, AFAIK.

Until Apple starts building in some support for NVIDIA via eGPU (i.e. building Thunderbolt-aware drivers and bundling them into the OS) things will probably remain this way.

Thunderbolt 3 Macs, Sonnet and OWC eGPUs, 4K Displays, etc


fr34k
(@fr34k)
Reputable Member Moderator
Joined: 2 years ago
 

Well,
on the contrary, I often use CUDA and never had problems:
MXNet, Mathematica, Matlab, blender...

macOS-eGPU.sh on GitHub (fr34k's macOS-eGPU.sh on eGPU.io)
----
2016 15'' MacBook Pro + GTX1080Ti@32Gbps-TB3 (Sonnet Breakaway 550) + macOS 10.13.6 (17G65 driver: 378.10.10.10.30.107 + CUDA: 396.148)


(@mo_sha)
New Member
Joined: 2 years ago
 
Posted by: fr34k

Well,
on the contrary, I often use CUDA and never had problems:
MXNet, Mathematica, Matlab, blender...

Thanks for your reply. Could you share some details, such as your OS, GPU driver, and CUDA driver versions?

psonice
(@psonice)
Estimable Member
Joined: 2 years ago
 

macOS 10.13.3, NVIDIA driver 387.10.10.10.25.161, CUDA 9.0 (driver 387.128) working fine for me. I use it for TensorFlow.

fr34k
(@fr34k)
Reputable Member Moderator
Joined: 2 years ago
 

You may want to take a look into my signature 😆

macOS-eGPU.sh on GitHub (fr34k's macOS-eGPU.sh on eGPU.io)
----
2016 15'' MacBook Pro + GTX1080Ti@32Gbps-TB3 (Sonnet Breakaway 550) + macOS 10.13.6 (17G65 driver: 378.10.10.10.30.107 + CUDA: 396.148)


(@mo_sha)
New Member
Joined: 2 years ago
 

May I double-check whether your tasks keep running for days at a time? My OS freezes after 10+ hours.

psonice
(@psonice)
Estimable Member
Joined: 2 years ago
 

I've not had anything that needed to run that long (if it gets slow I tend to optimise it to hell instead of waiting - makes for far better productivity I find.)

Chippy McChipset
(@chippy-mcchipset)
Reputable Member
Joined: 2 years ago
 
Posted by: fr34k

Well,
on the contrary, I often use CUDA and never had problems:
MXNet, Mathematica, Matlab, blender...

You mean your setup remains stable as you change driver build and/or OS build numbers?

Thunderbolt 3 Macs, Sonnet and OWC eGPUs, 4K Displays, etc


fr34k
(@fr34k)
Reputable Member Moderator
Joined: 2 years ago
 

@mo_sha
@chippy-mcchipset
Well, I've been at it since .1 and have had renders running for 48 hrs (yes, one pic!) and NN training for 7 days straight (other stuff too, but not as long), and I never had any problems with any version (.1 ≤ X ≤ .3).

macOS-eGPU.sh on GitHub (fr34k's macOS-eGPU.sh on eGPU.io)
----
2016 15'' MacBook Pro + GTX1080Ti@32Gbps-TB3 (Sonnet Breakaway 550) + macOS 10.13.6 (17G65 driver: 378.10.10.10.30.107 + CUDA: 396.148)


frank.m
(@frank)
Active Member
Joined: 3 years ago
 

We've been using CUDA for 3D rendering with Octane on Mac OS with Cinema 4D and Maya for about nine months. I don't think any of our Octane renders take more than a few hours, though, and that's deliberate. After all, that's why we are using GPUs to render in the first place. 🙂

Yes, it takes some testing and fiddling to find a combination that works reliably. We don't jump to new drivers/latest Octane/latest 3D software unless there is a compelling reason to change. (That's actually a big part of my job.) Currently, what's working well for us is:

Mac OS 10.12.6 on a mix of cylinder Mac Pros and cheese grater Mac Pros, NVIDIA GTX 10xx and 9xx cards, web driver 378.05.05.25f04, CUDA driver 387.99, Octane 3.07

The cheese graters are using internal GPUs and the cylinders are using a mix of Akitio Node and Aorus eGPUs. 

psonice
(@psonice)
Estimable Member
Joined: 2 years ago
 

@frank performance might be a compelling enough reason - I've seen ~20% speed increase with recent driver updates (although this is for Metal, not CUDA - but running pure compute).

frank.m
(@frank)
Active Member
Joined: 3 years ago
 
Posted by: psonice

@frank performance might be a compelling enough reason - I've seen ~20% speed increase with recent driver updates (although this is for Metal, not CUDA - but running pure compute).

Good to know, @psonice. Although 20% isn't that significant when compared to overall stability and reliability for our application. It only takes a few crashes or failed render jobs to quickly eat up any speed benefits. My general rule of thumb is, "50% faster is worth some additional headaches." Otherwise, I'll experiment on my test rig, but I'm not going to roll it out to the full animation team.

I'm paid to deal with problems and figure out how to make stuff work, so it's worth my time. They're paid to actually make good animation. For us, the big benefit of GPU number crunching is in the real-time feedback the animators get while lighting and texturing. The most expensive time is the time an animator spends at her workstation. The faster she can get feedback and the fewer iterations that need to be tested, the quicker we can produce a quality product. If a sequence takes three hours to render instead of 2.5, it's no big deal, since that's not really taking up anyone's personal time.
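The point that a few failed jobs can eat up a speed gain can be put into rough numbers. The following is a minimal sketch in Python under simplifying assumptions that are not from the post: each attempt fails independently with a fixed probability, a failed attempt costs a full run before it is retried, and the stable setup never fails. The function names are hypothetical.

```python
def expected_time_per_job(base_hours, speedup, failure_rate):
    """Expected wall-clock hours to get one successful job.

    Assumes each attempt fails independently with probability `failure_rate`
    and that a failed attempt costs a full run before it is retried.
    """
    if not 0 <= failure_rate < 1:
        raise ValueError("failure_rate must be in [0, 1)")
    attempt_hours = base_hours / speedup
    expected_attempts = 1 / (1 - failure_rate)  # mean of a geometric distribution
    return attempt_hours * expected_attempts


def break_even_failure_rate(speedup):
    """Failure rate at which a faster but flakier setup stops saving time
    compared with a setup that never fails."""
    return 1 - 1 / speedup
```

Under this model, a 20% faster driver (three hours down to 2.5) stops paying off once roughly one job in six fails, while a 50% faster one tolerates about one failure in three, which is in the spirit of the rule of thumb above.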

psonice
(@psonice)
Estimable Member
Joined: 2 years ago
 

@frank Very wise 🙂 In my case a lot of what I'm working on is realtime path tracing, so 20% means I can ramp up the quality or add more content.

highpass
(@highpass)
Eminent Member
Joined: 3 years ago
 

I have been using an eGPU setup 24/7 since the Redshift beta came out (a year ago?), and have yet to encounter an issue. iMac 5K, Akitio Node, GTX 1080 Ti or Titan X. Still on 10.12.6 though, if that matters any.

CUDA driver 387.128, web driver 378.05.05.25f01

ricc
(@ricc)
Eminent Member
Joined: 2 years ago
 

Guys, can anyone give me a compiled build of TensorFlow compatible with macOS 10.13.6 and a GTX 1070 (compute capability 6.1)? The GitHub sources for patching the build so it compiles are down, and I'm a bit stuck.

2014 Macbook Pro 15" Iris Pro /w Aorus Gaming Box GTX 1070


highpass
(@highpass)
Eminent Member
Joined: 3 years ago
 

For those of you running eGPUs solely for CUDA acceleration, without an external monitor, on High Sierra: can you tell me what you used to set up your system?

Since upgrading to High Sierra and losing the old egpu.kext method, I am finding performance horrendous.
