enable-baffin-CUs.sh script
 


goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

Thanks to the findings of okrasit and Fl0r!an, I was inspired to write a small script that unleashes the full power of the R9 Nano and RX 480 for compute tasks on macOS 10.12.2. It nearly doubled the clpeak single-precision compute results (GFLOPS) and the Indigo Renderer Benchmark score.

The script is written for experimental and educational purposes only. I don’t take any responsibility if something goes wrong, and since this is a binary hack, I don’t see it having any continuity.

A word of warning: please do not edit the PP_DisablePowerContainment key. It doubled float4-float16 OpenCL compute performance with the RX 480, but it can also fry your Akitio. The total power consumption of the RX 480 eGPU exceeded 190 W! (I use custom boards.)

 

http://www.insanelymac.com/forum/topic/313977-r9-nano/?p=2332854
https://www.tonymacx86.com/threads/enable-all-cores-r9-fury-cards.209892/#post-1393445


RX 480 (clpeak):

Device: AMD Radeon HD Baffin Unknown Prototype Compute Engine
    Driver version  : 1.2 (Dec  9 2016 21:43:55) (Macintosh)
    Compute units   : 36
    Clock frequency : 1266 MHz

    Global memory bandwidth (GBPS)
      float   : 202.40
      float2  : 212.20
      float4  : 215.84
      float8  : 129.97
      float16 : 60.01

    Single-precision compute (GFLOPS)
      float   : 5644.43
      float2  : 5634.59
      float4  : 5608.17
      float8  : 5574.38
      float16 : 5517.69

R9 Nano (clpeak):

Device: AMD Radeon HD Baffin Unknown Prototype Compute Engine
    Driver version  : 1.2 (Dec  9 2016 21:43:55) (Macintosh)
    Compute units   : 64
    Clock frequency : 1000 MHz

    Global memory bandwidth (GBPS)
      float   : 402.93
      float2  : 439.91
      float4  : 419.14
      float8  : 196.41
      float16 : 109.40

    Single-precision compute (GFLOPS)
      float   : 7180.09
      float2  : 6990.68
      float4  : 6820.10
      float8  : 6769.77
      float16 : 6657.53

chmod +x enable-baffin-CUs.sh

For R9 Nano:

sudo ./enable-baffin-CUs.sh fiji 64 

For RX 480:

sudo ./enable-baffin-CUs.sh ellesmere 36

#!/bin/bash
#
# Script (enable-baffin-CUs.sh) by Goalque (goalque@gmail.com)
# Credit to okrasit and Fl0r!an:
# http://www.insanelymac.com/forum/topic/313977-r9-nano/?p=2332854
# https://www.tonymacx86.com/threads/enable-all-cores-r9-fury-cards.209892/#post-1393445
#
# Usage: sudo ./enable-baffin-CUs.sh <ellesmere|fiji|baffin> <36|64>

first_argument="$1"
second_argument="$2"
init_function=""
CU_count=""

# Search halves of the perl substitutions; the replacement halves are appended below.
# pattern1: the bytes in groups 2 and 4 are overwritten with the values from CU_count.
pattern1="s/(\x48\xB8)(\x02)(\x00\x00\x00\x01\x00\x00\x00\x48\x89\x43\x54\xC7\x43\x7C)(\x08)(\x00\x00\x00)"

# pattern2: the three bytes in group 1 are replaced with NOPs (\x90).
pattern2="s/(\x0F\x42\xC8)(\x89\x8B\x80\x00\x00\x00\x44\x88\xB3\x99\x00\x00\x00\x44\x88\x73\x20)"

# pattern3: the four bytes after \xE8 are replaced with init_function.
pattern3="s/(\xE8)(\x49\x85\xFE\xFF)(\xBE\x48\x01\x00\x00\x4C\x89\xF7)"

if [[ "$first_argument" == "ellesmere" ]]; then
    init_function="\x46\xE4\x00\x00"
elif [[ "$first_argument" == "fiji" ]]; then
    init_function="\x73\x02\x01\x00"
elif [[ "$first_argument" == "baffin" ]]; then
    init_function="\x49\x85\xFE\xFF"
fi

if [[ "$second_argument" == 36 ]]; then
    CU_count="\x04\3\x12\5"
elif [[ "$second_argument" == 64 ]]; then
    CU_count="\x04\3\x20\5"
fi

if [[ "$init_function" != "" ]] && [[ "$CU_count" != "" ]]; then

    # Work on a copy of the driver binary, then copy the patched file back.
    rsync -a /System/Library/Extensions/AMDRadeonX4100.kext/Contents/MacOS/AMDRadeonX4100 /tmp/AMDRadeonX4100
    perl -pe "${pattern1}/\1${CU_count}/g" < /tmp/AMDRadeonX4100 \
        | perl -pe "${pattern2}/\x90\x90\x90\2/g" \
        | perl -pe "${pattern3}/\1${init_function}\3/g" > /tmp/AMDRadeonX4100_modified

    rsync -a --delete /tmp/AMDRadeonX4100_modified /System/Library/Extensions/AMDRadeonX4100.kext/Contents/MacOS/AMDRadeonX4100

    chown -R root:wheel /System/Library/Extensions/AMDRadeonX4100.kext/Contents/MacOS/AMDRadeonX4100

    chmod -R 755 /System/Library/Extensions/AMDRadeonX4100.kext/Contents/MacOS/AMDRadeonX4100

    # Remove the prelinked kernel and kernel cache before rebuilding.
    rm /Volumes/Macintosh\ HD/System/Library/PrelinkedKernels/prelinkedkernel 2>/dev/null

    rm /Volumes/Macintosh\ HD/System/Library/Caches/com.apple.kext.caches/Startup/kernelcache 2>/dev/null

    touch /System/Library/Extensions
    echo "Rebuilding caches..."
    kextcache -q -update-volume /Volumes/Macintosh\ HD
    echo "Ready."
else
    echo "Invalid parameters."
fi
IndigoBench RX480 32CUs
R9 Nano 64CUs

automate-eGPU EFI | apple_set_os.efi
--
2018 13" MacBook Pro + Radeon [email protected] + Win10 1809


itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 

Thank you Goalque! I assume this would work for Mac Pro tower and Hackintosh users as well?

Best ultrabooks for eGPU use | eGPU enclosure buying guide


goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

@itsage: Not tied to eGPU use. Feel free to try it out 🙂

itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 

Thank you! If you don’t mind, I’d like to share it with the Mac Pro community. I know a lot of Mac Pro tower users who have been sitting on the sidelines waiting for full driver support for Polaris GPUs. With this script, it’s as close to an official driver as one could hope for.

goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

Yep, no problem. Remember that this script does the CU count patch only, nothing else. Who knows, maybe we’ll have a main course after an appetizer 😉

FricoRico
(@fricorico)
Eminent Member
Joined: 3 years ago
 

What exactly is up with the core limit? Is it a generic macOS lock, or is it only locked because the kexts are intended for different GPUs?

I take it this does not improve Metal/OpenGL performance?

goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

That’s a good question. There are a couple of values in the hardware initialization functions. I guess the 16 CU limit is there because the officially announced Polaris-based GPUs cannot utilize 64. Apple has quietly improved the AMD drivers and added new device IDs. Fiji was added on purpose. This is a signal that they want to support a large variety of AMD GPUs.

I don’t know if this has an effect on shaders in OpenGL or Metal; probably not. The Valley benchmark score did not seem to be affected.

FricoRico
(@fricorico)
Eminent Member
Joined: 3 years ago
 

As AnandTech pointed out, an AMD Compute Unit is made up of 4 SIMDs, and many CUs form the base of the GPU. Everything is then put through the pixel pipeline. I can't imagine those would be either limited or a bottleneck for the AMD RX 480. I would expect to see a graphics performance increase as well, but maybe the Thunderbolt 2 connection is limiting the GPU in heavy graphics workloads.

I will run some benchmarks with both Metal and OpenGL, where Metal really has a huge performance increase for eGPUs (some games get over 3 times the performance), probably because of a smaller bandwidth overhead.

Very nice find Goalque!

goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

What is certain at present is that the hack affects the CL_DEVICE_MAX_COMPUTE_UNITS value returned by OpenCL’s clGetDeviceInfo call. The R9 Nano has 64 ROPs/64 CUs, and the RX 480 has 32 ROPs/36 CUs.

https://www.techpowerup.com/gpudb/2735/radeon-r9-nano

https://www.techpowerup.com/gpudb/2848/radeon-rx-480

AFAIK, TB2 doesn’t bottleneck GPGPU tasks much. I’m looking forward to your results!

ikir
(@ikir)
Prominent Member
Joined: 3 years ago
 
Posted by: FricoRico

I will run some benchmarks with both Metal and OpenGL, where Metal really has a huge performance increase for eGPUs (some games get over 3 times the performance), probably because of a smaller bandwidth overhead.

   

Yeah, this is the kind of news I love to read!


MacBook Pro 2018 Touch Bar i7 quad-core 2.7Ghz - 16GB RAM - 512GB PCIe SSD
my awesome Radeon VII eGPU
my Mantiz Venus extreme mod with Sapphire Nitro+ RX Vega 64


FricoRico
(@fricorico)
Eminent Member
Joined: 3 years ago
 

I took the time to benchmark some games with the unlocked CUs. It really made no difference at all, neither negative nor positive. I tested Metal and OpenGL games/benchmarks, so I think the CU limitation was only noticeable in compute performance itself. This is still a great mod for people doing GPU-based 3D rendering or other GPU-intensive computations.

itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 

Thank you again for the script! I was able to enable all 64 CUs on an R9 Fury X for my Mac Pro. This GPU is currently the most powerful AMD GPU for a Mac. The Vega release next week may change that.

Oscar J
(@oscar-j)
Active Member
Joined: 3 years ago
 

Intriguing! Were your Indigo tests with the Nano or 480? (sorry for bumping)

goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

RX 480 likely. I retested and got 2.205 (Bedroom) and 6.809 (Supercar).

Menneisyys
(@menneisyys)
Eminent Member
Joined: 3 years ago
 

I've executed the same test several times on my early 2013 15" rMBP + Node + Sapphire 4GB RX 480. The figures are largely the same:

- 2184...2247 (Bedroom) with the same eGPU also being the current display driver (in 4K60),

- 1628, 1628, 1629 (still Bedroom; three tests with the eGPU NOT being the current one, the 4K monitor being driven from the dGPU)

- 7136...7213 in five cases and 6172 in one case (Supercar; the 6172 run was NOT a case of the card being non-current)

I've only tested the CPU once: 0.391 with Bedroom. The 650M tests weren't finished in over 30 minutes (they didn't even get to stage 1) so I stopped them (tried three times).

Incidentally, both figures are somewhat larger than the ones (1.928, 6.261) at https://www.indigorenderer.com/benchmark-results?page=1&sort_desc=result_bedroom&filter=highlighted . This means these are pretty impressive results, and they also show that

- even the TB1 bandwidth poses no problems for strictly OpenCL usage,

- not even when the external display is also driven using the same card.

harkonnen
(@mehenn)
Estimable Member
Joined: 3 years ago
 

Hi @goalque.

I plugged in my WX5100 and tried to enable its 28 CUs. It failed: LuxMark reports 14...

I started by editing the script and adding this option to it (I ran it with sudo ./enable-baffin-CUs.sh ellesmere 36):

elif [[ "$second_argument" == 28 ]]
then
	CU_count="\x04\3\x0E\5"
fi

but no luck. Any suggestions?


Thanks!
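
Both CU_count values in the original script encode half the requested CU count as a single hex byte (36 gives \x12, 64 gives \x20), and the \x0E tried here for 28 follows the same rule. Assuming that rule holds in general (it is inferred from those values only and has not been confirmed by the script's author), the replacement string could be computed rather than hard-coded. The helper name cu_count_bytes is hypothetical. Note also that an added elif branch for 28 is only reached when the script is invoked with 28 as its second argument.

```shell
#!/bin/sh
# Hypothetical helper: build the CU_count replacement string for a CU count.
# ASSUMPTION: the patched byte is (CU count / 2), as in the script's two cases.
cu_count_bytes() {
    cus="$1"
    case "$cus" in
        ''|*[!0-9]*) echo "CU count must be a positive integer" >&2; return 1 ;;
    esac
    # The value must fit in one byte after halving, and must halve evenly.
    if [ "$cus" -le 0 ] || [ "$cus" -gt 510 ] || [ $((cus % 2)) -ne 0 ]; then
        echo "unsupported CU count: $cus" >&2
        return 1
    fi
    # Emit the same textual form the script uses, e.g. \x04\3\x12\5
    printf '\\x04\\3\\x%02X\\5\n' $((cus / 2))
}

cu_count_bytes 36
cu_count_bytes 64
cu_count_bytes 28
```

Whether a value other than the two shipped in the script produces a working driver is a separate question that this sketch does not answer.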

 

late-2016 13" Macbook Pro + [email protected] (TB3 to TB2 adapter) (AKiTiO Thunder2) + macOS 10.12.3
late-2016 13" MBP + Quadro M2000/FirePro WX5100/GTX750-TB3 (AKiTiO Thunder3) + macOS 10.12.3
(Custom Thunder cables)


Yukikaze
(@yukikaze)
Prominent Member Moderator
Joined: 3 years ago
 

I can't help with the OS X related stuff, but how about giving it a shot in a different OS? I don't know if you have Win bootcamped, but you could try to boot Linux off a live-USB stick, load the driver and run some benchmark. Just to make sure your issue is indeed in the SW config of things.

EDIT: It looks to me that the WX 5100 gets identified as a RX 460, and seems to be used as such. 14 is the number of CUs in the cut-down Baffin dies used for RX 460s.

Want to output [email protected] out of an old system on the cheap? Read here.
Give your Node Pro a second Thunderbolt3 controller for reliable peripherals by re-using a TB3 dock (~50$).

"Always listen to experts. They'll tell you what can't be done, and why. Then do it."- Robert A. Heinlein, "Time Enough for Love."


Billyjb
(@billyjb)
New Member
Joined: 3 years ago
 

The IT Sage referenced this script in his blog post showing how to install a 470/480 in my 2010 Mac Pro 5,1. My question: will this script work, or can it be modified, for the 470, which has 32 CUs?

tia

Bill

mistergabs
(@mistergabs)
New Member
Joined: 3 years ago
 

I might be the dumb one in the room, but after enabling this, my Mac Pro immediately powers down whenever I try to run Geekbench or touch 4K footage in Premiere. It was totally fine with both before this. Any ideas how to DISable the Baffin CU script? Or get it working?

feki
(@feki)
New Member
Joined: 3 years ago
 

Wow, thank you! That works great in my Mac Pro 5,1 with my RX 480. I would also like to put in an RX 460 that I have lying around. Is that possible? Or does the Compute Unit hack mean the RX 460 would not work? Thanks

psy-ninja
(@psy-ninja)
New Member
Joined: 3 years ago
 

This sounds like what I need. How do I execute this script? 

feki
(@feki)
New Member
Joined: 3 years ago
 

The RX 480 and RX 460 do work in parallel in the Mac Pro 5,1. DaVinci Resolve is faster. Final Cut Pro X is glitchy 🙁

casual_g
(@casual_g)
Eminent Member
Joined: 3 years ago
 

Maybe this is a dumb question. Do I have to restart after running the script? My clpeak reports for my RX 480:

  Device: AMD Radeon HD Baffin Unknown Prototype Compute Engine
    Driver version  : 1.2 (Dec  9 2016 21:43:55) (Macintosh)
    Compute units   : 16
    Clock frequency : 1266 MHz 

You can see it says 16 CUs.

By the way, would this script improve both graphical and compute performance? Edit: I read through the thread more closely.

itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 

Yes. Restart your Mac and you’ll see 36 compute units.
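
A quick way to confirm the patch after a restart is to pull the "Compute units" line out of the clpeak output. This is a minimal sketch, assuming clpeak prints the field in the format quoted in this thread; compute_units is a hypothetical helper that reads clpeak output on stdin, and the sample text is the output posted above.

```shell
#!/bin/sh
# Hypothetical helper: print the number on every "Compute units" line of stdin.
compute_units() {
    awk -F: '/Compute units/ { gsub(/[^0-9]/, "", $2); print $2 }'
}

# Sample taken from the clpeak output quoted in this thread (before restart).
sample='  Device: AMD Radeon HD Baffin Unknown Prototype Compute Engine
    Driver version  : 1.2 (Dec  9 2016 21:43:55) (Macintosh)
    Compute units   : 16
    Clock frequency : 1266 MHz'

printf '%s\n' "$sample" | compute_units    # prints 16
```

On the Mac itself the same function could be fed directly from the benchmark, e.g. clpeak | compute_units, which prints one line per detected device.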

casual_g
(@casual_g)
Eminent Member
Joined: 3 years ago
 

I find that performance of the GPU is much better in Windows. Games have higher FPS.

I tried searching about performance but didn't find much comment. Upon restarting, the number of CUs went up to 36, but as previously discussed, graphical performance didn't improve much.

Is it because the OSX drivers that Apple writes for the GPUs are not up to date?

raybanner
(@raybanner)
Eminent Member
Joined: 2 years ago
 

How do I actually make use of the hack? Just put the commands into Terminal? Sorry for the newbie questions.

 

I have a RX460 and also a RX580 I want to test this hack with.

itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 
Posted by: raybanner

How do I actually make use of the hack? Just put the commands into Terminal? Sorry for the newbie questions.

I have a RX460 and also a RX580 I want to test this hack with.

You don’t need this script for RX 460. It’s beneficial for Ellesmere and Fiji GPUs such as the RX 580 and R9 Fury in macOS 10.12 only.

To make use of the enable-baffin-CUs.sh script, navigate to the file’s location (assuming you saved or downloaded it to the ~/Downloads folder), give it executable permission, then run it with these commands in Terminal:

cd ~/Downloads
chmod +x enable-baffin-CUs.sh
sudo ./enable-baffin-CUs.sh ellesmere 36

itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 
Posted by: mistergabs

I might be the dumb one in the room, but after enabling this, my Mac Pro immediately powers down whenever I try to run Geekbench or touch 4K footage in Premiere. It was totally fine with both before this. Any ideas how to DISable the Baffin CU script? Or get it working?

This happens when the GPU doesn’t get enough power. It didn’t crash your Mac Pro when only 16 CUs were running. With all 36 CUs running, the card pulls a lot more power. I’d recommend using both PCIe booster power ports with a dual 6-pin to 8-pin adapter.
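
On the question of disabling the patch: the script overwrites the driver binary in place and leaves only a copy of the original in /tmp, which may not persist, so undoing it requires an unmodified AMDRadeonX4100 binary. Below is a minimal sketch of a backup-and-restore step, assuming the backup is taken before the patch is applied. backup_file, restore_file and the backup location are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original script.
backup_file() {
    # $1 = file to protect, $2 = backup path; never overwrite an existing backup
    [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
    [ -e "$2" ] && { echo "backup already exists: $2" >&2; return 0; }
    cp -p "$1" "$2"
}

restore_file() {
    # $1 = backup path, $2 = file to restore
    [ -f "$1" ] || { echo "no backup at: $1" >&2; return 1; }
    cp -p "$1" "$2"
}

# Intended use on the Mac (run with sudo; the backup path is an assumption):
#   kext=/System/Library/Extensions/AMDRadeonX4100.kext/Contents/MacOS/AMDRadeonX4100
#   backup_file  "$kext" /Users/Shared/AMDRadeonX4100.orig    # before patching
#   restore_file /Users/Shared/AMDRadeonX4100.orig "$kext"    # to undo the patch
#   touch /System/Library/Extensions
#   kextcache -q -update-volume /Volumes/Macintosh\ HD        # as in the script
```

Refusing to overwrite an existing backup is deliberate: running the backup step a second time, after patching, would otherwise replace the clean copy with the patched one.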

delving
(@delving)
New Member
Joined: 2 years ago
 

Is there any use in using this script in 10.13?

itsage
(@itsage)
Famed Member Admin
Joined: 3 years ago
 
Posted by: delving

Is there any use in using this script in 10.13?

There’s no need to use this script in 10.12.6 and 10.13 if you have an RX 470/480/570/580. The script is still useful for Fiji cards such as the R9 Fury/X and R9 Nano.

mac_editor
(@mac_editor)
Famed Member Moderator
Joined: 3 years ago
 
Posted by: itsage
 
There's no need to use this script in 10.12.6 and 10.13 if you have an RX 470/480/570/580. The script is still useful for Fiji cards such as the R9 Fury/X and R9 Nano.

Are you sure this script is not required on macOS 10.12.6? Maybe I don't remember right but when I had set up my eGPU on this release, I needed the patch for all CUs. Will test and check again.

purge-wrangler.sh | purge-nvda.sh | set-eGPU.sh | automate-eGPU EFI Installer
2018 MacBook Pro 15" RP560X + RX 5700 XT (Mantiz Venus)

