10-Nov-2017: Powercolor Devil Box has a H2D firmware fix available here.
25-Aug-2017: The Aorus Gaming Box has a H2D firmware fix available here.
24-April-2017: Asus ROG XG Station 2 has a H2D firmware fix available here.
19-April-2017: Mantiz Venus will ship with H2D-fixed firmware; fix here, updated here.
6-April-2017: AKiTiO Node has released a firmware fix for this issue available here.
As of 30-March-2017, all the TI83-based TB3 enclosures in the buyer's guide, listed below, have been tested and found to deliver only half the critical host-to-device (H2D) bandwidth of Intel's 22Gbps TB3 specification: 1100MiB/s, or 9.22Gbps.
- AKiTiO Node
- Mantiz Venus
- Powercolor Devil Box (new releases are TI83, older ones are TI82)
- Asus ROG XG Station 2
This brings their real-world performance down to less than TB2 levels (1100MiB/s here for TB3 vs 1300MiB/s for TB2). It has a direct impact on bandwidth-dependent gaming FPS when using an external LCD attached to your eGPU. Performance analysis is in the appendix.
What sort of gaming FPS improvement could be had if this half-H2D bandwidth issue was fixed?
Consider that a H2D of 1100MiB/s is 9.22Gbps, which is close to the 8Gbps of x4 1.1. Below is a summary of the games seeing a significant FPS increase with a x4 1.1 -> x8 1.1 bandwidth increase (approx. equivalent to the H2D issue being fixed), as found in techpowerup's GTX1080 PCIe Scaling test @ FHD:
* Hitman: +46% (63.7->93.1) here
* Far Cry Primal: +43% (51.4->73.9) here
* Just Cause 3: +42% (73.7->104.7) here
* Total War Warhammer: +39% (43.2->58.6) here
* Assassin's Creed Syndicate: +29% (55.1->70.9) here
* BF1: +26% (97.8->123.4) here
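The unit conversion behind the comparison above can be sketched as follows. This is a minimal sketch, not part of the original tests: the helper names are hypothetical, and it assumes CUDA-Z's MiB/s means 2^20 bytes per second and that PCIe 1.1 runs 2.5GT/s per lane with 8b/10b encoding.

```python
# Rough bandwidth arithmetic for comparing CUDA-Z H2D figures against PCIe links.
# Helper names are illustrative only; they are not from any tool used in this thread.

BYTES_PER_MIB = 1024 ** 2


def mib_per_s_to_gbps(mib_per_s: float) -> float:
    """Convert a throughput in MiB/s to gigabits per second (10^9 bits/s)."""
    return mib_per_s * BYTES_PER_MIB * 8 / 1e9


def pcie_gen1_payload_gbps(lanes: int) -> float:
    """Usable bandwidth of a PCIe 1.x link: 2.5 GT/s per lane, 8b/10b encoded."""
    return lanes * 2.5 * 8 / 10


if __name__ == "__main__":
    for label, mib in [("half-H2D TI83", 1100), ("best TI82 result", 2200)]:
        print(f"{label}: {mib} MiB/s = {mib_per_s_to_gbps(mib):.2f} Gbps")
    for lanes in (4, 8):
        print(f"x{lanes} 1.1 = {pcie_gen1_payload_gbps(lanes):.0f} Gbps")
```

Under these assumptions the 1100MiB/s figure lands at roughly 9.2Gbps, just above x4 1.1, while 2200MiB/s is a little above x8 1.1, which is why the x4 1.1 -> x8 1.1 scaling results are used as a stand-in.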
How to solve this half-H2D bandwidth performance problem?
1. Purchase a TI82-based enclosure
Do note that while the TI82-based AKiTiO Thunder3 gets a better H2D of 1700MiB/s here, and even 2200MiB/s here, that is still less consistent than the TI82-based Razer Core, which so far hasn't given us a H2D benchmark below 2200MiB/s.
2. Wait for the enclosure vendors to fix the problem with new firmware
The affected TI83-based TB3 enclosures have a better feature set than the TI82-based enclosures, so it may be worth waiting for this performance issue to be fixed. At eGPU.io we've notified the following eGPU.io reps of this problem:
- AKiTiO eGPU.io rep @DanKnight
- Mantiz eGPU.io rep @Mymantiz_John
As Intel FW is likely the culprit, the vendors should aim for at least the 22Gbps H2D performance specced by Intel for TB3.
FYI: Intel has throttled TB3 down to 22Gbps in FW. It is 32Gbps capable.
While at it, I've asked the vendor reps to ask Intel to unleash TB3 from 22Gbps to its full 32Gbps PCIe traffic capacity. The underlying Intel system architecture can support it, as discussed here. That would help counter the very public criticism of TB3 eGPU underperformance in the following video, whose results were quite likely tainted by this half-H2D performance issue:
Appendix: benchmark results identifying the problem
The tested comparison systems and enclosures are:
1. Dell Precision M7510 - TI82 Razer Core vs TI83 AKiTiO Node - H2D 2124MiB/s vs 1126MiB/s
TI82 Razer Core results from here.
TI83 AKiTiO Node results from here.
2. ASUS UX501VW - TI82 AKiTiO Thunder3 vs TI83 AKiTiO Node - H2D 2081MiB/s vs 1144MiB/s
TI82 AKiTiO Thunder3 results from here.
TI83 AKiTiO Node results from here.
3. 2016 13" Macbook from @Goalque email - TI82 Devil Box vs TI83 AKiTiO Node - H2D 1625MiB/s vs 1108MiB/s
Above: TI82 Powercolor Devil Box
Above: TI83 AKiTiO Node
@Goalque verified these results using Matlab, which confirms CUDA-Z is working properly:
Above: TI82 Powercolor Devil Box. Peak send speed is 1.75874 GB/s
Above: TI83 AKiTiO Node. Peak send speed is 1.25196 GB/s
The 2016 13" MBP + Powercolor Devil Box enclosure gets a H2D of 1625MiB/s, compared to the Node's 1108MiB/s. That is not quite the top-end 2200MiB/s H2D we've seen, which suggests either that Apple has throttled the TB3 notebook firmware or that the Devil Box is also not delivering maximum performance. Confirming the latter would require comparative benchmarking with a PCIe TB3 SSD to see if it can hit 2200MiB/s.
4. MSI GS63VR. Another TI83-based AKiTiO Node half-H2D result here.
5. TI83-based Mantiz Venus is half-H2D affected here.
6. TI83-based Asus ROG XG Station 2 is half-H2D affected here.
7. TI83-based AKiTiO Node is half-H2D affected with an AMD RX470 here as tested with OpenCL benchmarking.
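To put the three paired results above side by side, here is a small sketch that computes how much of the TI82 H2D figure the TI83 Node delivered on each host. The figures are copied from items 1 to 3; the function and variable names are hypothetical.

```python
# Paired H2D results from the appendix: (host system, TI82 MiB/s, TI83 MiB/s).
PAIRED_H2D_MIB_S = [
    ("Dell Precision M7510", 2124, 1126),   # Razer Core vs AKiTiO Node
    ("ASUS UX501VW", 2081, 1144),           # AKiTiO Thunder3 vs AKiTiO Node
    ('2016 13" MacBook Pro', 1625, 1108),   # Powercolor Devil Box vs AKiTiO Node
]


def ti83_share(ti82_mib_s: float, ti83_mib_s: float) -> float:
    """Fraction of the TI82 enclosure's H2D bandwidth that the TI83 one reached."""
    return ti83_mib_s / ti82_mib_s


if __name__ == "__main__":
    for system, ti82, ti83 in PAIRED_H2D_MIB_S:
        print(f"{system}: TI83 reaches {ti83_share(ti82, ti83):.0%} of the TI82 H2D")
```

On the two Windows notebooks the Node lands at a little over half of the TI82 result; on the MacBook Pro the gap is smaller only because the TI82 Devil Box itself came in below 2200MiB/s.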
eGPU Setup 1.35 • eGPU Port Bandwidth Reference Table
2015 15" Dell Precision 7510 (Q M1000M) [6th,4C,H] + GTX 1080 Ti @32Gbps-M2 (ADT-Link R43SG) + Win10 1803 [build link]
Hi Nando4,
Is my result normal?
AKiTiO Node, late 2016 MacBook Pro 13" Touch Bar, GTX 1050 Ti
Thunderbolt Bus 1:
Vendor Name: Apple Inc.
Device Name: MacBook Pro
UID: 0x0001533671212601
Route String: 0
Firmware Version: 15.14
Domain UUID: E783E973-F7B6-2F66-BBCB-1BEE04A41622
Port:
Status: No device connected
Link Status: 0x101
Speed: Up to 40 Gb/s x1
Current Link Width: 0x1
Receptacle: 1
Link Controller Firmware Version: 0.17.0
Port:
Status: No device connected
Link Status: 0x101
Speed: Up to 40 Gb/s x1
Current Link Width: 0x1
Receptacle: 20
Link Controller Firmware Version: 0.17.0
My questions are:
1. Can this also be tested in Heaven or Valley?
2. Can this be felt during gaming?
3. Is there a second way to measure the Thunderbolt speed besides CUDA-Z?
4. Is it related to the AKiTiO Node cable? I would try the Belkin cable.
5. If it's an Apple restriction, do we have the right to ask Apple to correct it, because it's not what they promised?
Late Macbook Pro 2016 13' touch bar + AKITIO node + GTX 1050TI 4G Windows
You too are seeing underperforming results. This is confirmed to be an enclosure issue by @Goalque's results (a new addition to the opening post), which show a Powercolor Devil Box delivering 1625MiB/s H2D on a 2016 13" MBP whereas a Node delivers 1108MiB/s.
It's worth noting that the TI82-equipped enclosures have so far yielded higher host-to-device numbers. Could this be a firmware restriction on the TI83 controller?
2020 14" MSI Prestige 14 EVO [11th,4C,G] + RTX 3080 @ 32Gbps-TB4 (AORUS Gaming Box) + Win10 2004 [build link]
This has nothing to do with macOS. The results are the same on Windows. The firmware version of my TI82-equipped Devil Box might explain the ~500MiB/s difference.
automate-eGPU EFI ● apple_set_os.efi
Mid 2015 15-inch MacBook Pro eGPU Master Thread
2018 13" MacBook Pro [8th,4C,U] + Radeon VII @ 32Gbps-TB3 (ASUS XG Station Pro) + Win10 1809 [build link]
My results on Akitio Node to tb2 MBP mid 2015 m370x

CUDA-Z Report
=============
Version: 0.10.251 64 bit http://cuda-z.sf.net/
OS Version: Windows x86 6.2.9200
Driver Version: 378.92
Driver Dll Version: 8.0 (6.14.13.7892)
Runtime Dll Version: 6.50

Core Information
----------------
Name: GeForce GTX 980 Ti
Compute Capability: 5.2
Clock Rate: 1240.5 MHz
PCI Location: 0:11:0
Multiprocessors: 22 (2816 Cores)
Threads Per Multiproc.: 2048
Warp Size: 32
Regs Per Block: 65536
Threads Per Block: 1024
Threads Dimensions: 1024 x 1024 x 64
Grid Dimensions: 2147483647 x 65535 x 65535
Watchdog Enabled: Yes
Integrated GPU: No
Concurrent Kernels: Yes
Compute Mode: Default
Stream Priorities: Yes

Memory Information
------------------
Total Global: 6144 MiB
Bus Width: 384 bits
Clock Rate: 3505 MHz
Error Correction: No
L2 Cache Size: 48 KiB
Shared Per Block: 48 KiB
Pitch: 2048 MiB
Total Constant: 64 KiB
Texture Alignment: 512 B
Texture 1D Size: 65536
Texture 2D Size: 65536 x 65536
Texture 3D Size: 4096 x 4096 x 4096
GPU Overlap: Yes
Map Host Memory: Yes
Unified Addressing: Yes
Async Engine: Yes, Bidirectional

Performance Information
-----------------------
Memory Copy
Host Pinned to Device: 1259.09 MiB/s
Host Pageable to Device: 1159.32 MiB/s
Device to Host Pinned: 1341.19 MiB/s
Device to Host Pageable: 1193.47 MiB/s
Device to Device: 107.563 GiB/s
GPU Core Performance
Single-precision Float: 6869.32 Gflop/s
Double-precision Float: 225.315 Gflop/s
64-bit Integer: 359.719 Giop/s
32-bit Integer: 2020.52 Giop/s
24-bit Integer: 1443.23 Giop/s

Generated: Mon Mar 27 22:00:55 2017