There is no denying that a TB3 eGPU can dramatically improve application processing speed, as many examples on this forum show, with gaming on the side being an extra bonus.
But if your primary use case is gaming, is a TB3 eGPU worth it? In this build we compare the performance of an M.2 eGPU against a TB3 eGPU attached to a single system (identical specs, OS, system build) to help answer that question. The M.2 eGPU serves as a maximum-performance reference, before Thunderbolt 3 transport layer encoding is applied.
System specs (model inc screen size, CPU, iGPU, dGPU, operating system user which eGPU is being used)
- 2015 15″ Dell Precision 7510
- Intel i7-6820HQ (max 3.6GHz 4-core with unlocked multis)
- 16GB RAM (2x8GB, dual channel)
- Intel HD530 iGPU + Nvidia Quadro M1000M dGPU
- 2.5″ SATA drive bay for primary boot drive, though the OS build used here was booted off a WinToGo USB NVMe SSD
- M.2 (NVMe SSD) slot is free for M.2 eGPU use
- Windows 10 Professional 1803
eGPU hardware (eGPU enclosure, video card, any third-party TB3 cable, any custom mods)
- ASUS Nvidia GTX 1080 Ti Turbo, flashed with EVGA FTW3 vbios. Has 6P + 8P PCIe power.
- ADT-Link R43SG (50cm) M.2 eGPU adapter
- 2 x 220W Dell DA-2 AC adapters. The 1st powers the 75W slot + 75W 6P using cable (1). The 2nd powers the 150W 8P using cable (2)
– cable (1): an 8P to 6P+8P power splitter cable supplied with the ADT-Link R43SG
– cable (2): a 6P to 8P PCIe extender with power-on split-pin/paper-clip as shown here and here
- M.2 protect board – protects the M.2 slot from being damaged by repeated insertions
GPU power cabling
Installation steps (what did you do to get it all going?)
In use, the M.2 and TB3 eGPUs are functionally equivalent, apart from the latter supporting hotplugging and, of course, the performance differences discussed in the Benchmarks section below.
Nvidia Optimus accelerated internal LCD mode engages for both. Neither interface caused any system instability.
If considering Hackintoshing, an M.2 eGPU is detected on the PCIe bus and, with the correct drivers, can be made to work. E.g. this GTX 1080 Ti has Nvidia web drivers for macOS 10.13.6 or earlier. A TB3 eGPU isn’t so easy: TB3 on a Hackintosh requires SSDT/DSDT mods to emulate the Mac’s native TB3 interface, as has been done on Dell XPS 95xx systems. While TB3 peripherals work there, I’ve yet to see a working TB3 eGPU.
Benchmarks (all at FHD 1920×1080 )
|Dell Precision 7510||M2-ext||TB3-ext||difference||M2-int||TB3-int||difference|
|Forza Horizon 4||105||70||-33.3%||90||64||-28.9%|
|Far Cry 5||93||89||-4.3%||86||78||-9.3%|
|AIDA64 write (MB/s)||2886||2160||-25.2%|
|hwinfo64 PCIe port||port 9||port 5|
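The difference columns above are plain percentage changes of the TB3 result relative to the M.2 result. As a minimal sketch (the function name `pct_diff` is just an illustrative choice), they can be reproduced like this:

```python
def pct_diff(reference: float, measured: float) -> float:
    """Percentage change of `measured` relative to `reference`."""
    return (measured - reference) / reference * 100.0


# (benchmark, M.2 result, TB3 result), copied from the table above
results = [
    ("Forza Horizon 4 (ext)", 105, 70),
    ("Forza Horizon 4 (int)", 90, 64),
    ("Far Cry 5 (ext)", 93, 89),
    ("Far Cry 5 (int)", 86, 78),
    ("AIDA64 write (MB/s)", 2886, 2160),
]

for name, m2, tb3 in results:
    print(f"{name}: {pct_diff(m2, tb3):+.1f}%")
```

A negative value means TB3 is slower than the M.2 reference.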
The Forza Horizon 4 and AIDA64 results show a drop of more than 22% over TB3. Why?
Even though this site refers to TB3 as 32Gbps-TB3 (its electrical link), Intel does disclose that the TB3<->TB3 link is 22Gbps. So let’s take the benchmarks with the greatest difference and run them on a 16Gbps-M2 interface. We’d then expect “22Gbps” TB3 to outperform 16Gbps-M2.
|Dell Precision 7510||32Gbps-M2||32Gbps-TB3||difference||16Gbps-M2||32Gbps-TB3||difference|
|Forza Horizon 4||105||70||-33.3%||74||70||-5.4%|
|AIDA64 write (MB/s)||2886||2160||-25.2%||1461||2160||+47.8%|
|hwinfo64 PCIe port||port 9||port 5||port 9||port 5|
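To put the nominal link rates and the measured AIDA64 write figures on the same scale, here is a rough sketch. It assumes a simple decimal conversion of 1 Gbps = 125 MB/s and ignores all protocol overhead, so the "ceiling" values are upper bounds rather than achievable throughput; the names `ceiling_mb_s` and `efficiency` are illustrative only.

```python
MB_PER_GBIT = 125.0  # assumption: 1 Gbps = 125 MB/s (decimal), no protocol overhead

# name -> (nominal link rate in Gbps, measured AIDA64 write in MB/s from the table above)
links = {
    "32Gbps-M2": (32, 2886),
    "22Gbps TB3": (22, 2160),
    "16Gbps-M2": (16, 1461),
}


def ceiling_mb_s(gbps: float) -> float:
    """Upper bound on throughput in MB/s for a nominal link rate in Gbps."""
    return gbps * MB_PER_GBIT


def efficiency(gbps: float, measured_mb_s: float) -> float:
    """Fraction of the nominal ceiling that the measurement reaches."""
    return measured_mb_s / ceiling_mb_s(gbps)


for name, (gbps, measured) in links.items():
    print(f"{name}: ceiling {ceiling_mb_s(gbps):.0f} MB/s, "
          f"measured {measured} MB/s, {efficiency(gbps, measured):.0%} of ceiling")
```

On this measure the AIDA64 write result for TB3 sits where a 22Gbps link would be expected to land, between the two M.2 configurations. The Forza Horizon 4 result is what does not fit that picture.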
Forza Horizon 4 shows a significant 33.3% decrease in FPS, beyond what we’d expect given TB3’s 25.2% decrease in AIDA64 write (H2D) bandwidth. Oddly, even a 16Gbps-M2 interface with supposedly less bandwidth outperforms TB3. TB3 should be just a PCIe encode/decode transport pair, so why are we seeing such a performance decrease over TB3?
We find clues by using the bandwidthTest.exe tool included with DaVinci Resolve, running the command line below with the eGPU connected in turn on a 32Gbps-M2, 32Gbps-TB3 and 16Gbps-M2 interface.
"C:\Program Files\Blackmagic Design\DaVinci Resolve\bandwidthTest.exe" --htod --mode=shmoo --csv > out.csv
We then review the gathered bandwidth versus block size information where we see:
- TB3 reduces the 32Gbps PCIe bandwidth by anywhere from 25% to 63.5%.
- The greatest reduction occurs at small block sizes, with 2KB seeing the largest at 63.5%.
- TB3 is outperformed by 16Gbps-M2 up to an 8KB block size, only matching its performance at 16KB and outperforming it thereafter.
- It appears that Forza Horizon 4 is bandwidth bound and uses small block transfer sizes.
- The minimum 25% reduction occurs at block sizes of 200KB and greater, so TB3 is performance-optimized for large block sizes.
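The per-block-size comparison above can be scripted once each interface has its own CSV. The sketch below is hedged: it assumes each output line looks roughly like `bandwidthTest-H2D-Pinned, Bandwidth = 1234.5 MB/s, Time = 0.00001 s, Size = 2048 bytes, NumDevsUsed = 1` (some versions may report GB/s instead), so check it against your own out.csv. All function names are hypothetical helpers, not part of the tool.

```python
import re

# Assumed line shape (verify against your own CSV):
#   "..., Bandwidth = <number> MB/s|GB/s, ..., Size = <bytes> bytes, ..."
LINE_RE = re.compile(
    r"Bandwidth\s*=\s*([\d.]+)\s*(MB/s|GB/s).*?Size\s*=\s*(\d+)\s*bytes"
)


def parse_shmoo(text: str) -> dict:
    """Return {block size in bytes: bandwidth in MB/s} from shmoo CSV text."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            bandwidth = float(match.group(1))
            if match.group(2) == "GB/s":
                bandwidth *= 1000.0  # normalise to MB/s
            result[int(match.group(3))] = bandwidth
    return result


def reduction_pct(reference: dict, other: dict) -> dict:
    """Per-block-size reduction of `other` relative to `reference`, in percent."""
    shared = sorted(reference.keys() & other.keys())
    return {s: (reference[s] - other[s]) / reference[s] * 100.0 for s in shared}


def first_size_at_or_above(a: dict, b: dict):
    """Smallest shared block size at which run `a` is at least as fast as run `b`."""
    for size in sorted(a.keys() & b.keys()):
        if a[size] >= b[size]:
            return size
    return None
```

For example, with the three runs saved under names of your choosing, `reduction_pct(parse_shmoo(m2_text), parse_shmoo(tb3_text))` gives the TB3 reduction at each block size, and `first_size_at_or_above(tb3, m2_16)` gives the block size at which TB3 catches up with 16Gbps-M2.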
Comments (eg: how has the eGPU improved your workflow or gaming)
Is a TB3 eGPU worth it for gaming? We’ve shown that the additional TB3 transport layer decreases x4 3.0 32Gbps bandwidth by anywhere from 25% to as much as 63.5%, with bandwidth-bound apps/games registering this performance reduction. The reference 32Gbps-M2 interface itself has only a quarter of the bandwidth of an Intel desktop.
Is the problem simply the TB3 controller being clocked too slow? Maybe. We do see that TB3 is optimized for large block sizes, as used for data transfer on SSDs. Intel certainly has plenty of room to improve this small block transfer performance in TB4.
Q: How to maximize gaming performance on a notebook? From this build and performance analysis we can suggest:
- seek a notebook with a decent dGPU
- seek a candidate system offering a factory direct PCIe eGPU interface like an Alienware Graphics Amplifier port
- cobble together an eGPU using the NVMe SSD’s M.2 slot, also a direct PCIe eGPU interface, as shown in this build
- obtain a TB3 eGPU, now knowing its performance limitations on bandwidth-bound games/apps