
For GPU TB3 box with daisychain connector: Daisychain SSD/Ethernet after GPU, vs. connect SSD/Eth. to laptop's second Thunderbolt, what's the speed difference?  


cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

Hi,

I have a ThinkPad X1 Yoga (2017, 2nd generation). Its two Thunderbolt ports are fed by four PCIe 3.0 lanes from the laptop. As on all U-series (low-power) ultracompact laptops, the PCIe lanes come from the PCH, which is located on the CPU package, so this should apparently be fast in general.

If I connected my GeForce 1080 Ti or Vega 64 (which I use to run an 8K monitor for web browsing, office work and video playback) via some generic PCIe expansion box that is not "certified", and then connected a Thunderbolt SSD or Ethernet adapter to the GPU box's daisy-chain connector (daisy chaining is not allowed on certified boxes, alas), could this ever limit performance compared to connecting the SSD/Ethernet to the laptop's second Thunderbolt port?

(Also, if there are any good GPU boxes of this kind that also provide >=60 W charging of the laptop over Thunderbolt, please let me know!)

Please let me know, thanks!

cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

I think I found the answer:

The two Thunderbolt 3 connectors on a laptop are driven by one TB3 controller. The TB3 controller is connected to the PCH on the CPU package, or directly to the CPU, via x4 PCIe lanes, so the TB3 controller has 32 Gbps of total bandwidth to the PCH/CPU ("upstream").

Each TB3 connector has a 22 Gbps bandwidth limit ("downstream"). From what I understand, this is a cap in firmware that is maintained as a matter of Intel policy. So weird.

This means that if you saturate one TB3 connector with 22 Gbps, the controller still has 32 - 22 = 10 Gbps of bandwidth left to give to the other connector (downstream).

If you plug an eGPU into your laptop, it will need as much of the TB3 connector's 22 Gbps as it can get, because a GPU's native bandwidth in a PC (a PCIe 3.0 x16 slot) is really 126 Gbps, so 22 Gbps is already very narrow. This is why you should always connect the eGPU as the only device on a TB3 connector of your laptop.

(Last, note that light laptops as of 2018 have a system-global bandwidth of 32 Gbps, which is the CPU-PCH link, called DMI.)

All Gbps numbers above are full-duplex, as in there's 22 Gbps up and 22 Gbps down.

Daisy chaining in itself does not appear to come with any notable overhead anymore as of Thunderbolt 3. Please correct me if I'm wrong.
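
The arithmetic above can be sketched in a few lines of Python. This is only a sketch of the model I am assuming (32 Gbps upstream shared by the two ports, 22 Gbps cap per port). The constant and function names are made up for illustration, and whether the controller really behaves this way is not confirmed.

```python
# Sketch of the ASSUMED bandwidth model, not a confirmed description of Intel's controller.
UPSTREAM_GBPS = 32.0  # x4 PCIe 3.0 link to the PCH/CPU, rounded figure used in this post
PORT_CAP_GBPS = 22.0  # per-port PCIe data cap


def remaining_for_other_port(used_on_first_gbps: float) -> float:
    """Bandwidth left for the second port under the assumed model."""
    used = min(used_on_first_gbps, PORT_CAP_GBPS)  # the first port cannot exceed its own cap
    return min(PORT_CAP_GBPS, UPSTREAM_GBPS - used)


print(remaining_for_other_port(22.0))  # 10.0
print(remaining_for_other_port(16.0))  # 16.0
```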

cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

Err, actually, after talking to one more guy, I realize that the total bandwidth of a Thunderbolt 3 controller is not clear from Intel's materials.

For this reason, I have now contacted Intel via https://thunderbolttechnology.net/contact and posted this question:

Subject: Unclarity about total PCIe data bandwidth through dual-port TB3 controller

"Hi!

Intel's dual-port Thunderbolt 3 controllers are JHL6540 and DSL6540.

Upstream (toward the CPU), they are connected using x4 PCIe lanes, meaning 32gbps bandwidth.

 

Meanwhile, one Thunderbolt 3 port (connector) will have only 22 gbps of data bandwidth, i.e. the Thunderbolt 3 controller's downstream bandwidth for one Thunderbolt 3 port is 22gbps (Intel says this here https://thunderbolttechnology.net/sites/default/files/Thunderbolt3_TechBrief_FINAL.pdf).

 

I wonder, for the dual-port Thunderbolt 3 controller, is the controller's total supported PCIe data bandwidth 22 gbps, or is it 32 gbps, so that you could run 32gbps total data over the two Thunderbolt 3 ports, e.g. 16gbps on first and 16gbps on the second, 22 gbps on the first and 10 gbps on the second, etc. .

 

Or, is the total PCIe data bandwidth for your dual-port Thunderbolt 3 controller 22gbps across both ports, so if the first port uses 16gbps then there's only 6gbps available for the second port?

 

Please clarify, thanks!"

I also posted this question on https://www.reddit.com/r/eGPU/comments/92mabq/is_intels_thunderbolt_3_dual_port_controllers/ .

cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

Nando commented to me by PM,

"It would stand to reason to be 32Gbps PCIe for both ports. That's because it seems Intel reserves 10Gbps on the second port for guaranteed USB-C Gen2 data, which when used for PCIe would share traffic with the first port up to the combined 32Gbps limit of the x4 3.0 link.

However, we do not have a definitive answer there other than if someone performs benchmarks to figure it out. So worth Intel getting back to you on that one."

goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 

A dual TB3 configuration may suffer from reduced H2D (host-to-device) performance, which is why all eGFX Windows-certified products can have only one TB3 port. With a firmware update, it's possible to set the other TB3 port to USB only, but this has its own drawbacks.

automate-eGPU EFI, apple_set_os.efi
--
Mid 2015 15-inch MacBook Pro eGPU Master Thread


cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

@goalque, by "dual TB3" do you mean when one TB3 controller has two ports, or when a TB3 device has a daisy-chain port?

I have now also posted this question on Intel's communities page: https://communities.intel.com/message/557882#557882 and https://communities.intel.com/message/557884#557884 .

goalque
(@goalque)
Noble Member Admin
Joined: 3 years ago
 
Posted by: mois

@goalque, by "dual TB3" do you mean when one TB3 controller has two ports, or when a TB3 device has a daisy-chain port?

I now also posted this question on Intel's communities page,  https://communities.intel.com/message/557882#557882 .

By "dual TB3" I meant the PCIe certified enclosure. If the second port isn't in use, you should be able to reach 22XX MiB/s (H2D), but not always. AFAIK, it also depends on the host computer's TB3 firmware.

Daisy chaining is not recommended:

"2. For optimal performance, eGPUs should be connected directly to your Mac and not daisy-chained through another Thunderbolt device or hub."

https://support.apple.com/en-us/HT208544

cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

@goalque,

First, just for clarity on the numbers: 22XX MB/s (say 2299 MB/s) is 18392 Mbps, which is about 18.4 Gbps. Yeah, that's pretty close to 22 Gbps.
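
For reference, the conversion can be done like this. It is a small sketch with a helper name of my own choosing, using decimal units for the result (1 Gbps = 1000 Mbps, not 1024). If the 22XX figure is really in MiB/s, as it was written, the result comes out a bit higher.

```python
def mb_per_s_to_gbps(mb_per_s: float, binary: bool = False) -> float:
    """Convert MB/s (or MiB/s when binary=True) to decimal Gbps."""
    bytes_per_unit = 1024 ** 2 if binary else 1000 ** 2
    return mb_per_s * bytes_per_unit * 8 / 1e9


print(round(mb_per_s_to_gbps(2299), 2))               # 18.39
print(round(mb_per_s_to_gbps(2299, binary=True), 2))  # 19.29
```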

I see what you meant now: the aggregate bandwidth of one Thunderbolt 3 daisy chain is 22XX MB/s (about 18 Gbps). I knew that; it can be read from https://thunderbolttechnology.net/sites/default/files/Thunderbolt3_TechBrief_FINAL.pdf .

What I wanted to ask you was this: the Thunderbolt 3 controller in the laptop drives TWO Thunderbolt 3 ports.

If I connect to EACH of the TWO Thunderbolt 3 ports, a device that is constantly pushing 2000 MB/sec (16gbps) to the CPU, will all those 4000MB/sec (32gbps) reach the CPU or will the dual-port Thunderbolt 3 controller in the laptop cap the bandwidth?

Similarly, say that my CPU is trying to push 2000 MB/sec (16gbps) to the first Thunderbolt 3 device which is on the first Thunderbolt 3 port, and another 2000 MB/sec (16gbps) on the second Thunderbolt 3 device which is on the second Thunderbolt 3 port, will the dual-port Thunderbolt 3 controller be able to process all that data, or would it cap the bandwidth so that the total amount of data reaching my two Thunderbolt 3 devices would be less than the 2000 + 2000 = 4000 MB/sec (16 + 16 gbps = 32gbps) that I wanted them to receive?
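
To make the question concrete, here is a small sketch of the two possibilities I am asking about. The two controller cap values (22 Gbps total vs. 32 Gbps total) are hypotheses, not known facts, and the function name is made up for illustration.

```python
PORT_CAP_GBPS = 22.0  # per-port cap from Intel's tech brief


def delivered_gbps(demands, controller_cap_gbps):
    """Total Gbps delivered when each port is capped and the controller has a total cap."""
    per_port = [min(d, PORT_CAP_GBPS) for d in demands]
    return min(sum(per_port), controller_cap_gbps)


demands = [16.0, 16.0]  # 2000 MB/s on each of the two ports
for cap in (22.0, 32.0):  # hypothesis A: 22 Gbps total, hypothesis B: 32 Gbps total
    print(f"controller cap {cap} Gbps -> {delivered_gbps(demands, cap)} Gbps delivered")
```

Under hypothesis A only 22 of the requested 32 Gbps would arrive; under hypothesis B all of it would.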

 

cmm
(@mois)
Trusted Member
Joined: 2 years ago
 

Here is one more aspect of daisy-chaining vs. using an eGPU enclosure that does not have any daisy-chain port:

https://www.reddit.com/r/eGPU/comments/92ndul/do_egpu_boxes_no_daisychain_have_special_firmware/

Please share your thoughts, thanks!

joevt
(@joevt)
Noble Member
Joined: 2 years ago
 
Posted by: mois

If I connect to EACH of the TWO Thunderbolt 3 ports, a device that is constantly pushing 2000 MB/sec (16gbps) to the CPU, will all those 4000MB/sec (32gbps) reach the CPU or will the dual-port Thunderbolt 3 controller in the laptop cap the bandwidth?

Similarly, say that my CPU is trying to push 2000 MB/sec (16gbps) to the first Thunderbolt 3 device which is on the first Thunderbolt 3 port, and another 2000 MB/sec (16gbps) on the second Thunderbolt 3 device which is on the second Thunderbolt 3 port, will the dual-port Thunderbolt 3 controller be able to process all that data, or would it cap the bandwidth so that the total amount of data reaching my two Thunderbolt 3 devices would be less than the 2000 + 2000 = 4000 MB/sec (16 + 16 gbps = 32gbps) that I wanted them to receive?

 

Aren't those two scenarios the same?

PCIe 3.0 x4 max is 31.5 Gbps (3938 MB/s). PCIe protocol overhead reduces that. Thunderbolt port max is 22 Gbps (2750 MB/s). I don't know if Intel caps based only on actual data transmitted (allowing the CPU to see 2750 MB/s) or if they cap on the entire PCIe protocol reducing the bandwidth to less than 2750 MB/s. The latter may be true as I've never seen anything achieve 2750 MB/s over Thunderbolt. 

If you can get around 3270 MB/s total (from a RAID, depending on OS/benchmark used) then that would prove there's no controller cap. Compare with PCIe 3.0 x4, 2.0 x8, or 1.0 x16 NVMe software RAID performance numbers.
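
Those raw link maxima can be checked with a rough sketch like the following. It only accounts for line encoding (8b/10b for PCIe 1.0 and 2.0, 128b/130b for 3.0), not for packet/protocol overhead, and the names are made up for illustration.

```python
# (transfer rate in GT/s per lane, line-encoding efficiency) per PCIe generation
PCIE_GEN = {
    "1.0": (2.5, 8 / 10),
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),
}


def link_max(gen: str, lanes: int):
    """Return (Gbps, MB/s) for a link after line encoding, before protocol overhead."""
    rate, efficiency = PCIE_GEN[gen]
    gbps = rate * efficiency * lanes
    return gbps, gbps * 1000 / 8


for gen, lanes in (("3.0", 4), ("2.0", 8), ("1.0", 16)):
    gbps, mb_per_s = link_max(gen, lanes)
    print(f"PCIe {gen} x{lanes}: {gbps:.1f} Gbps, {mb_per_s:.0f} MB/s")
```

This gives 31.5 Gbps (3938 MB/s) for 3.0 x4 and 32.0 Gbps (4000 MB/s) for 2.0 x8 and 1.0 x16, which is why those three configurations are comparable.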

Mac mini (2018), Mac Pro (Early 2008), MacBook Pro (Retina, 15-inch, Mid 2015), GA-Z170X-Gaming 7, Sapphire Pulse Radeon RX 580 8GB GDDR5, Sonnet Echo Express III-D, Trebleet Thunderbolt 3 to NVMe M.2 case

