PDF guide and patches for making Linux v5.3 kernel to work with Thunderbolt 3 add-in card.  


4GDecodingAbove
(@4gdecodingabove)
New Member
Joined: 1 year ago
 

Hi

@karatekid430,

I have a MacBook Air (mid 2015, i7, 8 GB of RAM), and I use Clover for dual boot.

Step 1)

When I boot Windows 10 with all the drivers from Intel (MPSS 4.4.0), I hit the resource problem (error 12, with the message that the driver can't load).
I did the DSDT override, but if I understand correctly I need 64-bit addressing.
Can we modify the part below for a 64-bit address? (I tried, and Windows reboots in a loop via Boot Camp.)

QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x0000000000000000, // Granularity
    0x0000000C20000000, // Range Minimum, set to 48.5 GB
    0x0000000E0FFFFFFF, // Range Maximum, set to 56.25 GB <--- can I try to modify this value?
    0x0000000000000000, // Translation Offset
    0x00000001F0000000, // Length = Range Maximum - Range Minimum + 1
    ,, , AddressRangeMemory, TypeStatic)
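The arithmetic behind the descriptor above can be checked with a short sketch. It assumes the window is inclusive at both ends, so the length is maximum minus minimum plus one; the helper name `window_length` is made up for illustration.

```python
GIB = 1 << 30  # bytes per GiB

def window_length(range_min: int, range_max: int) -> int:
    """Length in bytes of an inclusive address window [range_min, range_max]."""
    if range_max < range_min:
        raise ValueError("range_max must not be below range_min")
    return range_max - range_min + 1

# Values copied from the QWordMemory descriptor in the post.
range_min = 0x0000000C20000000
range_max = 0x0000000E0FFFFFFF
length = window_length(range_min, range_max)

print(hex(length))                # 0x1f0000000
print(range_min / GIB)            # 48.5
print((range_max + 1) / GIB)      # 56.25
print(length / GIB)               # 7.75
```

If the Range Maximum is changed, the Length field has to be recomputed the same way so that the three values stay consistent.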

Step 2)

I ran a test by booting Ubuntu to get more information with the lspci command, and I got this:

0a:00.0 Co-processor: Intel Corporation Device 2262 (rev ca)
Subsystem: Intel Corporation Device 7504
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 0
Region 0: Memory at c1700000 (32-bit, non-prefetchable) [disabled] [size=256K]
Region 2: Memory at <unassigned> (64-bit, prefetchable) [disabled] <--- (problem here)
Capabilities: <access denied>

0a:00.1 System peripheral: Intel Corporation Device 2264 (rev ca)
Subsystem: Intel Corporation Device 7504
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin B routed to IRQ 0
Region 0: Memory at c1740000 (32-bit, non-prefetchable) [disabled] [size=8K]
Capabilities: <access denied>
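For longer listings, unassigned regions like the one flagged above can be picked out automatically. This is a minimal sketch that assumes the text comes from `lspci -v` or `lspci -vv` and has the same shape as the output in this post; the function name `unassigned_regions` is made up for illustration.

```python
import re

# Device header, e.g. "0a:00.0 Co-processor: ..." (PCI domain prefix optional).
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s", re.I)
# Region entry, e.g. "Region 2: Memory at <unassigned> (64-bit, prefetchable)".
REGION_RE = re.compile(r"Region (\d+): Memory at (\S+) \(([^)]*)\)")

def unassigned_regions(lspci_text):
    """Return (device, region number, attributes) for every unassigned memory region."""
    results = []
    device = None
    for line in lspci_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            device = header.group(1)
            continue
        # finditer copes with several regions that ended up on one line.
        for region in REGION_RE.finditer(line):
            number, address, attrs = region.groups()
            if address == "<unassigned>":
                results.append((device, int(number), attrs))
    return results

# Trimmed copy of the listing from the post.
SAMPLE = """\
0a:00.0 Co-processor: Intel Corporation Device 2262 (rev ca)
\tRegion 0: Memory at c1700000 (32-bit, non-prefetchable) [disabled] [size=256K]
\tRegion 2: Memory at <unassigned> (64-bit, prefetchable) [disabled]
0a:00.1 System peripheral: Intel Corporation Device 2264 (rev ca)
\tRegion 0: Memory at c1740000 (32-bit, non-prefetchable) [disabled] [size=8K]
"""

print(unassigned_regions(SAMPLE))
```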

I took note of the changes in the 5.6 RC kernel. If I update my Ubuntu to this kernel and play with the parameters, will it solve my problem?

And can we modify the UEFI to get something like "Above 4G Decoding" on the MacBook Air?

 

For information, I have an AKiTiO eGPU box with a TB3-to-TB2 adapter.

I was able to use a Quadro M6000 and an MSI GeForce GTX 1050 Ti Gaming X 4G in it. (I am trying to test an NVIDIA K80 24G, and I get "resource error 12", as with the Xeon Phi.)

Thanks

 

 

andrejpodzimek
(@andrejpodzimek)
New Member
Joined: 5 months ago
 
Posted by: @karatekid430

Oh, I forgot to mention - all of my kernel patches have been accepted into mainline with Linux v5.6, with the first RC released about one month ago. For now, Linux v5.6 is still a release candidate and not stable. But if you are comfortable using an RC, then install it with the Gigabyte GC-TITAN RIDGE and add this to the kernel command line:

pcie_ports=native pci=assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=512M,nocrs

Thanks a lot for this piece of advice. This helped me with my ASRock X570 Creator connected to a Razer Core X Chroma eGPU with an NVidia Quadro P5000 in it. I couldn't make it work with the desktop. It did work with Linux laptops without issues, but not with the ASRock X570 Creator desktop. I asked about it in ArchLinux, NVidia and Razer forums and contacted both ASRock's and Razer's technical support, but to no avail. Eventually this did the trick:

pcie_ports=native pci=assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=16G

What I'm still unsure about:

  • Whether and why pcie_ports=native is necessary; I have all of AER, IOMMU and SR-IOV enabled in the UEFI setup anyway, so no extra overrides should be needed.
  • Whether a higher value in hpmmiosize=128M could improve performance or what the best value is, in general. If all (?) devices must fit in this 4G address space, setting it too big may render some of them inoperable, right?
  • What hpmmioprefsize=16G should be (and to what extent it matters); I've just picked what happens to be the GPU's RAM size, but perhaps this is not directly related. It works with 512M as well.
  • Whether nocrs is needed. My machine won't boot if I add nocrs to pci=..., because the kernel can't talk to SATA controllers and drives and freezes forever trying to do so. The eGPU works fine without nocrs.
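To reason about these sizes it can help to turn the `pci=` options into numbers. The sketch below splits the command line from this post and converts the K/M/G/T suffixes to bytes; the helper names `parse_size` and `parse_pci_options` are made up for illustration, and the parsing is a simplification rather than the kernel's own logic.

```python
GIB = 1 << 30
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}

def parse_size(text):
    """Convert '128M', '16G' or '0x33' to an integer (bytes when a suffix is present)."""
    text = text.strip()
    if text and text[-1].upper() in SUFFIXES:
        return int(text[:-1], 0) * SUFFIXES[text[-1].upper()]
    return int(text, 0)

def parse_pci_options(cmdline):
    """Collect the comma-separated options of every pci=... token into a dict."""
    options = {}
    for token in cmdline.split():
        if not token.startswith("pci="):
            continue
        for item in token[len("pci="):].split(","):
            key, sep, value = item.partition("=")
            options[key] = value if sep else True  # bare flags such as 'realloc'
    return options

# Command line that worked on the ASRock X570 Creator, as quoted above.
CMDLINE = ("pcie_ports=native "
           "pci=assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=16G")

opts = parse_pci_options(CMDLINE)
for key in ("hpmmiosize", "hpmmioprefsize"):
    size = parse_size(opts[key])
    print(key, size, "fits below 4 GiB" if size <= 4 * GIB else "larger than 4 GiB")
```

A window larger than 4 GiB, such as hpmmioprefsize=16G, cannot be placed in 32-bit address space at all, so it can only be satisfied from 64-bit space if the firmware and kernel expose any.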

Razer Core X Chroma + NVidia Quadro P5000


andrejpodzimek
(@andrejpodzimek)
New Member
Joined: 5 months ago
 
  • What hpmmioprefsize=16G should be (and to what extent it matters); I've just picked what happens to be the GPU's RAM size, but perhaps this is not directly related. It works with 512M as well.

This^^^ is actually quite confusing upon closer look. I've just tried to experiment with the numbers and noticed that I was getting "no space for" errors in dmesg most of the time.

  • With hpmmiosize=256M,hpmmioprefsize=64G there were lots of errors. Here's a full dmesg and the part of dmesg after the eGPU is plugged in. But the eGPU worked.
  • With hpmmiosize=128M,hpmmioprefsize=16G there were far fewer such errors, affecting only BAR 15 this time. The eGPU worked.
  • With hpmmiosize=128M,hpmmioprefsize=512M the errors looked the same as in the case above. The eGPU worked.
  • With hpmmiosize=128M,hpmmioprefsize=256M the errors went away. This appears to be a threshold of some sort. But the eGPU did not work.

It looks like hpmmiosize doesn't have much of an impact (I tried 32M, 64M, 128M and 256M, and it didn't make an obvious difference), but hpmmioprefsize renders the eGPU inoperable below 512M, yet causes "no space for" errors at and above 512M.

Admittedly, I have no clue how serious the "no space for" errors actually are, and I haven't spotted any malfunction among my PCIe devices. However, I assume that there is a performance penalty if the MMIO setup fails. (?)
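Comparing runs like the ones above is easier with a count of the errors per device and BAR. This sketch assumes kernel messages of the shape `pci <address>: BAR <n>: no space for [mem size ...]`, which is how kernels of this era word them; the sample lines are made up for illustration and are not taken from the dmesg output mentioned in this post.

```python
import re
from collections import Counter

# Matches e.g. "pci 0000:00:01.1: BAR 15: no space for [mem size 0x400000000 64bit pref]".
NO_SPACE_RE = re.compile(r"(\S+): BAR (\d+): no space for \[(\w+) size (0x[0-9a-fA-F]+)")

def count_no_space(dmesg_text):
    """Count 'no space for' messages per (device address, BAR number)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = NO_SPACE_RE.search(line)
        if match:
            device, bar, _kind, _size = match.groups()
            counts[(device, int(bar))] += 1
    return counts

# Illustrative lines only (hypothetical addresses, timestamps and sizes).
SAMPLE = """\
[   12.000001] pci 0000:00:01.1: BAR 15: no space for [mem size 0x400000000 64bit pref]
[   12.000002] pci 0000:00:01.1: BAR 15: failed to assign [mem size 0x400000000 64bit pref]
[   12.000003] pci 0000:05:00.0: BAR 15: no space for [mem size 0x400000000 64bit pref]
"""

for (device, bar), n in sorted(count_no_space(SAMPLE).items()):
    print(f"{device} BAR {bar}: {n} message(s)")
```

Running it on the saved dmesg of each parameter combination would show whether only BAR 15 is affected or other BARs as well.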

All in all, the eGPU works now, but there's still something wrong with this configuration.


Razer Core X Chroma + NVidia Quadro P5000


onqbau
(@onqbau)
New Member
Joined: 3 weeks ago
 

I was very excited to find this thread. I acquired a Gigabyte GC-Titan Ridge 2.0 card and followed the directions.

Thank you @karatekid430

Working with a dual CPU Supermicro X10DAC.

Running Linux 5.10.31 kernel (Gentoo), on boot I see errors like this:

  • thunderbolt 0000:83:00.0: interrupt for TX ring 0 is already enabled
    • details, call trace, etc
  • thunderbolt 0000:83:00.0: interrupt for RX ring 0 is already enabled
    • details, call trace, etc
  • thunderbolt 0000:83:00.0: ICM firmware is in wrong mode: 15

There is a lot more detail which I can post but I wanted to ask if anyone recognized this?

[Under Windows the card seems to work correctly.  I plug in a test device and it is recognized.]

 

 

karatekid430
(@karatekid430)
Estimable Member
Joined: 4 years ago
 

@onqbau, if you followed the directions, you would have built a Linux v5.3 kernel. Since v5.6, all my changes have been in mainline, so you do not need to build one again.

I would update to the latest v5.11 or even try v5.12-rc8, just in case. But these TX/RX ring errors are not something I usually see unless I am doing something insane, or as a freak occurrence. If it persists, then you might want to open a bug report, which will make its way to Mika Westerberg at Intel, but I am not sure if he will help with an unsupported system configuration. At least it is worth a shot.

 

onqbau
(@onqbau)
New Member
Joined: 3 weeks ago
 

@karatekid430, I tested with v5.11.15 and v5.12.0-rc8, with no change in behavior. Moving the Titan Ridge card from the CPU1 PCIe bus onto the CPU0 PCIe bus solved the problem. While the card worked under Windows on the second CPU's bus, it appears that Linux only supports the card on the CPU0 bus. Thank you for the help!
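On a dual-socket board, one way to see which CPU a slot hangs off is the `numa_node` attribute that Linux exposes in sysfs for each PCI device. The sketch below reads it for the address from the error messages earlier in the thread; the function name `read_numa_node` is made up for illustration, and mapping node 0 to CPU0 and node 1 to CPU1 is an assumption about this board, not something confirmed in the thread.

```python
from pathlib import Path

def read_numa_node(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node reported for a PCI device, or None if it is not present.

    The kernel reports -1 when it has no NUMA information for the device.
    """
    path = Path(sysfs_root) / pci_address / "numa_node"
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None

if __name__ == "__main__":
    # 0000:83:00.0 is the Thunderbolt controller address from the boot errors above.
    print(read_numa_node("0000:83:00.0"))
```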

 
