Can't communicate with Driver - Ubuntu on a Macbook with Nvidia eGPU
 

Niklas j.
(@niklas_j_)
New Member
Joined: 1 week ago
 

I'm trying to get my eGPU to work. I have a fresh install of Ubuntu 20.04 LTS on a MacBook Pro A1707 (the GPU was disconnected during installation). Afterwards I reconnected it and installed the correct driver through the Ubuntu Software Manager. But here is where it gets weird:

lspci shows the GPU correctly, but nvidia-smi prints the following:


NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Also dmesg prints:

[ 110.251087] nvidia-gpu 0000:b9:00.3: enabling device (0000 -> 0002)

[ 110.430678] nvidia: loading out-of-tree module taints kernel.

[ 110.430690] nvidia: module license 'NVIDIA' taints kernel.

[ 110.430691] Disabling lock debugging due to kernel taint

[ 110.436084] nvidia: module verification failed: signature and/or required key missing - tainting kernel

[ 110.442463] nvidia-nvlink: Nvlink Core is being initialized, major device number 234

[ 110.442846] nvidia 0000:b9:00.0: enabling device (0000 -> 0003)

[ 110.442931] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:

NVRM: BAR1 is 0M @ 0x0 (PCI:0000:b9:00.0)

[ 110.442931] NVRM: The system BIOS may have misconfigured your GPU.

[ 110.442934] nvidia: probe of 0000:b9:00.0 failed with error -1

[ 110.442947] NVRM: The NVIDIA probe routine failed for 1 device(s).

[ 110.442947] NVRM: None of the NVIDIA devices were initialized.

[ 110.443041] nvidia-nvlink: Unregistered the Nvlink Core, major device number 234

[ 110.687610] nvidia-nvlink: Nvlink Core is being initialized, major device number 234

[ 110.688036] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:

NVRM: BAR1 is 0M @ 0x0 (PCI:0000:b9:00.0)

[ 110.688036] NVRM: The system BIOS may have misconfigured your GPU.

[ 110.688040] nvidia: probe of 0000:b9:00.0 failed with error -1

[ 110.688055] NVRM: The NVIDIA probe routine failed for 1 device(s).

[ 110.688055] NVRM: None of the NVIDIA devices were initialized.

Can anyone help me? If you need more information just ask. Thanks in advance!

nu_ninja
(@nu_ninja)
Reputable Member
Joined: 3 years ago
 

@niklas_j_

The relevant error is:

NVRM: BAR1 is 0M @ 0x0

Check out other threads about this error, like this one: https://egpu.io/forums/thunderbolt-linux-setup/tutorial-ubuntu-18-04-rtx-2080-razer-core-v1/
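To pick this error out of a long kernel log, a small filter can help. This is only a sketch: `find_bar_errors` is a made-up helper name, and the pattern assumes the message keeps the wording shown in the dmesg output above.

```shell
#!/usr/bin/env bash
# find_bar_errors is a hypothetical helper: it reads kernel log text on stdin
# and prints only the NVRM lines that report a zero-sized BAR at address 0x0.
find_bar_errors() {
    grep -E 'NVRM: BAR[0-9]+ is 0M @ 0x0'
}

# Typical use (reading the kernel log may require root):
#   sudo dmesg | find_bar_errors
```

If the filter prints nothing after a boot with different kernel parameters, the driver either got a valid BAR or never probed the device at all, so it is worth checking lspci as well.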

Mid-2012 13" Macbook Pro (MacBookPro9,2) TB1 -> RX 460/560 (AKiTiO Node/Thunder2)
+ macOS 10.15+Win10 + Linux Mint 19.1

 
2012 13" MacBook Pro [3rd,2C,M] + RX 460 @ 10Gbps-TB1 (AKiTiO Thunder2) + macOS 10.14.4 [build link]  


Niklas j.
(@niklas_j_)
New Member
Joined: 1 week ago
 

@nu_ninja, hi, thank you for your quick answer. I tried the steps from the thread you linked. However, after adding the parameters to the GRUB config, the eGPU (and the Breakaway Box) was no longer recognised. The connected LAN adapter wasn't detected either, and the NVRM messages in dmesg were gone. Ubuntu also couldn't detect the Thunderbolt capabilities (the Thunderbolt menu in Settings). lspci still showed the Thunderbolt connections, but without the eGPU.

 

nu_ninja
(@nu_ninja)
Reputable Member
Joined: 3 years ago
 

@niklas_j_

Ok, to be clear, does the eGPU show up in the Thunderbolt menu in the settings normally without the kernel parameters? You can use the terminal command:

 boltctl

to show more info on the Thunderbolt devices that have been authorized. Normally I'd say the first step is to make sure that the eGPU has been authorized, but I think Macs have their Thunderbolt security level set to off, so it should always be authorized.
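As a rough sketch of that check, the helper below condenses boltctl output to one line per device. `bolt_status` is a hypothetical name, and it assumes the output uses `name:` and `status:` labels, which is what boltctl typically prints.

```shell
#!/usr/bin/env bash
# bolt_status is a hypothetical helper: it reads `boltctl list` output on stdin
# and prints "<device name>: <status>" for every device it finds.
bolt_status() {
    awk '
        /name:/   { sub(/.*name:[ \t]*/, "");   name = $0 }
        /status:/ { sub(/.*status:[ \t]*/, ""); print name ": " $0 }
    '
}

# Typical use:
#   boltctl list | bolt_status
# The security level itself can be read from sysfs (domain0 is an assumption,
# the domain number may differ on your machine):
#   cat /sys/bus/thunderbolt/devices/domain0/security
```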

You might want to try a different combination of kernel parameters than the thread I linked, for example, you might or might not need pcie_ports=native or assign-busses. Based on the error you posted, I'm fairly sure that pci=nocrs,realloc are needed, but I'm not sure about the other parameters.
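For trying out different combinations, here is a minimal sketch of how the parameters could be added on Ubuntu. It assumes the standard `/etc/default/grub` file with a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT` line; `add_kernel_params` is a hypothetical helper, and `pci=nocrs,realloc` is just the combination mentioned above, not a confirmed fix.

```shell
#!/usr/bin/env bash
# add_kernel_params is a hypothetical helper, not part of GRUB or Ubuntu.
# It appends kernel parameters to GRUB_CMDLINE_LINUX_DEFAULT in the given file.
add_kernel_params() {
    local file="$1" params="$2"
    # Insert the new parameters just before the closing quote of the value.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${params}\"|" "$file"
}

# Typical use on Ubuntu (keep a backup so the change is easy to undo):
#   sudo cp /etc/default/grub /etc/default/grub.bak
#   cp /etc/default/grub /tmp/grub.new
#   add_kernel_params /tmp/grub.new "pci=nocrs,realloc"
#   sudo cp /tmp/grub.new /etc/default/grub
#   sudo update-grub        # regenerate the GRUB config
#   sudo reboot
#   cat /proc/cmdline       # after reboot: confirm the parameters are active
```

Changing one parameter at a time and checking `/proc/cmdline` after each reboot makes it easier to tell which parameter caused the eGPU to disappear.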

Niklas j.
(@niklas_j_)
New Member
Joined: 1 week ago
 

@nu_ninja, yes, without the kernel params the eGPU gets recognized properly.

boltctl output without kernel params:


 Sonnet Technologies, Inc. eGFX Breakaway Box 650 OC
├─ type: peripheral
├─ name: eGFX Breakaway Box 650 OC
├─ vendor: Sonnet Technologies, Inc.
├─ uuid: 00918a40-016a-0800-ffff-ffffffffffff
├─ status: authorized
│ ├─ domain: cd010000-0080-7f08-a3ea-9b04c8a3c81e
│ └─ authflags: none
├─ authorized: Fr 16 Okt 2020 05:55:12 UTC
├─ connected: Fr 16 Okt 2020 05:55:12 UTC
└─ stored: Do 15 Okt 2020 13:27:24 UTC
   ├─ policy: iommu
   └─ key: no
 
With your recommended params (pci=nocrs,realloc), the GPU and the LAN adapter aren't recognized.
 
Is it a problem that I can't start the MacBook with the GPU connected? Doesn't Linux need it to be attached already during boot? When I boot with it connected, GRUB isn't even loaded (an EFI issue?).

 

This post was modified 5 days ago
