Ubuntu 19.10 (KDE) freezes some time after eGPU is disconnected (RTX 2070)
 
Nikolay Makhotkin
(@nikolay_makhotkin)
New Member
Joined: 5 months ago
 

Hi everyone! 

I have a Lenovo X1 Carbon 6th Gen (i7-8550U, 16 GB RAM). I recently bought an Aorus Gaming Box with an RTX 2070 and set it up on both Windows 10 and Kubuntu 19.10 without any problems.

I use it for gaming on Windows 10 and for neural networks on Ubuntu (development work for my job).
While there are no problems on Windows 10, I am experiencing some issues on my Ubuntu system. Everything works fine when the system is booted and I connect the eGPU. The problems begin when I disconnect it from my laptop:

 - After disconnecting, hibernate doesn't work for some reason (the screen goes black, but the laptop wakes on any key press; it does not even behave like suspend).
 - After disconnecting, the whole system freezes some time later; only the mouse works. Switching to another terminal with Ctrl+Alt+F2 and so on doesn't work either. However, rebooting with Alt+SysRq (PrtSc) and REISUB works, so a hard reboot is not needed.

Note: I am not using the Nvidia card to output or render to any display on Ubuntu, only for compute (CUDA, neural networks).

There is a message in syslog that might be related to the disconnect, though I am not sure:

kernel: [ 2177.344903] snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD buf size -1

Output of uname -a:

Linux thinkpad 5.3.0-46-generic #38-Ubuntu SMP Fri Mar 27 17:37:05 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Output of nvidia-smi:

+-----------------------------------------------------------------------------+ 
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     | 
|-------------------------------+----------------------+----------------------+ 
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC | 
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. | 
|===============================+======================+======================| 
|   0  GeForce RTX 2070    On   | 00000000:0A:00.0 Off |                  N/A | 
| 58%   32C    P0    36W / 175W |      0MiB /  7982MiB |      0%      Default | 
+-------------------------------+----------------------+----------------------+ 
                                                                               
+-----------------------------------------------------------------------------+ 
| Processes:                                                       GPU Memory | 
|  GPU       PID   Type   Process name                             Usage      | 
|=============================================================================| 
|  No running processes found                                                 | 
+-----------------------------------------------------------------------------+

I can provide more details or any other info that would help solve the problem. Please help.

nu_ninja
(@nu_ninja)
Reputable Member
Joined: 2 years ago
 

I always get kernel panics if I just unplug the eGPU. I think there has to be some switching process that unloads the driver and sends an ACPI signal to the GPU. I haven't tried it yet, but I think Pop!_OS has a robust system for switching Nvidia GPUs, so I would try that.
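A rough sketch of what such a manual detach could look like, for anyone who wants to experiment before unplugging. It only prints the commands rather than running them. The assumptions here are mine and untested on this hardware: that the proprietary driver's modules are named nvidia_uvm, nvidia_drm, nvidia_modeset and nvidia, that nothing is still using the GPU (the nvidia-smi output earlier in the thread shows persistence mode On, which may keep the module busy), and that removing the device through the standard sysfs "remove" file is enough. The helper names sysfs_device_path and detach_plan are made up for illustration.

```python
import re
from typing import List

# Kernel modules of the proprietary NVIDIA driver, dependents first.
# Assumption: these are the modules loaded on the system in question.
NVIDIA_MODULES = ["nvidia_uvm", "nvidia_drm", "nvidia_modeset", "nvidia"]


def sysfs_device_path(bus_id: str) -> str:
    """Convert an nvidia-smi Bus-Id such as '00000000:0A:00.0' into the
    matching sysfs directory, e.g. /sys/bus/pci/devices/0000:0a:00.0."""
    match = re.fullmatch(
        r"([0-9A-Fa-f]{4,8}):([0-9A-Fa-f]{2}):([0-9A-Fa-f]{2})\.([0-7])",
        bus_id,
    )
    if match is None:
        raise ValueError(f"unrecognised PCI bus id: {bus_id!r}")
    domain, bus, device, function = match.groups()
    # nvidia-smi prints an 8-digit domain; sysfs uses the last 4 digits.
    address = f"{domain[-4:]}:{bus}:{device}.{function}".lower()
    return f"/sys/bus/pci/devices/{address}"


def detach_plan(bus_id: str) -> List[str]:
    """Return the shell commands for a manual detach as strings.
    Nothing is executed here; review them and run them yourself as root."""
    plan = [f"modprobe -r {module}" for module in NVIDIA_MODULES]
    plan.append(f"echo 1 > {sysfs_device_path(bus_id)}/remove")
    return plan


if __name__ == "__main__":
    # Bus-Id taken from the nvidia-smi output posted above.
    for command in detach_plan("00000000:0A:00.0"):
        print(command)
```

If modprobe -r reports that a module is in use, something is still holding the GPU and the device should not be unplugged yet.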

Mid-2012 13" Macbook Pro (MacBookPro9,2) TB1 -> RX 460/560 (AKiTiO Node/Thunder2)
+ macOS 10.15+Win10 + Linux Mint 19.1

 
2012 13" MacBook Pro [3rd,2C,M] + RX 460 @ 10Gbps-TB1 (AKiTiO Thunder2) + macOS 10.14.4 [build link]  


Thomas Capelle
(@thomas_capelle)
New Member
Joined: 5 months ago
 

@nu_ninja

Hey, I also have an RTX 2070 Super in a Razer Core X for training NNs, connected to my XPS 15 on Ubuntu 19.04.

It worked straight out of the box, but I need to boot with the eGPU plugged in.

I am also seeing some bottleneck on the Thunderbolt 3 interface when training large models: GPU usage never reaches 100% and stays around 85%.
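For anyone who wants to check a figure like this on their own setup, here is a minimal sketch that samples GPU utilization through nvidia-smi's query interface and averages it. The function names and the 10-sample, 1-second defaults are arbitrary choices for illustration, not anything from the thread. Keep in mind that a reading below 100% does not by itself show that Thunderbolt is the limit; a slow input pipeline could produce the same number.

```python
import subprocess
import time
from typing import List

# Ask nvidia-smi for GPU utilization only, as bare numbers (one line per GPU).
QUERY_COMMAND = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(output: str) -> List[int]:
    """Parse one integer percentage per line, skipping blank lines and
    non-numeric entries such as '[N/A]'."""
    readings = []
    for line in output.splitlines():
        value = line.strip()
        if value.isdigit():
            readings.append(int(value))
    return readings


def average(values: List[int]) -> float:
    """Arithmetic mean; returns 0.0 when there are no readings."""
    return sum(values) / len(values) if values else 0.0


def sample_utilization(samples: int = 10, interval_s: float = 1.0) -> List[int]:
    """Poll nvidia-smi `samples` times, waiting `interval_s` between polls."""
    readings: List[int] = []
    for _ in range(samples):
        result = subprocess.run(
            QUERY_COMMAND, capture_output=True, text=True, check=True
        )
        readings.extend(parse_utilization(result.stdout))
        time.sleep(interval_s)
    return readings


if __name__ == "__main__":
    # Run this while a training job is active.
    print(f"mean GPU utilization: {average(sample_utilization()):.1f}%")
```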

I am very happy with the setup, by the way.

I have not tried connecting a display.

 

Nikolay Makhotkin
(@nikolay_makhotkin)
New Member
Joined: 5 months ago
 

UPD: Solved the problem with a fresh installation of Ubuntu 20.04. I guess the problem was related to tlp/power settings or the kernel version. I am now on kernel 5.4, by the way.
