Unable to diagnose reason for bad performance with Ubuntu AMD eGPU setup
First post here - didn't see a thing anywhere that said "read this before you post", so I hope this is the right place, please link me to posting rules if I missed it!
So, getting right down to it, here's my setup:
- Dell Latitude 5420
- CPU: i7-1185G7
- RAM: 16GB
- SSD: External SSD plugged into a USB-C port (Sabrent USB-C NVMe enclosure with a Sabrent Rocket NVMe SSD in it)
- eGPU: Razer Core X Chroma
- GPU: AMD Radeon RX 5700XT (Gigabyte, if that matters)
- OS: Ubuntu 20.04.3 LTS
- Displays: 2 x 1920x1080 @ 60 Hz connected to eGPU, 1 x 2560x1440 @ 144 Hz connected to eGPU, 1 x 1920x1080 @ 60 Hz (built-in laptop display)
Brief explanation and backstory for my setup:
TL;DR: Work gave me a laptop that was more powerful than my 8-year-old gaming desktop, and that ushered me along into my transition to maining Linux on my personal PC.
I had a gaming desktop PC that is coming up to 8 years old now (i7-4770K, 12GB DDR3). During the pandemic, my work provided all of us with laptops, and I managed to blag one of the beefier ones, even though we predominantly work in hosted desktop. I was astounded at how much snappier this laptop felt compared to my desktop, so I ran some CPU benchmarks and was absolutely blown away that a 28W TDP laptop processor beat my >84W TDP desktop CPU (mild OC) in every single test.
I've also been wanting to transition to mainly using Linux, to get rid of the sheer amount of garbage Windows seems to come with these days, so I bit the bullet. The laptop, belonging to my work, is enrolled in Intune and Autopilot, meaning it's basically locked to being a Windows machine, and the main SSD is BitLocker encrypted.
My hand was a little forced here: I wanted to install Arch Linux as I really like its flexibility, but I couldn't work out how to create boot media that worked with Secure Boot. Given how the rest of the laptop is set up, running Arch with Secure Boot disabled would mean going into the BIOS and re-enabling it every time I wanted to boot back into Windows, which I didn't want to do. So I went for Ubuntu, and also found out that Ubuntu's eGPU support was really decent, so thought I'd give it a go.
My main issue is that the performance I'm getting out of the eGPU is far below expectations, and I can't help but feel it's related to some of the other issues I've been experiencing. My personal hunch (after having read something similar elsewhere) is that the Thunderbolt bus on my laptop is saturated. Rather than explain every issue in full, I've bullet-pointed them below, and if anyone would like to know more, I can provide more info. Happy to try pretty much anything!
- eGPU performance is bad
  - Much larger drops in performance than expected from using an eGPU
  - Unigine Heaven Benchmark score for my setup (using the default Extreme preset, 1600x900 windowed):
    - Time: 260.682
    - Frames: 12428
    - FPS: 47.675
    - Min FPS: 24.0568
    - Max FPS: 108.607
    - Score: 1200.93 (This should be closer to 2000)
- eGPU performance fluctuates wildly
  - Sometimes I'll run a game and it'll run great (high framerate, no stutter, infrequent drops), and other times I'll run the same game and it will crawl
- My primary display flickers like crazy
  - This could just be an issue with my display - open to all suggestions. It's an Asus TUF VG32V
- Video playback is poor
  - If I try to play 4K video in Firefox from YouTube, it stutters awfully. 1080p is okay but still noticeably worse than I would expect
- Something is wrong with the amdgpu package, but I can't seem to remove or update it
  - Installing packages from apt gives me this:
- I experience strange behaviour with USB devices connected to the eGPU
  - The Razer Core X Chroma has a 1GbE port and a 4-port USB 3.1 hub built into it
  - USB audio devices get choppy, and plugging another hub into it makes the issue even worse, if the devices are even recognised
  - The Ethernet connection is mostly stable, but frequently spikes in latency (which doesn't happen when I use the Ethernet port built into the laptop)
My guess based on this behaviour is that I'm saturating the Thunderbolt connection, but from what I can tell, there's no way to monitor the "usage" of the connection, so I have no idea what to try.
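One thing that can at least be inspected is the PCIe link the GPU negotiated, which Linux exposes through sysfs. The sketch below is only a starting point, not a bandwidth monitor: it assumes the standard `current_link_speed`, `current_link_width`, `max_link_speed` and `max_link_width` attributes are present under `/sys/bus/pci/devices`, and the function names are made up for illustration. For a GPU behind Thunderbolt the reported values may not reflect what the tunnel can really carry, so treat them as a hint.

```python
from pathlib import Path
from typing import Dict, List


def read_attr(device: Path, name: str) -> str:
    """Return a sysfs attribute as text, or "unknown" if it cannot be read."""
    try:
        return (device / name).read_text().strip()
    except OSError:
        return "unknown"


def gpu_link_status(sysfs_root: str = "/sys/bus/pci/devices") -> List[Dict[str, str]]:
    """List negotiated vs. maximum PCIe link settings for display-class devices."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    results = []
    for device in sorted(root.iterdir()):
        # PCI class codes beginning with 0x03 are display controllers (GPUs).
        if not read_attr(device, "class").startswith("0x03"):
            continue
        results.append({
            "address": device.name,
            "current_speed": read_attr(device, "current_link_speed"),
            "current_width": read_attr(device, "current_link_width"),
            "max_speed": read_attr(device, "max_link_speed"),
            "max_width": read_attr(device, "max_link_width"),
        })
    return results


if __name__ == "__main__":
    for gpu in gpu_link_status():
        print(gpu)
```

If the current values are lower than the maximum ones for the eGPU entry, that would be worth mentioning in any follow-up, although it would not prove saturation by itself.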
If anyone is willing to help me out, has any suggestions or would like to know any more information, or even just wants to share their two cents, I am all ears!
Thanks for taking the time
Update: Closing the lid improves things dramatically. I just ran Unigine again with the lid closed, and got this score:
- Time: 260.667
- Frames: 16094
- FPS: 61.7416
- Min FPS: 19.5901
- Max FPS: 169.236
- Score: 1555.27
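To put the two runs side by side, here is a small worked calculation using only the numbers posted above. It is just arithmetic on the reported frames, times and scores; the variable names are made up for illustration, and the 2000 target is the rough expectation mentioned earlier, not a measured value.

```python
def average_fps(frames: int, seconds: float) -> float:
    """Average frame rate over a benchmark run."""
    return frames / seconds


# Figures copied from the two Unigine Heaven runs in this post.
lid_open = {"frames": 12428, "time": 260.682, "score": 1200.93}
lid_closed = {"frames": 16094, "time": 260.667, "score": 1555.27}
EXPECTED_SCORE = 2000  # rough target from the post, not a measurement

fps_open = average_fps(lid_open["frames"], lid_open["time"])
fps_closed = average_fps(lid_closed["frames"], lid_closed["time"])

# Relative improvement from closing the lid.
gain = fps_closed / fps_open - 1

# Score divided by FPS; if this matches across runs, the score tracks average FPS.
ratio_open = lid_open["score"] / fps_open
ratio_closed = lid_closed["score"] / fps_closed

print(f"lid open:   {fps_open:.2f} FPS, {lid_open['score'] / EXPECTED_SCORE:.0%} of expected score")
print(f"lid closed: {fps_closed:.2f} FPS, {lid_closed['score'] / EXPECTED_SCORE:.0%} of expected score")
print(f"gain from closing the lid: {gain:.1%}")
print(f"score per FPS: {ratio_open:.2f} (open) vs {ratio_closed:.2f} (closed)")
```

The score-per-FPS ratio comes out the same for both runs, so the score is simply following the average frame rate. Closing the lid recovers a good chunk of it but still leaves the result short of the expected figure.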
Video playback is also smooth the majority of the time. However, performance still fluctuates: when I launch a game it'll run really well for about 2 minutes, then slow right down to about a quarter of the speed. It doesn't appear to be a thermal throttling issue, as it stays well within operating temperatures (under 80 °C).
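One way to check whether the slowdown lines up with the GPU dropping to a lower clock state is to log the active shader clock while a game runs. The amdgpu driver lists its clock levels in a `pp_dpm_sclk` file under the card's sysfs directory, with the active level marked by `*`. The sketch below parses that format; the sample text in the usage note is made up and is not output from my machine, and the `card0` path is an assumption (the eGPU may well be `card1`).

```python
import re
import time
from pathlib import Path
from typing import List, Optional, Tuple

# Matches lines such as "1: 1200Mhz *" (the asterisk marks the active level).
LEVEL_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*Mhz\s*(\*?)\s*$", re.IGNORECASE)


def parse_dpm_levels(text: str) -> List[Tuple[int, int, bool]]:
    """Parse pp_dpm_sclk-style text into (level, MHz, is_active) tuples."""
    levels = []
    for line in text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            levels.append((int(match.group(1)), int(match.group(2)), match.group(3) == "*"))
    return levels


def active_clock_mhz(text: str) -> Optional[int]:
    """Return the clock of the level marked active, or None if none is marked."""
    for _, mhz, active in parse_dpm_levels(text):
        if active:
            return mhz
    return None


def log_clock(path: str = "/sys/class/drm/card0/device/pp_dpm_sclk",
              samples: int = 5, interval: float = 1.0) -> None:
    """Print the active shader clock a few times; the card0 path is an assumption."""
    for _ in range(samples):
        print(active_clock_mhz(Path(path).read_text()))
        time.sleep(interval)
```

For example, `active_clock_mhz("0: 800Mhz\n1: 1200Mhz *\n2: 2100Mhz\n")` picks out the middle level. Calling `log_clock(samples=300)` in a terminal while the game runs would show whether the clock falls at the two-minute mark.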
It is expected that a windowed game will run better with the internal display disabled or the lid closed. Exclusive fullscreen games on the eGPU display should perform the same regardless.
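As a rough illustration of why the internal display matters, here is a back-of-the-envelope estimate of what it costs to send finished frames back to the laptop panel. It assumes uncompressed 32-bit frames copied over the Thunderbolt link at the panel's full refresh rate, which is a simplification, and the function name is made up for the example.

```python
def frame_copy_gbps(width: int, height: int, refresh_hz: int, bytes_per_pixel: int = 4) -> float:
    """Estimate the bandwidth (decimal Gbps) needed to copy every frame uncompressed."""
    bytes_per_second = width * height * bytes_per_pixel * refresh_hz
    return bytes_per_second * 8 / 1e9


# Built-in laptop panel from the original post: 1920x1080 at 60 Hz.
internal_panel = frame_copy_gbps(1920, 1080, 60)
print(f"{internal_panel:.2f} Gbps")
```

Under those assumptions the copy-back traffic for the built-in panel alone is a few Gbps, on a link that is also carrying the GPU's normal traffic. Displays plugged directly into the eGPU don't need that return trip.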
I've had the stuttery video issue in Firefox too, and I fixed it with one of the settings in about:config. I think it was gfx.webrender.all?
The amdgpu package error looks like you may have tried to install the pro driver package from amd.com. If so, I would recommend removing it with the amdgpu-pro-uninstall command.
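Before uninstalling, it may help to see which amdgpu-related packages dpkg knows about and whether any are stuck half-installed. This is a minimal sketch, assuming `dpkg-query` is available and using its `${binary:Package}` and `${db:Status-Abbrev}` format fields. The helper names are hypothetical, and the package names in the usage note are illustrative rather than taken from the error in this thread.

```python
import subprocess
from typing import List, Tuple


def find_amdgpu_packages(listing: str) -> List[Tuple[str, str]]:
    """Return (package, status) pairs for packages whose name contains 'amdgpu'."""
    hits = []
    for line in listing.splitlines():
        name, sep, status = line.partition("\t")
        if sep and "amdgpu" in name:
            hits.append((name, status.strip()))
    return hits


def not_cleanly_installed(hits: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only packages whose status is not 'ii' (wanted and fully installed)."""
    return [(name, status) for name, status in hits if status != "ii"]


def query_dpkg() -> str:
    """Ask dpkg-query for every known package as 'name<TAB>status' lines."""
    result = subprocess.run(
        ["dpkg-query", "-W", "--showformat=${binary:Package}\t${db:Status-Abbrev}\n"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Running `not_cleanly_installed(find_amdgpu_packages(query_dpkg()))` would list anything in a state such as `iU` (unpacked) or `iF` (half-configured). With a made-up listing like `"amdgpu-dkms\tiF \nfirefox\tii \n"`, only the amdgpu-dkms line is reported.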
It's expected that the USB ports on the Core X might be finicky or drop out.
You say you're plugging the SSD into the second USB-C/TB4 port? This would mean the eGPU and SSD are sharing bandwidth, because both go to one controller. I would plug it into one of the USB-A ports on the other side. Yes, it's a little slower, but I don't think it will share bandwidth with the eGPU then.
I've seen that you post quite a lot on here, so thank you for taking the time to respond to me as well!
That's right, I'm using Xorg with egpu-switcher, sorry, I should have specified. I've also given Wayland a go and found that it seems to work almost as well, if not better.
The fullscreen applications I'm running are mostly games on Steam using the Proton compatibility tool, and that's a very YMMV way of running games in and of itself. I also don't know if it's just the way some games are, but sometimes the "fullscreen" option seems to be closer to a "borderless windowed" mode, so I suppose that could be contributing.
Firefox seems to be okay for the moment, but I'll give it a try if I notice it again - thanks for the tip!
I think you're right about me trying to install amdgpu-pro from amd.com, that does ring a bell - I'll try running that uninstall command and hopefully it won't break anything 😉
Shame about the ports on the Razer Core being finicky - do you know what causes it at all? Something to do with the USB polling using up time on the Thunderbolt interface, or is it more of an issue with the controller built into the Razer Core itself?
You're right about the SSD, and I had a sneaking suspicion such a thing could be happening. I'm not sure what the internal topology of the laptop is in terms of the ports and their controllers, but it would make sense that the pair of USB-C / TB4 ports are connected to the same controller, and as such having the SSD connected there is probably also eating away at the bandwidth. The two USB-A ports on the other side are supposedly USB 3.1 and support 10 Gbps. I'm waiting for a USB 3.1 A-to-C cable in the post and I'll try it from that side, and then I'll invest in a larger USB hub on the other port for all of my other peripherals, so that I can take them out of the Core. It seems we are still a little while away from the 'one plug dream' if we have lots of peripherals, but no matter! I'm more interested in getting it working than having it use a single cable. I only move it once a week as it is. Hopefully the ports will hold up over the next several years!
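Once the enclosure is moved, the negotiated speed can be checked rather than assumed. Linux exposes a `speed` attribute (in Mbps) for each USB device under `/sys/bus/usb/devices`, so a 10 Gbps link should show up as 10000 and a 5 Gbps one as 5000. The sketch below is a minimal reader for that, with a made-up function name. Entries without a `speed` file (interfaces, for example) are skipped.

```python
from pathlib import Path
from typing import List, Tuple


def usb_speeds(sysfs_root: str = "/sys/bus/usb/devices") -> List[Tuple[str, str, float]]:
    """Return (bus address, product name, speed in Mbps) for each USB device."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    devices = []
    for entry in sorted(root.iterdir()):
        try:
            speed = float((entry / "speed").read_text().strip())
        except (OSError, ValueError):
            continue  # interface entries have no usable speed attribute
        try:
            product = (entry / "product").read_text().strip()
        except (OSError, ValueError):
            product = "unknown"
        devices.append((entry.name, product, speed))
    return devices


if __name__ == "__main__":
    for address, product, speed in usb_speeds():
        print(f"{address:10} {speed:>8.0f} Mbps  {product}")
```

If the SSD enclosure reports 5000 on the USB-A side, the port or the cable has only negotiated 5 Gbps, which would be worth knowing before blaming anything else.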
I'll give all this a go and let you know how I get on.