Vega 56 Under Kali Linux: Fatal error during GPU init
 


i0ntempest
(@i0ntempest)
Trusted Member
Joined: 2 years ago
 

Hi all,
Yesterday I installed Kali Linux 2019.2 (kernel 4.19.0-kali5-amd64) on my 2017 iMac, hoping to get my Vega 56 (in an AKiTiO Node) running on it for GPU-based password cracking. The necessary driver packages (xserver-xorg-video-amdgpu and firmware) are preinstalled and can drive the internal Radeon Pro 560, but not the external Vega 56. Here's part of the dmesg output:

[    5.856887] ATOM BIOS: 115-D050PIL-100

[    5.856906] [drm] GPU posting now...

[    5.892023] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit

[    5.892035] amdgpu 0000:0d:00.0: VRAM: 8176M 0x000000F400000000 - 0x000000F5FEFFFFFF (8176M used)

[    5.892036] amdgpu 0000:0d:00.0: GART: 512M 0x000000F600000000 - 0x000000F61FFFFFFF

[    5.892044] ------------[ cut here ]------------

[    5.892045] reserve_memtype failed: [mem 0x00000000-0xffffffffffffffff], req write-combining

[    5.892050] WARNING: CPU: 4 PID: 92 at arch/x86/mm/pat.c:556 reserve_memtype+0x212/0x2c0

[    5.892050] Modules linked in: hid_generic usbhid hid sd_mod uas usb_storage crct10dif_pclmul amdkfd crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc amdgpu chash gpu_sched sdhci_pci i2c_algo_bit cqhci ahci sdhci libahci aesni_intel ttm tg3 aes_x86_64 xhci_pci libata nvme drm_kms_helper crypto_simd mmc_core xhci_hcd cryptd glue_helper i2c_i801 libphy scsi_mod usbcore nvme_core usb_common thunderbolt drm video button

[    5.892064] CPU: 4 PID: 92 Comm: irq/28-pciehp Tainted: G        W         4.19.0-kali5-amd64 #1 Debian 4.19.37-6kali1

[    5.892065] Hardware name: Apple Inc. iMac18,2/Mac-77F17D7DA9285301, BIOS 175.0.0.0.0 06/17/2019

[    5.892066] RIP: 0010:reserve_memtype+0x212/0x2c0

[    5.892066] Code: 23 97 41 83 fe 05 77 08 4e 8b 04 f5 c0 82 00 97 48 8d 4b ff 48 89 ea 48 c7 c6 f0 82 00 97 48 c7 c7 28 0c 23 97 e8 88 21 01 00 <0f> 0b 41 bd ea ff ff ff e9 ef fe ff ff 41 bd ea ff ff ff e9 e4 fe

[    5.892067] RSP: 0018:ffffa8620418b9f0 EFLAGS: 00010286

[    5.892068] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff9744dda8

[    5.892068] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000247

[    5.892068] RBP: 0000000000000000 R08: 0000000000000558 R09: 0000000000000004

[    5.892069] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001

[    5.892069] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000

[    5.892070] FS:  0000000000000000(0000) GS:ffff8aaaceb00000(0000) knlGS:0000000000000000

[    5.892070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

[    5.892070] CR2: 000055e35755d530 CR3: 000000074ae0a006 CR4: 00000000003606e0

[    5.892071] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

[    5.892071] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

[    5.892072] Call Trace:

[    5.892074]  io_reserve_memtype+0x58/0x120

[    5.892076]  ? _dev_info+0x6c/0x90

[    5.892077]  arch_io_reserve_memtype_wc+0x2e/0x50

[    5.892106]  amdgpu_bo_init+0x1c/0x80 [amdgpu]

[    5.892130]  gmc_v9_0_sw_init+0x2ed/0x4f0 [amdgpu]

[    5.892156]  amdgpu_device_init.cold.28+0xd7b/0x129e [amdgpu]

[    5.892174]  amdgpu_driver_load_kms+0x86/0x2d0 [amdgpu]

[    5.892183]  drm_dev_register+0x109/0x140 [drm]

[    5.892200]  amdgpu_pci_probe+0x1aa/0x230 [amdgpu]

[    5.892202]  local_pci_probe+0x41/0x90

[    5.892203]  pci_device_probe+0x189/0x1a0

[    5.892205]  really_probe+0x235/0x3a0

[    5.892206]  ? __driver_attach+0x110/0x110

[    5.892206]  driver_probe_device+0xb3/0xf0

[    5.892207]  ? __driver_attach+0x110/0x110

[    5.892208]  bus_for_each_drv+0x76/0xc0

[    5.892209]  __device_attach+0xd9/0x150

[    5.892211]  pci_bus_add_device+0x4a/0x90

[    5.892212]  pci_bus_add_devices+0x2c/0x64

[    5.892212]  pci_bus_add_devices+0x57/0x64

[    5.892213]  pci_bus_add_devices+0x57/0x64

[    5.892214]  pci_bus_add_devices+0x57/0x64

[    5.892215]  pci_bus_add_devices+0x57/0x64

[    5.892216]  pciehp_configure_device+0x93/0x130

[    5.892218]  pciehp_handle_presence_or_link_change+0x364/0x4a0

[    5.892218]  pciehp_ist+0x1b3/0x1c0

[    5.892220]  ? irq_finalize_oneshot.part.43+0x100/0x100

[    5.892220]  irq_thread_fn+0x1f/0x60

[    5.892221]  irq_thread+0xe7/0x170

[    5.892222]  ? irq_forced_thread_fn+0x70/0x70

[    5.892223]  ? irq_thread_check_affinity+0xd0/0xd0

[    5.892224]  kthread+0x112/0x130

[    5.892224]  ? kthread_bind+0x30/0x30

[    5.892226]  ret_from_fork+0x35/0x40

[    5.892227] ---[ end trace cf54c88b3f9c8a6b ]---

[    5.892228] [drm] Detected VRAM RAM=8176M, BAR=0M

[    5.892229] [drm] RAM width 2048bits HBM

[    5.892236] [drm] amdgpu: 8176M of VRAM memory ready

[    5.892237] [drm] amdgpu: 8176M of GTT memory ready.

[    5.892241] [drm] GART: num cpu pages 131072, num gpu pages 131072

[    5.892245] amdgpu 0000:0d:00.0: (-22) kernel bo map failed

[    5.892286] [drm:amdgpu_device_init.cold.28 [amdgpu]] *ERROR* amdgpu_vram_scratch_init failed -22

[    5.892302] amdgpu 0000:0d:00.0: amdgpu_device_ip_init failed

[    5.892313] amdgpu 0000:0d:00.0: Fatal error during GPU init

[    5.892324] [drm] amdgpu: finishing device.

[    5.892501] [drm] amdgpu: ttm finalized

[    5.892503] x86/PAT: irq/28-pciehp:92 freeing invalid memtype [mem 0x00000000-0xffffffffffffffff]

[    5.892770] amdgpu: probe of 0000:0d:00.0 failed with error -22

So it looks like the Vega can't be initialized for some reason. I googled the error message but didn't find anything useful, and I'm not very familiar with Linux. Please help!
The complete dmesg output is in the attachment.
Thanks!
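
For anyone sifting through a long dmesg dump like this one, a small filter can help surface the lines that matter. The following is a minimal sketch, not a definitive tool: the function name `find_failures` and the pattern list are assumptions made for illustration, and the sample lines are copied from the excerpt above.

```python
import re

# Substrings that tend to mark trouble in kernel logs. This list is an
# illustrative assumption based on the excerpt above, not an exhaustive set.
FAILURE_PATTERNS = [
    re.compile(r"\*ERROR\*"),
    re.compile(r"\bWARNING:"),
    re.compile(r"\bfailed\b", re.IGNORECASE),
    re.compile(r"Fatal error"),
]

# dmesg lines look like "[    5.892313] message text"
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")


def find_failures(dmesg_text):
    """Return (timestamp, message) pairs for lines that look like failures."""
    hits = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything without a timestamp
        msg = m.group("msg")
        if any(p.search(msg) for p in FAILURE_PATTERNS):
            hits.append((float(m.group("ts")), msg))
    return hits


# A few lines copied from the log excerpt in this post.
SAMPLE = """\
[    5.856906] [drm] GPU posting now...
[    5.892045] reserve_memtype failed: [mem 0x00000000-0xffffffffffffffff], req write-combining
[    5.892228] [drm] Detected VRAM RAM=8176M, BAR=0M
[    5.892245] amdgpu 0000:0d:00.0: (-22) kernel bo map failed
[    5.892313] amdgpu 0000:0d:00.0: Fatal error during GPU init
"""

if __name__ == "__main__":
    for ts, msg in find_failures(SAMPLE):
        print(f"{ts:.6f}  {msg}")
```

To run it against a real log, the saved dmesg output can be read from a file and passed to `find_failures` in place of `SAMPLE`.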


Setup 1: Apple iMac 2017 21.5” 4K + eGPU
dGPU: AMD Radeon Pro 560
eGPU: ASUS Strix AMD Radeon Vega 56 via Thunderbolt 3 (AKiTiO Node)
OS: macOS Mojave 10.14.6, Windows 10 1809, Kali Linux 2019.2
Setup 2: Apple Mac mini 2018 + eGPU
iGPU: Intel UHD Graphics 630
eGPU: AMD Radeon RX 570 MXM via Thunderbolt 3 (Sonnet Breakaway Puck)
OS: macOS Catalina 10.15 Beta, Windows 10 1903


nu_ninja
(@nu_ninja)
Estimable Member
Joined: 1 year ago
 

The line:

[    4.150096] [drm:amdgpu_get_bios [amdgpu]] *ERROR* ACPI VFCT table present but broken (too short #2)

makes me think it's an ACPI table issue, so I'd first try the kernel parameter pci=nocrs and see what that does. Keep in mind what the kernel documentation says about this parameter: "If you need to use this, please report a bug."
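
On Debian-based systems such as Kali, persistent kernel parameters are usually set through the `GRUB_CMDLINE_LINUX_DEFAULT` line in `/etc/default/grub`. The sketch below shows one way to add `pci=nocrs` to that line. It assumes the install boots through GRUB; the helper name `add_kernel_param` and the sample file contents are made up for illustration. It only transforms text and does not touch any file.

```python
import re


def add_kernel_param(grub_text, param):
    """Return grub_text with `param` present in GRUB_CMDLINE_LINUX_DEFAULT.

    Hypothetical helper: operates on the text of /etc/default/grub only.
    """
    # Match the whole assignment line; [ \t]* avoids swallowing the newline.
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"[ \t]*$', re.MULTILINE
    )

    def repl(match):
        params = match.group(2).split()
        if param not in params:  # keep the edit idempotent
            params.append(param)
        return match.group(1) + '"' + " ".join(params) + '"'

    new_text, count = pattern.subn(repl, grub_text)
    if count == 0:
        # No such line yet: append one at the end of the file text.
        sep = "" if (not grub_text or grub_text.endswith("\n")) else "\n"
        new_text = grub_text + sep + 'GRUB_CMDLINE_LINUX_DEFAULT="' + param + '"\n'
    return new_text


# Example file contents (an assumption, not taken from the poster's system).
EXAMPLE = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet"\nGRUB_TIMEOUT=5\n'

if __name__ == "__main__":
    print(add_kernel_param(EXAMPLE, "pci=nocrs"))
```

After the real file is edited, running `update-grub` as root and rebooting should apply the change, and `/proc/cmdline` shows whether the parameter took effect. For a one-off test, the parameter can instead be appended to the `linux` line from the GRUB menu editor.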

Mid-2012 13" Macbook Pro (MacBookPro9,2) TB1 -> RX 460/560 (AKiTiO Node/Thunder2)
+ macOS 10.14+Win10
+ Linux Mint 19.1


i0ntempest
(@i0ntempest)
Trusted Member
Joined: 2 years ago
 

@nu_ninja

Thanks for the tip, I'll try it when I have time. Also, sorry for not responding sooner.

i0ntempest
(@i0ntempest)
Trusted Member
Joined: 2 years ago
 

@nu_ninja

I think I just broke something. I booted into Kali, ran a dist-upgrade and rebooted; now it complains "amdgpu requires firmware installed" and the gdm login screen no longer shows up. It won't even drive the internal Radeon Pro 560 now. I do have firmware-linux* installed.
I took a look at dpkg.log and found that the Mesa-related packages were upgraded from 19.1.2-1 to 19.1.4-1. I suspect there's something wrong with the new packages.
Is there anything you'd suggest I try?
Thanks!
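
The dpkg.log check described above can be scripted. This is a rough sketch under stated assumptions: it relies on the usual dpkg log line layout (date, time, action, package, old version, new version), the function name `mesa_upgrades` is hypothetical, and the sample lines are made up apart from the two Mesa version numbers quoted in this post.

```python
def mesa_upgrades(log_text, needle="mesa"):
    """Return (package, old_version, new_version) for matching upgrade lines."""
    upgrades = []
    for line in log_text.splitlines():
        fields = line.split()
        # Expected layout: date, time, action, package, old version, new version
        if len(fields) == 6 and fields[2] == "upgrade" and needle in fields[3]:
            upgrades.append((fields[3], fields[4], fields[5]))
    return upgrades


# Illustrative log lines: timestamps and the non-Mesa package are invented.
SAMPLE = """\
2019-08-10 09:15:01 upgrade libgl1-mesa-dri:amd64 19.1.2-1 19.1.4-1
2019-08-10 09:15:02 status installed libgl1-mesa-dri:amd64 19.1.4-1
2019-08-10 09:15:03 upgrade mesa-vulkan-drivers:amd64 19.1.2-1 19.1.4-1
2019-08-10 09:15:04 upgrade examplepkg:amd64 1.0-1 1.1-1
"""

if __name__ == "__main__":
    for pkg, old, new in mesa_upgrades(SAMPLE):
        print(f"{pkg}: {old} -> {new}")
```

On a real system the input would be the contents of `/var/log/dpkg.log`. Passing a different `needle`, for example `"firmware"`, lists other packages touched by the same upgrade. A listed package can sometimes be rolled back with `apt install <package>=<old version>`, but only if that version is still available from the repository or the local package cache.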

nu_ninja
(@nu_ninja)
Estimable Member
Joined: 1 year ago
 

@i0ntempest
I don't use Kali, but it's certainly possible that the newest Mesa packages are causing problems. Honestly, at this point I would boot into a non-graphical mode, grab any important files, and reinstall the whole distro. You could try rolling back just that one package, but personally I'd prefer to start fresh.

i0ntempest
(@i0ntempest)
Trusted Member
Joined: 2 years ago
 

@nu_ninja

OK, thanks, it seems that's the only way now. I might wait for another upgrade, and if it's still broken after that I'll reinstall the whole OS.
