Bugzilla – Bug 1219517
Kernel 6.7.2 AMD GPU random system freeze or output no video
Last modified: 2024-02-28 15:53:25 UTC
Recently, I experienced two problems:
1. The system freezes at random. The only way out is a forced shutdown by pressing the power button.
2. The system cannot boot. There is no video signal at all: no UEFI logo and no GRUB menu.

I am using an AMD GPU. Some Arch Linux users have reported similar issues: https://bbs.archlinux.org/viewtopic.php?id=292442

Operating System: openSUSE Tumbleweed 20240131
KDE Plasma Version: 5.27.10
KDE Frameworks Version: 5.114.0
Qt Version: 5.15.12
Kernel Version: 6.7.2-1-default (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 5600X 6-Core Processor
Memory: 31.3 GiB of RAM
Graphics Processor: AMD Radeon Graphics RX 6700
Manufacturer: Micro-Star International Co., Ltd.
Product Name: MS-7C94
System Version: 1.0
Related post on Reddit: https://www.reddit.com/r/openSUSE/comments/1ahi5fn/kernel_672_fan_woes/
It looks increasingly likely that the issue is related to kernel 6.7.2.
Please check the behavior with 6.7.3 or later kernel in OBS Kernel:stable repo
http://download.opensuse.org/repositories/Kernel:/stable/standard/

If the problem persists, verify the latest 6.8-rc kernel in OBS Kernel:HEAD
http://download.opensuse.org/repositories/Kernel:/HEAD/standard/

If it's still seen in 6.8-rc, report to the upstream devs at gitlab.freedesktop.org issues.
(In reply to Takashi Iwai from comment #2)
> Please check the behavior with 6.7.3 or later kernel in OBS Kernel:stable
> repo
> http://download.opensuse.org/repositories/Kernel:/stable/standard/
>
> If the problem persists, verify the latest 6.8-rc kernel in OBS Kernel:HEAD
> http://download.opensuse.org/repositories/Kernel:/HEAD/standard/
>
> If it's still seen in 6.8-rc, report to the upstream devs at
> gitlab.freedesktop.org issues.

I tried all of these kernel versions but still get random system freezes. Today I finally captured the logs from a freeze (I was playing YouTube in Firefox; nothing else was running):

Feb 20 19:00:12 localhost kernel: BTRFS warning (device nvme0n1p2): checksum verify failed on logical 1758335664128 mirror 1 wanted 0x2e8ebbf4 found 0x2ed755e2 level 0
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 1758335664128 (dev /dev/nvme0n1p2 sector 883506304)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 1758335668224 (dev /dev/nvme0n1p2 sector 883506312)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 1758335672320 (dev /dev/nvme0n1p2 sector 883506320)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 1758335676416 (dev /dev/nvme0n1p2 sector 883506328)
Feb 20 19:00:12 localhost kernel: BTRFS warning (device nvme0n1p2): checksum verify failed on logical 491194335232 mirror 1 wanted 0x9cfbbb3e found 0xb0a41980 level 0
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 491194335232 (dev /dev/nvme0n1p2 sector 42204000)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 491194339328 (dev /dev/nvme0n1p2 sector 42204008)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 491194343424 (dev /dev/nvme0n1p2 sector 42204016)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 491194347520 (dev /dev/nvme0n1p2 sector 42204024)
Feb 20 19:00:12 localhost kernel: BTRFS warning (device nvme0n1p2): checksum verify failed on logical 781902266368 mirror 1 wanted 0x6ebf7f50 found 0xafea776b level 0
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 781902266368 (dev /dev/nvme0n1p2 sector 491765984)
Feb 20 19:00:12 localhost kernel: BTRFS info (device nvme0n1p2): read error corrected: ino 0 off 781902270464 (dev /dev/nvme0n1p2 sector 491765992)
Feb 20 19:00:12 localhost kernel: BTRFS warning (device nvme0n1p2): checksum verify failed on logical 1756486762496 mirror 1 wanted 0x84dd47fe found 0xea1c76ef level 0
Feb 20 19:00:12 localhost kernel: BTRFS warning (device nvme0n1p2): checksum verify failed on logical 1756486762496 mirror 2 wanted 0x84dd47fe found 0x908fc266 level 0
Feb 20 19:00:12 localhost kernel: BTRFS error (device nvme0n1p2): qgroup scan failed with -5

I guess either my btrfs filesystem or my SSD is broken.
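For anyone triaging a similar log, here is a rough sketch of a Python helper that tallies the BTRFS messages per device and flags blocks whose checksum failed on more than one mirror. The regular expressions are an assumption based only on the message shapes in the log above, and the function name summarize_btrfs_log is made up for illustration; other kernel versions may word these messages differently.

```python
import re
import sys
from collections import Counter, defaultdict

# Patterns modelled on the messages quoted in this report (an assumption,
# not an exhaustive list of BTRFS kernel messages).
CHECKSUM_RE = re.compile(
    r"BTRFS warning \(device ([^)]+)\): checksum verify failed "
    r"on logical (\d+) mirror (\d+)"
)
CORRECTED_RE = re.compile(r"BTRFS info \(device ([^)]+)\): read error corrected")
ERROR_RE = re.compile(r"BTRFS error \(device ([^)]+)\): (.+)")


def summarize_btrfs_log(lines):
    """Count BTRFS checksum failures, corrected reads, and errors per device."""
    checksum_failures = Counter()
    corrected_reads = Counter()
    mirrors_by_block = defaultdict(set)  # (device, logical) -> mirrors that failed
    errors = []

    for line in lines:
        m = CHECKSUM_RE.search(line)
        if m:
            device, logical, mirror = m.group(1), int(m.group(2)), int(m.group(3))
            checksum_failures[device] += 1
            mirrors_by_block[(device, logical)].add(mirror)
            continue
        m = CORRECTED_RE.search(line)
        if m:
            corrected_reads[m.group(1)] += 1
            continue
        m = ERROR_RE.search(line)
        if m:
            errors.append((m.group(1), m.group(2).strip()))

    # Blocks that failed verification on more than one mirror.
    multi_mirror = sorted(
        key for key, mirrors in mirrors_by_block.items() if len(mirrors) > 1
    )
    return {
        "checksum_failures": dict(checksum_failures),
        "corrected_reads": dict(corrected_reads),
        "multi_mirror_failures": multi_mirror,
        "errors": errors,
    }


if __name__ == "__main__":
    # Reads kernel log text from standard input.
    for key, value in summarize_btrfs_log(sys.stdin).items():
        print(f"{key}: {value}")
```

Saved under a name of your choice (btrfs_summary.py here is only a placeholder), it can be fed the previous boot's kernel messages with `journalctl -k -b -1 | python3 btrfs_summary.py`. A summary like this only describes what was logged; it does not tell whether the root cause is the filesystem, the SSD, or something else.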
Yes, it smells more like a filesystem problem. Please try to repair the filesystem first.

Do you see any other traces of an amdgpu crash or the like?
(In reply to Takashi Iwai from comment #4)
> Yes, it smells more like a filesystem problem. Please try to repair the
> filesystem first.
>
> Do you see any other traces of an amdgpu crash or the like?

No. I don't think it is related to amdgpu.
This might be related: https://gitlab.freedesktop.org/drm/amd/-/issues/3132

I backported the following commit to stable:

commit 3a9626c816db901def438dc2513622e281186d39
Author: Mario Limonciello <mario.limonciello@amd.com>
Date: Wed Feb 7 23:52:55 2024 -0600

    drm/amd: Stop evicting resources on APUs in suspend
I can now confirm that the problem was caused by hardware. The CPU or DRAM was seated too loosely, which caused the system failures. Reseating the CPU and DRAM solved the issue.