Bug 1215464

Summary: kernel-firmware-amdgpu does not support Radeon GCN#1 1002:6611
Product: [openSUSE] openSUSE Tumbleweed Reporter: Felix Miata <mrmazda>
Component: Basesystem Assignee: Takashi Iwai <tiwai>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None    
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: full dmesg booting to graphical.target with drm.debug=0x06 log_buf_len=1M and radeon.si_support=0 amdgpu.si_support=1
full dmesg booting to multi-user.target with drm.debug=0x06 log_buf_len=1M

Description Felix Miata 2023-09-19 00:11:24 UTC
Original summary:
Frozen local I/O on Radeon GCN#1 1002:6611 long before init completes

Booting to full graphical default.target paints, but accepts no input.

This started more than 6 weeks ago, but I forgot about it until yesterday. Two separate TW installations on the same PC (KDE3 on one, Plasma on the other) do this. Remote login works fine with every kernel tried, from 6.2.x through 6.4.12-1.13. Mageia 9 6.4.15, Fedora 38 6.3.12, Leap 15.4 5.14.21-150400.24.81, Leap 15.5 5.14.21-150500.55.12, and Ubuntu 22.04 6.2.0-32 all work as expected.

# inxi -CS
System:
  Host: fi965 Kernel: 6.4.12-1-default arch: x86_64 bits: 64
    Console: pty pts/0 Distro: openSUSE Tumbleweed 20230915
CPU:
  Info: dual core model: Intel Core2 6700 bits: 64 type: MCP cache: L2: 4 MiB
  Speed (MHz): avg: 1596 min/max: 1596/2660 cores: 1: 1596 2: 1596
# inxi -Gxx --vs
inxi 3.3.29-00 (2023-08-15)
Graphics:
  Device-1: AMD Oland [Radeon HD 8570 / R5 430 OEM R7 240/340 Radeon 520 OEM]
    vendor: Dell driver: N/A arch: GCN-1 pcie: speed: 2.5 GT/s lanes: 8
    bus-ID: 01:00.0 chip-ID: 1002:6611
  Display: server: X.org v: 1.21.1.8 driver: X: loaded: amdgpu
    unloaded: fbdev,modesetting,vesa dri: radeonsi gpu: N/A
    display-ID: 00srv.ij.net:0
# cat /proc/cmdline
root=LABEL=sTWp10w71 ipv6.disable=1 net.ifnames=0 noresume 3
# dmesg | grep -i amdgpu
[   46.276935] [drm] amdgpu kernel modesetting enabled.
[   46.283526] amdgpu: CRAT table not found
[   46.283536] amdgpu: Virtual CRAT table created for CPU
[   46.283564] amdgpu: Topology: Add CPU node
[   46.283701] amdgpu 0000:01:00.0: amdgpu: SI support provided by radeon.
[   46.283706] amdgpu 0000:01:00.0: amdgpu: Use radeon.si_support=0 amdgpu.si_support=1 to override.

# dmesg | grep -i amdgpu
[    0.000000] Command line: root=LABEL=sTWp10w71 ipv6.disable=1 net.ifnames=0 radeon.si_support=0 amdgpu.si_support=1 noresume drm.debug=0x06 log_buf_len=1M 3
[    0.050604] Kernel command line: root=LABEL=sTWp10w71 ipv6.disable=1 net.ifnames=0 radeon.si_support=0 amdgpu.si_support=1 noresume drm.debug=0x06 log_buf_len=1M 3
[   44.790499] [drm] amdgpu kernel modesetting enabled.
[   44.790614] amdgpu: CRAT table not found
[   44.790619] amdgpu: Virtual CRAT table created for CPU
[   44.790638] amdgpu: Topology: Add CPU node
[   44.813831] amdgpu 0000:01:00.0: No more image in the PCI ROM
[   44.815179] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
[   44.815183] amdgpu: ATOM BIOS: 113-C5530300-105
[   44.815200] kfd kfd: amdgpu: OLAND  not supported in kfd
[   44.815253] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[   44.816121] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[   44.816124] amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
[   46.312377] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/oland_mc.bin failed with error -2
[   46.312393] amdgpu 0000:01:00.0: amdgpu: si_mc: Failed to load firmware "amdgpu/oland_mc.bin"
[   46.312397] amdgpu 0000:01:00.0: amdgpu: Failed to load mc firmware!

[   46.312400] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block <gmc_v6_0> failed -19

[   46.313567] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
[   46.313572] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
[   46.313577] amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.

In the latter of the two boots' dmesg excerpts above, the last line on the 80x25 screen was:
[    2.338202] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disable
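
The key line in the second excerpt is the "Direct firmware load ... failed with error -2" message; -2 is ENOENT, meaning the kernel could not find the file. A minimal sketch of a filter that pulls such lines out of a kernel log follows. The function name failed_firmware is hypothetical, and the pattern assumes the message wording shown in the excerpt above.

```shell
#!/bin/sh
# Sketch only: failed_firmware is a hypothetical helper name.
# Reads a kernel log on stdin and prints "<firmware file> <error code>"
# for each "Direct firmware load for ... failed with error ..." line.
failed_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed with error \(-*[0-9][0-9]*\).*/\1 \2/p'
}
```

Typical use would be `dmesg | failed_firmware`, which for the log above should report amdgpu/oland_mc.bin with error -2.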

# rpm -qa | egrep 'input|drm|amdgpu|xorg' | sort
kernel-firmware-amdgpu-20230829-1.1.noarch
libdrm2-2.4.116-1.1.x86_64
libdrm_amdgpu1-2.4.116-1.1.x86_64
libdrm_intel1-2.4.116-1.1.x86_64
libdrm_nouveau2-2.4.116-1.1.x86_64
libdrm_radeon1-2.4.116-1.1.x86_64
libinput-udev-1.24.0-1.1.x86_64
libinput10-1.24.0-1.1.x86_64
libva-drm2-2.19.0-1.2.x86_64
libxcb-xinput0-1.16-1.1.x86_64
xf86-video-amdgpu-23.0.0-1.3.x86_64
xinput-1.6.4-1.3.x86_64
xorg-scripts-1.0.1-10.19.noarch
xorg-x11-7.6_1-16.18.noarch
xorg-x11-essentials-7.6_1-16.18.noarch
xorg-x11-fonts-7.6-45.1.noarch
xorg-x11-fonts-core-7.6-45.1.noarch
xorg-x11-server-21.1.8-1.5.x86_64
xorg-x11-server-Xvfb-21.1.8-1.5.x86_64
xorg-x11-Xvnc-1.13.1-3.4.x86_64
Comment 1 Felix Miata 2023-09-19 00:12:52 UTC
Created attachment 869582 [details]
full dmesg booting to graphical.target with drm.debug=0x06 log_buf_len=1M and radeon.si_support=0 amdgpu.si_support=1
Comment 2 Felix Miata 2023-09-19 00:15:15 UTC
Created attachment 869583 [details]
full dmesg booting to multi-user.target with drm.debug=0x06 log_buf_len=1M
Comment 3 Felix Miata 2023-09-19 10:55:28 UTC
# ls -gG /usr/lib/firmware/amdgpu/oland_mc.bin.xz
lrwxrwxrwx 1 25 Aug 31 07:55 /usr/lib/firmware/amdgpu/oland_mc.bin.xz -> ../radeon/oland_mc.bin.xz
# ls -gG /usr/lib/firmware/radeon/oland_mc.bin.xz
ls: cannot access '/usr/lib/firmware/radeon/oland_mc.bin.xz': No such file or directory
# zypper in kernel-firmware-radeon
...
# ls -gG /usr/lib/firmware/radeon/oland_mc.bin.xz
-rw-r--r-- 1 10528 Aug 31 07:52 /usr/lib/firmware/radeon/oland_mc.bin.xz
<reboot>
<problem solved>
# grep -i radeon /var/log/zypp/history | grep remove
2022-05-18 18:38:49|remove |kernel-firmware-radeon|20220509-1.1|noarch|root@fi965|
2023-08-10 22:21:14|remove |kernel-firmware-radeon|20230707-1.1|noarch|root@fi965|
#
I was trying to free some / filesystem space in August.
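
The check above (a symlink in amdgpu/ pointing at a file that only kernel-firmware-radeon provides) can be generalized to the whole firmware tree. This is a rough sketch, not a packaged tool: the function name list_dangling_links is hypothetical, and the default directory is simply the path used in this report.

```shell
#!/bin/sh
# Sketch only: list_dangling_links is a hypothetical helper name.
# Prints every symlink under a directory whose target does not exist.
list_dangling_links() {
    dir="${1:-/usr/lib/firmware}"
    # test -e follows symlinks, so it fails when the link target is missing.
    find "$dir" -type l ! -exec test -e {} \; -print
}
```

Running `list_dangling_links /usr/lib/firmware/amdgpu` before reinstalling kernel-firmware-radeon would be expected to list oland_mc.bin.xz, and to print nothing for that link afterwards.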
Comment 4 Takashi Iwai 2023-10-09 10:46:59 UTC
Likely a side effect of fdupes.  I have dropped that operation now.

Please reopen if the problem persists with the upcoming kernel-firmware-*-20231006.