Bugzilla – Full Text Bug Listing

| Summary: | OOPS in amdgpu | | |
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Andreas Jaeger <aj> |
| Component: | Kernel | Assignee: | Takashi Iwai <tiwai> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | | |
| Priority: | P5 - None | CC: | aj, mge |
| Version: | Leap 15.5 | | |
| Target Milestone: | --- | | |
| Hardware: | Other | | |
| OS: | Other | | |
| Whiteboard: | | | |
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |

Attachments:
the two oops I could find in /var/log/messages
dmesg from home:tiwai:bsc1213578-2
dmesg from home:tiwai:bsc1213578-3

Description
Andreas Jaeger
2023-07-24 06:49:26 UTC
Created attachment 868390 [details]
the two oops I could find in /var/log/messages
# hwinfo --gfxcard
31: PCI 500.0: 0300 VGA compatible controller (VGA)
[Created at pci.386]
Unique ID: Ddhb.uZbpCsxmrO5
Parent ID: JZZT.nyyq4tDu6x8
SysFS ID: /devices/pci0000:00/0000:00:08.1/0000:05:00.0
SysFS BusID: 0000:05:00.0
Hardware Class: graphics card
Model: "ATI Picasso"
Vendor: pci 0x1002 "ATI Technologies Inc"
Device: pci 0x15d8 "Picasso"
SubVendor: pci 0x17aa "Lenovo"
SubDevice: pci 0x5127
Revision: 0xd1
Driver: "amdgpu"
Driver Modules: "amdgpu"
Memory Range: 0xc0000000-0xcfffffff (ro,non-prefetchable)
Memory Range: 0xd0000000-0xd01fffff (ro,non-prefetchable)
I/O Ports: 0x1000-0x1fff (rw)
Memory Range: 0xd0500000-0xd057ffff (rw,non-prefetchable)
IRQ: 50 (no events)
Module Alias: "pci:v00001002d000015D8sv000017AAsd00005127bc03sc00i00"
Driver Info #0:
Driver Status: amdgpu is active
Driver Activation Cmd: "modprobe amdgpu"
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #25 (PCI bridge)
Primary display adapter: #31
# hwinfo --monitor
35: None 00.0: 10002 LCD Monitor
[Created at monitor.125]
Unique ID: rdCR.mQXMLz_WQq5
Parent ID: Ddhb.uZbpCsxmrO5
Hardware Class: monitor
Model: "AUO LCD Monitor"
Vendor: AUO "AUO"
Device: eisa 0x573d
Serial ID: "0"
Resolution: 1920x1080@60Hz
Size: 309x174 mm
Year of Manufacture: 2018
Week of Manufacture: 0
Detailed Timings #0:
Resolution: 1920x1080
Horizontal: 1920 1936 1952 2080 (+16 +32 +160) -hsync
Vertical: 1080 1083 1088 1142 (+3 +8 +62) -vsync
Frequencies: 142.60 MHz, 68.56 kHz, 60.03 Hz
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #25 (VGA compatible controller)
36: None 01.0: 10002 LCD Monitor
[Created at monitor.125]
Unique ID: wkFv.zdQ3vHfjlr1
Parent ID: Ddhb.uZbpCsxmrO5
Hardware Class: monitor
Model: "DELL U2419H"
Vendor: DEL "DELL"
Device: eisa 0x4148 "DELL U2419H"
Serial ID: "5ZC7SS2"
Resolution: 720x400@70Hz
Resolution: 640x480@60Hz
Resolution: 640x480@75Hz
Resolution: 800x600@60Hz
Resolution: 800x600@75Hz
Resolution: 1024x768@60Hz
Resolution: 1024x768@75Hz
Resolution: 1280x1024@75Hz
Resolution: 1152x864@75Hz
Resolution: 1280x1024@60Hz
Resolution: 1600x900@60Hz
Resolution: 1920x1080@60Hz
Size: 527x296 mm
Year of Manufacture: 2019
Week of Manufacture: 44
Detailed Timings #0:
Resolution: 1920x1080
Horizontal: 1920 2008 2052 2200 (+88 +132 +280) +hsync
Vertical: 1080 1084 1089 1125 (+4 +9 +45) +vsync
Frequencies: 148.50 MHz, 67.50 kHz, 60.00 Hz
Driver Info #0:
Max. Resolution: 1920x1080
Vert. Sync Range: 56-76 Hz
Hor. Sync Range: 30-83 kHz
Bandwidth: 148 MHz
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #25 (VGA compatible controller)
37: None 02.0: 10002 LCD Monitor
[Created at monitor.125]
Unique ID: +rIN.8N48X7gRWVA
Parent ID: Ddhb.uZbpCsxmrO5
Hardware Class: monitor
Model: "DELL U2414H"
Vendor: DEL "DELL"
Device: eisa 0xa0b2 "DELL U2414H"
Serial ID: "X4J717CQ18UL"
Resolution: 720x400@70Hz
Resolution: 640x480@60Hz
Resolution: 640x480@75Hz
Resolution: 800x600@60Hz
Resolution: 800x600@75Hz
Resolution: 1024x768@60Hz
Resolution: 1024x768@75Hz
Resolution: 1280x1024@75Hz
Resolution: 1152x864@75Hz
Resolution: 1280x1024@60Hz
Resolution: 1600x900@60Hz
Resolution: 1600x1200@60Hz
Resolution: 1920x1080@60Hz
Size: 527x296 mm
Year of Manufacture: 2017
Week of Manufacture: 52
Detailed Timings #0:
Resolution: 1920x1080
Horizontal: 1920 2008 2052 2200 (+88 +132 +280) +hsync
Vertical: 1080 1084 1089 1125 (+4 +9 +45) +vsync
Frequencies: 148.50 MHz, 67.50 kHz, 60.00 Hz
Driver Info #0:
Max. Resolution: 1920x1080
Vert. Sync Range: 56-76 Hz
Hor. Sync Range: 30-83 kHz
Bandwidth: 148 MHz
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #25 (VGA compatible controller)
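As a quick sanity check on the "Detailed Timings" blocks above, the reported frequencies can be recomputed from the pixel clock and the total horizontal and vertical counts (the last number on each Horizontal/Vertical line). This is only a minimal sketch; the function name `refresh_rates` is made up for illustration and is not part of hwinfo.

```python
def refresh_rates(pixel_clock_mhz, h_total, v_total):
    """Return (horizontal frequency in kHz, vertical frequency in Hz).

    h_total and v_total are the total pixels per line and lines per frame,
    i.e. the fourth value on the Horizontal/Vertical lines of hwinfo.
    """
    h_freq_hz = pixel_clock_mhz * 1_000_000 / h_total  # lines per second
    v_freq_hz = h_freq_hz / v_total                    # frames per second
    return h_freq_hz / 1000, v_freq_hz


# Values copied from the hwinfo --monitor output in this report.
timings = [
    ("AUO LCD Monitor", 142.60, 2080, 1142),
    ("DELL U2419H / U2414H", 148.50, 2200, 1125),
]

for name, clock_mhz, h_total, v_total in timings:
    h_khz, v_hz = refresh_rates(clock_mhz, h_total, v_total)
    print(f"{name}: {h_khz:.2f} kHz, {v_hz:.2f} Hz")
```

Both results agree with the "Frequencies:" lines that hwinfo printed, so the three panels run at plain 1920x1080@60Hz timings.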
Thanks. This looks like the upstream issue
https://gitlab.freedesktop.org/drm/amd/-/issues/2314
I'm building another test kernel with some backports in OBS home:tiwai:bsc1213578. Please give it a try once the build finishes.

And I'm building two more test kernels in the OBS home:tiwai:bsc1213578-2 and home:tiwai:bsc1213578-3 repos. The first one contains another upstream fix; please test it anyway to check whether it introduces further regressions. The second one is a downstream fix for NULL dereferences, and it should at least work around the Oops. If the previous two kernels don't work, please check this one. If it is the only one that works, I'll add this workaround to the next update.

Thanks, Takashi! Waiting for the builds now...

Booted kernel-default-5.14.21-150500.1.1.g0e39bed.x86_64 from https://build.opensuse.org/repositories/home:tiwai:bsc1213578 - crashed when starting X11. No oops after reboot. Now to the next one..

I meant: No OOPS in /var/log/messages found

Created attachment 868394 [details]
dmesg from home:tiwai:bsc1213578-2
home:tiwai:bsc1213578-2 crashed when connecting external monitors, attaching dmesg output.
Created attachment 868395 [details]
dmesg from home:tiwai:bsc1213578-3
home:tiwai:bsc1213578-3 produces an OOPS as well, see dmesg attachment.
BUT: I report this now from the system with two external monitors attached, so it recovered. I booted up without external monitors and then connected them.
$ uname -a
Linux t495s 5.14.21-150500.1.g06f3d0e-default #1 SMP PREEMPT_DYNAMIC Mon Jul 24 08:36:58 UTC 2023 (06f3d0e) x86_64 x86_64 x86_64 GNU/Linux
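When several test kernels are in play, it helps to confirm from the release string which build is actually running. The sketch below splits a release string such as the one printed by uname above. It assumes that the `g`-prefixed component is the abbreviated source commit, which matches the hash uname prints in parentheses here; the regular expression and the name `parse_release` are illustrative only, not an official SUSE format definition.

```python
import re

# Assumed layout: <version>-<release>[.g<commit>]-<flavor>
RELEASE_RE = re.compile(
    r"^(?P<version>\d+\.\d+\.\d+)"
    r"-(?P<release>[0-9.]+?)"
    r"(?:\.g(?P<commit>[0-9a-f]+))?"
    r"-(?P<flavor>\w+)$"
)


def parse_release(release):
    """Split a kernel release string into its parts, or return None."""
    match = RELEASE_RE.match(release)
    return match.groupdict() if match else None


# Release strings taken from the uname output quoted in this report.
for release in (
    "5.14.21-150500.1.g06f3d0e-default",
    "5.14.21-150500.158.g6eb8d8a-default",
):
    print(parse_release(release))
```

The `commit` field can then be compared against the hash in parentheses at the end of the version part of `uname -a` to make sure the intended test kernel was booted.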
(In reply to Andreas Jaeger from comment #10)
> Created attachment 868395 [details]
> dmesg from home:tiwai:bsc1213578-3
>
> home:tiwai:bsc1213578-3 produces an OOPS as well, see dmesg attachment.

Those are not real crashes, just kernel WARNINGs from ASSERT() macros. To be fixed, of course.

> BUT: I report this now from the system with two external monitors attached,
> so it recovered. I booted up without external monitors and then connected
> them.
>
> $ uname -a
> Linux t495s 5.14.21-150500.1.g06f3d0e-default #1 SMP PREEMPT_DYNAMIC Mon Jul
> 24 08:36:58 UTC 2023 (06f3d0e) x86_64 x86_64 x86_64 GNU/Linux

So, how does the *-3 kernel behave apart from those kernel warnings? Does it still show other breakage?

The latest kernel initially had a network connection problem, and gnome-shell started without any extensions, which I was later able to enable. After that I worked fine for an hour until I rebooted. I don't know whether the network and gnome-shell problems were related to the kernel. Let me try that kernel again...

Rebooted, all fine. Will use it for the next 2 hours and report if any problems arise.

No OOPS/assert - booted this time with external monitors attached directly.
uname -a
Linux t495s 5.14.21-150500.1.g06f3d0e-default #1 SMP PREEMPT_DYNAMIC Mon Jul 24 08:36:58 UTC 2023 (06f3d0e) x86_64 x86_64 x86_64 GNU/Linux

Are there more bugs to be fixed with the latest SLE15-SP5 kernel? (Ideally, check with the kernel in the OBS Kernel:SLE15-SP5 repo.) If yes, could you elaborate on how to trigger them?

OK, downloaded the kernel from OBS Kernel:SLE15-SP5; uname -a reports:
Linux t495s 5.14.21-150500.158.g6eb8d8a-default #1 SMP PREEMPT_DYNAMIC Thu Aug 3 12:29:06 UTC 2023 (6eb8d8a) x86_64 x86_64 x86_64 GNU/Linux
Booted up fine, I'll run it now for some time and will then report back. Thanks, Takashi!

Still looking fine!

OK, then let's close now. Feel free to reopen if you hit the same bug (but maybe better to open another entry as it can be a different problem).
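For anyone triaging similar logs: the distinction made in this thread between a real Oops and a kernel WARNING (such as those raised by ASSERT() macros) can be checked mechanically on a saved dmesg. The following is only a rough sketch. The patterns cover the common "Oops:", "BUG: kernel NULL pointer dereference" and "WARNING: CPU: ... PID: ... at" header lines, and the sample lines are hypothetical, not taken from the attachments of this bug.

```python
import re

# Header lines that typically open a kernel report.
PATTERNS = [
    ("oops", re.compile(r"\bOops: |BUG: kernel NULL pointer dereference|BUG: unable to handle")),
    ("warning", re.compile(r"\bWARNING: CPU: \d+ PID: \d+ at ")),
]


def classify(line):
    """Return 'oops', 'warning', or None for a single log line."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count matching header lines (lines, not distinct crash events)."""
    counts = {"oops": 0, "warning": 0}
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "[   42.000000] WARNING: CPU: 2 PID: 1500 at drivers/gpu/example.c:123 example_fn+0x10/0x20 [amdgpu]",
    "[   42.000100] Modules linked in: amdgpu",
    "[   43.000000] BUG: kernel NULL pointer dereference, address: 0000000000000008",
    "[   43.000001] Oops: 0000 [#1] PREEMPT SMP NOPTI",
]

print(summarize(sample))
```

A log that only yields "warning" hits, as was reported for the *-3 kernel, points to a recoverable assertion rather than a crash. Note that a single Oops usually produces both a "BUG:" and an "Oops:" line, so the "oops" count can exceed the number of events.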
SUSE-SU-2023:3302-1: An update that solves 28 vulnerabilities, contains two features and has 115 fixes can now be installed.
Category: security (important)
Bug References: 1150305, 1187829, 1193629, 1194869, 1206418, 1207129, 1207894, 1207948, 1208788, 1210335, 1210565, 1210584, 1210627, 1210780, 1210825, 1210853, 1211014, 1211131, 1211243, 1211738, 1211811, 1211867, 1212051, 1212256, 1212265, 1212301, 1212445, 1212456, 1212502, 1212525, 1212603, 1212604, 1212685, 1212766, 1212835, 1212838, 1212842, 1212846, 1212848, 1212861, 1212869, 1212892, 1212901, 1212905, 1212961, 1213010, 1213011, 1213012, 1213013, 1213014, 1213015, 1213016, 1213017, 1213018, 1213019, 1213020, 1213021, 1213024, 1213025, 1213032, 1213034, 1213035, 1213036, 1213037, 1213038, 1213039, 1213040, 1213041, 1213059, 1213061, 1213087, 1213088, 1213089, 1213090, 1213092, 1213093, 1213094, 1213095, 1213096, 1213098, 1213099, 1213100, 1213102, 1213103, 1213104, 1213105, 1213106, 1213107, 1213108, 1213109, 1213110, 1213111, 1213112, 1213113, 1213114, 1213116, 1213134, 1213167, 1213205, 1213206, 1213226, 1213233, 1213245, 1213247, 1213252, 1213258, 1213259, 1213263, 1213264, 1213272, 1213286, 1213287, 1213304, 1213417, 1213493, 1213523, 1213524, 1213533, 1213543, 1213578, 1213585, 1213586, 1213588, 1213601, 1213620, 1213632, 1213653, 1213705, 1213713, 1213715, 1213747, 1213756, 1213759, 1213777, 1213810, 1213812, 1213856, 1213857, 1213863, 1213867, 1213870, 1213871, 1213872
CVE References: CVE-2022-40982, CVE-2023-0459, CVE-2023-1829, CVE-2023-20569, CVE-2023-20593, CVE-2023-21400, CVE-2023-2156, CVE-2023-2166, CVE-2023-2430, CVE-2023-2985, CVE-2023-3090, CVE-2023-31083, CVE-2023-3111, CVE-2023-3117, CVE-2023-31248, CVE-2023-3212, CVE-2023-3268, CVE-2023-3389, CVE-2023-3390, CVE-2023-35001, CVE-2023-3567, CVE-2023-3609, CVE-2023-3611, CVE-2023-3776, CVE-2023-3812, CVE-2023-38409, CVE-2023-3863, CVE-2023-4004
Jira References: PED-4718, PED-4758
Sources used:
openSUSE Leap 15.5 (src): kernel-livepatch-SLE15-SP5-RT_Update_3-1-150500.11.5.1, kernel-syms-rt-5.14.21-150500.13.11.1, kernel-source-rt-5.14.21-150500.13.11.1
SUSE Linux Enterprise Live Patching 15-SP5 (src): kernel-livepatch-SLE15-SP5-RT_Update_3-1-150500.11.5.1
SUSE Real Time Module 15-SP5 (src): kernel-syms-rt-5.14.21-150500.13.11.1, kernel-source-rt-5.14.21-150500.13.11.1
NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.

SUSE-SU-2023:3311-1: An update that solves 15 vulnerabilities and has 27 fixes can now be installed.
Category: security (important)
Bug References: 1206418, 1207129, 1207948, 1210627, 1210780, 1210825, 1211131, 1211738, 1211811, 1212445, 1212502, 1212604, 1212766, 1212901, 1213167, 1213272, 1213287, 1213304, 1213417, 1213578, 1213585, 1213586, 1213588, 1213601, 1213620, 1213632, 1213653, 1213713, 1213715, 1213747, 1213756, 1213759, 1213777, 1213810, 1213812, 1213856, 1213857, 1213863, 1213867, 1213870, 1213871, 1213872
CVE References: CVE-2022-40982, CVE-2023-0459, CVE-2023-20569, CVE-2023-21400, CVE-2023-2156, CVE-2023-2166, CVE-2023-31083, CVE-2023-3268, CVE-2023-3567, CVE-2023-3609, CVE-2023-3611, CVE-2023-3776, CVE-2023-38409, CVE-2023-3863, CVE-2023-4004
Sources used:
openSUSE Leap 15.5 (src): kernel-syms-5.14.21-150500.55.19.1, kernel-default-base-5.14.21-150500.55.19.1.150500.6.6.4, kernel-livepatch-SLE15-SP5_Update_3-1-150500.11.3.4, kernel-source-5.14.21-150500.55.19.1, kernel-obs-qa-5.14.21-150500.55.19.1, kernel-obs-build-5.14.21-150500.55.19.1
Basesystem Module 15-SP5 (src): kernel-default-base-5.14.21-150500.55.19.1.150500.6.6.4, kernel-source-5.14.21-150500.55.19.1
Development Tools Module 15-SP5 (src): kernel-obs-build-5.14.21-150500.55.19.1, kernel-syms-5.14.21-150500.55.19.1, kernel-source-5.14.21-150500.55.19.1
SUSE Linux Enterprise Live Patching 15-SP5 (src): kernel-livepatch-SLE15-SP5_Update_3-1-150500.11.3.4
NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.

SUSE-SU-2023:3376-1: An update that solves 15 vulnerabilities and has 27 fixes can now be installed.
Category: security (important)
Bug References: 1206418, 1207129, 1207948, 1210627, 1210780, 1210825, 1211131, 1211738, 1211811, 1212445, 1212502, 1212604, 1212766, 1212901, 1213167, 1213272, 1213287, 1213304, 1213417, 1213578, 1213585, 1213586, 1213588, 1213601, 1213620, 1213632, 1213653, 1213713, 1213715, 1213747, 1213756, 1213759, 1213777, 1213810, 1213812, 1213856, 1213857, 1213863, 1213867, 1213870, 1213871, 1213872
CVE References: CVE-2022-40982, CVE-2023-0459, CVE-2023-20569, CVE-2023-21400, CVE-2023-2156, CVE-2023-2166, CVE-2023-31083, CVE-2023-3268, CVE-2023-3567, CVE-2023-3609, CVE-2023-3611, CVE-2023-3776, CVE-2023-38409, CVE-2023-3863, CVE-2023-4004
Sources used:
openSUSE Leap 15.5 (src): kernel-syms-azure-5.14.21-150500.33.14.1, kernel-source-azure-5.14.21-150500.33.14.1
Public Cloud Module 15-SP5 (src): kernel-syms-azure-5.14.21-150500.33.14.1, kernel-source-azure-5.14.21-150500.33.14.1
NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.