Bug 1215328 - Upgrade to kernel 6.5.2 => can't get to SDDM. Boot stops at prompt
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: openSUSE Kernel Bugs
QA Contact: E-mail List
Reported: 2023-09-14 07:12 UTC by Mikael Widersten
Modified: 2023-10-28 13:23 UTC

jlee: needinfo? (bingmybong)


Attachments
dmesg for good boot of kernel 6.4 (71.85 KB, text/plain)
2023-09-19 09:49 UTC, Ian Powell
Hardware info from YaST (700.85 KB, application/x-troff-man)
2023-09-19 15:43 UTC, Ian Powell
boot log via journalctl -k (88.50 KB, text/plain)
2023-09-22 08:07 UTC, Ian Powell
dmesg log (72.23 KB, text/plain)
2023-09-22 08:11 UTC, Ian Powell

Description Mikael Widersten 2023-09-14 07:12:53 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0
Build Identifier: 

After upgrade to kernel 6.5.2, the boot ends at a non-responsive prompt. Hard reboot is required. Boot with 6.4.12 is fine.

HP Z2 G4 Workstation
Intel i7-8700
NVIDIA Quadro P2000
X11




Reproducible: Always

Steps to Reproduce:
1. Upgrade from kernel 6.4.12 to 6.5.2 using zypper dup.
2. Reboot.
3. Boot ends at a non-responsive prompt (all seems OK up to this point).


Expected Results:  
Boot should end at SDDM.

A Lenovo X1 laptop running Intel HD graphics (Wayland), updated in parallel, did not show the same problem; all is good on that machine.
Comment 1 Ian Powell 2023-09-16 13:45:24 UTC
After upgrade to kernel 6.5.2, the machine hangs after displaying my usual ACPI errors. Boot with 6.4.12 is fine.

I was unable to find any data in the logs using "journalctl -k" for this boot. Does anyone know where else I can look so I can post some helpful data?






opensuse:tumbleweed:20230914
Qt: 5.15.10 KDE Frameworks: 5.110.0 - KDE Plasma:  5.27.8 - kwin 5.27.8
kmail2 5.24.0 (23.08.0) - akonadiserver 5.24.0 (23.08.0) - Kernel:  6.4.12-1-default  - kernel-firmware-radeon  20230829
Comment 2 Stuart Rogers 2023-09-17 10:04:16 UTC
I have had a similar issue with kernels 6.5.2 and 6.5.3 in that my system simply gave up and rebooted. Adding acpi=noirq to the linux command in grub solved it and allowed my system to boot. My system is an AMD Ryzen 5 3400G with Radeon Vega Graphics on an MSI B450-A PRO MAX motherboard using the latest BIOS from July this year.
Comment 3 Takashi Iwai 2023-09-17 15:59:14 UTC
It's pretty difficult to analyze without logs, unfortunately.
And, in general, the problem can be totally different although the symptom looks similar; the only common issue is that "system doesn't boot".

First, try booting with the nomodeset boot option (and without the Nvidia binary driver, if installed). This excludes the native graphics driver; if the system then boots again, the problem is in the graphics stack.

Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos (6.5.x and 6.6-rc).
Comment 4 Ian Powell 2023-09-19 06:45:07 UTC
(In reply to Stuart Rogers from comment #2)
> I have had a similar issue with kernels 6.5.2 and 6.5.3 in that my system
> simply gave up and rebooted. Adding acpi=noirq to the linux command in grub
> solved it and allowed my system to boot. My system is an AMD Ryzen 5 3400G
> with Radeon Vega Graphics on an MSI B450-A PRO MAX motherboard using the
> latest BIOS from July this year.

Thanks, gave it a try but no luck for me
Comment 5 Ian Powell 2023-09-19 06:46:05 UTC
(In reply to Takashi Iwai from comment #3)
> It's pretty difficult to analyze without logs, unfortunately.
> And, in general, the problem can be totally different although the symptom
> looks similar; the only common issue is that "system doesn't boot".
> 
> At first, try to boot with nomodeset boot option.  (Also without Nvidia
> binary driver, if any, too.)  This will exclude the native graphics driver,
> and if this makes things booting again, it means that the problem is the
> graphics.
> 
> Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> (6.5.x and 6.6-rc).

Thanks.  Unfortunately no issue with the graphics
Comment 6 Takashi Iwai 2023-09-19 06:51:20 UTC
(In reply to Ian Powell from comment #5)
> > Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> > (6.5.x and 6.6-rc).
> 
> Thanks.  Unfortunately no issue with the graphics

OK, and it still happens with 6.6-rc kernel, too?  If yes, this has to be reported to the upstream.
Comment 7 Takashi Iwai 2023-09-19 06:58:55 UTC
And, for each reporter: please provide the dmesg output and the hwinfo output from the working case (6.4.x?). Please use attachments rather than pasting into the comment form.
If you can also capture the dmesg output from the failed boot, that will be very helpful.
Comment 8 Ian Powell 2023-09-19 09:44:04 UTC
(In reply to Takashi Iwai from comment #6)
> (In reply to Ian Powell from comment #5)
> > > Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> > > (6.5.x and 6.6-rc).
> > 
> > Thanks.  Unfortunately no issue with the graphics
> 
> OK, and it still happens with 6.6-rc kernel, too?  If yes, this has to be
> reported to the upstream.

Getting 6.6 is beyond my skill; I'm just a user with very little knowledge.
Comment 9 Ian Powell 2023-09-19 09:47:37 UTC
(In reply to Takashi Iwai from comment #7)
> And, for each reporter: please give the dmesg output and the hwinfo from the
> working case (6.4.x?).  Use attachments, not paste on the form.
> If you can catch the dmesg output at the failed boot, too, that'll be very
> helpful.

I've attached the dmesg from the working boot, but I can find nothing in any log file (journalctl -k, /var/log/boot.msg, /var/log/boot/omsg) relating to the failed 6.5 boots.
Comment 10 Ian Powell 2023-09-19 09:49:51 UTC
Created attachment 869592 [details]
dmesg for good boot of kernel 6.4

hope "dmesg" output for the successful boot helps
Comment 11 Takashi Iwai 2023-09-19 10:46:43 UTC
(In reply to Ian Powell from comment #8)
> (In reply to Takashi Iwai from comment #6)
> > (In reply to Ian Powell from comment #5)
> > > > Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> > > > (6.5.x and 6.6-rc).
> > > 
> > > Thanks.  Unfortunately no issue with the graphics
> > 
> > OK, and it still happens with 6.6-rc kernel, too?  If yes, this has to be
> > reported to the upstream.
> 
> Getting 6.6 is beyond my skill, i'm just a user with very little knowledge

You just need to install the kernel from OBS Kernel:HEAD repo
  http://download.opensuse.org/repositories/Kernel:/HEAD/standard/
Download kernel-default.rpm and install it via zypper install.
After testing, you can uninstall it again.

Note that it's an unofficial build, hence you'd need to turn off Secure Boot.
Comment 12 Takashi Iwai 2023-09-19 10:48:40 UTC
(In reply to Ian Powell from comment #10)
> Created attachment 869592 [details]
> dmesg for good boot of kernel 6.4
> 
> hope "dmesg" output for the successful boot helps

Thanks.  Please also attach the output from hwinfo.

This looks like an AMD platform, and I'm not sure whether it's the same problem as the original report.
Comment 13 Ian Powell 2023-09-19 15:43:45 UTC
Created attachment 869600 [details]
Hardware info from YaST
Comment 14 Ian Powell 2023-09-19 16:01:34 UTC
(In reply to Takashi Iwai from comment #11)
> (In reply to Ian Powell from comment #8)
> > (In reply to Takashi Iwai from comment #6)
> > > (In reply to Ian Powell from comment #5)
> > > > > Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> > > > > (6.5.x and 6.6-rc).
> > > > 
> > > > Thanks.  Unfortunately no issue with the graphics
> > > 
> > > OK, and it still happens with 6.6-rc kernel, too?  If yes, this has to be
> > > reported to the upstream.
> > 
> > Getting 6.6 is beyond my skill, i'm just a user with very little knowledge
> 
> You just need to install the kernel from OBS Kernel:HEAD repo
>   http://download.opensuse.org/repositories/Kernel:/HEAD/standard/
> Download kernel-default.rpm and install it via zypper install.
> Once after testing, you can uninstall it again.
> 
> Note that it's an unofficial build, hence you'd need to turn off Secure Boot.

Same problem, machine hangs at the same spot
Comment 15 Takashi Iwai 2023-09-21 07:40:02 UTC
(In reply to Ian Powell from comment #14)
> (In reply to Takashi Iwai from comment #11)
> > (In reply to Ian Powell from comment #8)
> > > (In reply to Takashi Iwai from comment #6)
> > > > (In reply to Ian Powell from comment #5)
> > > > > > Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> > > > > > (6.5.x and 6.6-rc).
> > > > > 
> > > > > Thanks.  Unfortunately no issue with the graphics
> > > > 
> > > > OK, and it still happens with 6.6-rc kernel, too?  If yes, this has to be
> > > > reported to the upstream.
> > > 
> > > Getting 6.6 is beyond my skill, i'm just a user with very little knowledge
> > 
> > You just need to install the kernel from OBS Kernel:HEAD repo
> >   http://download.opensuse.org/repositories/Kernel:/HEAD/standard/
> > Download kernel-default.rpm and install it via zypper install.
> > Once after testing, you can uninstall it again.
> > 
> > Note that it's an unofficial build, hence you'd need to turn off Secure Boot.
> 
> Same problem, machine hangs at the same spot

Which spot?

Please boot the 6.6-rc kernel after removing the "quiet" and "splash=silent" boot options, so that the boot messages show more clearly where it stops.  Does it hang before starting the desktop screen?
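
What removing those two options amounts to can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample command line and the helper name `strip_options` are hypothetical, and on a real system the change is made by hand on the "linux" line in the GRUB menu editor rather than by a script.

```python
def strip_options(cmdline, unwanted=("quiet", "splash=silent")):
    """Return the kernel command line without the unwanted options."""
    kept = [token for token in cmdline.split() if token not in unwanted]
    return " ".join(kept)


# Hypothetical command line; the real one differs per machine.
example = "root=/dev/sda2 splash=silent quiet mitigations=auto"
print(strip_options(example))
```

With both options gone, the kernel and systemd print their messages to the console instead of hiding them behind the splash screen, which is what makes a stall visible.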
Comment 16 Ian Powell 2023-09-22 08:07:19 UTC
Created attachment 869684 [details]
boot log via journalctl -k
Comment 17 Jiri Slaby 2023-09-22 08:09:46 UTC
sddm-greeter[1264]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set

Is this it?
Comment 18 Ian Powell 2023-09-22 08:11:20 UTC
Created attachment 869685 [details]
dmesg log
Comment 19 Jiri Slaby 2023-09-22 08:13:40 UTC
(In reply to Jiri Slaby from comment #17)
> sddm-greeter[1264]: memfd_create() called without MFD_EXEC or
> MFD_NOEXEC_SEAL set
> 
> Is this it?

Nope, it's just a warning, removed later in 9da413503489.

I don't see any problems. Could you attach journalctl -b log (not only the kernel -k one)?
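
For anyone unsure about the difference between the two requests, here is a minimal Python sketch that builds both journalctl invocations. The helper names `journal_cmd` and `save_boot_log` and the output path are hypothetical; the flags used (`-b`, `-k`, `--no-pager`) are standard journalctl options. Logs from an earlier boot (a negative boot offset) are only available if the journal is stored persistently, which is an assumption about the reporter's setup.

```python
import shutil
import subprocess


def journal_cmd(boot=0, kernel_only=False):
    """Build a journalctl command for one boot (0 = current, -1 = previous)."""
    cmd = ["journalctl", "--no-pager", "-b", str(boot)]
    if kernel_only:
        cmd.append("-k")  # kernel messages only, like dmesg
    return cmd


def save_boot_log(path, boot=0, kernel_only=False):
    """Write the selected log to path; return False if journalctl is missing."""
    if shutil.which("journalctl") is None:
        return False
    with open(path, "w") as out:
        subprocess.run(journal_cmd(boot, kernel_only), stdout=out, check=False)
    return True


# Full log of the current boot (what is asked for here) vs. kernel-only log.
print(" ".join(journal_cmd()))
print(" ".join(journal_cmd(kernel_only=True)))
```

Calling `save_boot_log("journal-full.txt")` would produce a file suitable for attaching to the bug.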
Comment 20 Ian Powell 2023-09-22 08:14:51 UTC
(In reply to Takashi Iwai from comment #15)
> (In reply to Ian Powell from comment #14)
> > (In reply to Takashi Iwai from comment #11)
> > > (In reply to Ian Powell from comment #8)
> > > > (In reply to Takashi Iwai from comment #6)
> > > > > (In reply to Ian Powell from comment #5)
> > > > > > > Also, check the latest kernels in OBS Kernel:stable and Kernel:HEAD repos
> > > > > > > (6.5.x and 6.6-rc).
> > > > > > 
> > > > > > Thanks.  Unfortunately no issue with the graphics
> > > > > 
> > > > > OK, and it still happens with 6.6-rc kernel, too?  If yes, this has to be
> > > > > reported to the upstream.
> > > > 
> > > > Getting 6.6 is beyond my skill, i'm just a user with very little knowledge
> > > 
> > > You just need to install the kernel from OBS Kernel:HEAD repo
> > >   http://download.opensuse.org/repositories/Kernel:/HEAD/standard/
> > > Download kernel-default.rpm and install it via zypper install.
> > > Once after testing, you can uninstall it again.
> > > 
> > > Note that it's an unofficial build, hence you'd need to turn off Secure Boot.
> > 
> > Same problem, machine hangs at the same spot
> 
> Which spot?
> 
> Please boot 6.6-rc kernel after removing "quiet" and "splash=silent" boot
> options.  This can show more clearly.  Does it hang before starting the
> desktop screen?

I removed those options and it finally booted to my desktop; I'm writing this from that session.
I watched the boot, and it spent a really long time on this part:
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[   32.685196] ata3.00: configured for UDMA/100
<6>[   63.400878] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[   63.404812] ata3.00: configured for UDMA/100
<6>[   94.120872] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[   94.124848] ata3.00: configured for UDMA/100
<4>[  124.374199] ata3.00: limiting speed to UDMA/66:PIO4
<6>[  124.844199] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[  124.848292] ata3.00: configured for UDMA/66
<6>[  157.270854] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
<6>[  157.275119] ata3.00: configured for UDMA/66
<4>[  187.517509] ata3.00: limiting speed to UDMA/33:PIO4

I was going to try it with kernel 5, but both versions have been purged from the grub menu.
Comment 21 Jiri Slaby 2023-09-22 08:20:43 UTC
(In reply to Ian Powell from comment #20)
> ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> <6>[   32.685196] ata3.00: configured for UDMA/100
> <6>[   63.400878] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> <6>[   63.404812] ata3.00: configured for UDMA/100
> <6>[   94.120872] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> <6>[   94.124848] ata3.00: configured for UDMA/100
> <4>[  124.374199] ata3.00: limiting speed to UDMA/66:PIO4
> <6>[  124.844199] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> <6>[  124.848292] ata3.00: configured for UDMA/66
> <6>[  157.270854] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> <6>[  157.275119] ata3.00: configured for UDMA/66
> <4>[  187.517509] ata3.00: limiting speed to UDMA/33:PIO4

Ha! So apparently you hit some 30 s timeout in the ATA layer.

Joey/Lee, any ideas what could break in 6.5 WRT this?
Comment 22 Jiri Slaby 2023-09-22 08:31:59 UTC
(In reply to Jiri Slaby from comment #21)
> Ha! So apparently you hit some 30 s timeout in the ATA layer.
> 
> Joey/Lee, any ideas what could break in 6.5 WRT this?

Apparently, 6.4 configured ata3 for UDMA/100 and was done with it.

6.5+ tries over and over again, every 30 s, until it downgrades to PIO4:

> [    [-0.614748]-]    {+0.646671]+} ata3: SATA max UDMA/133 abar m1024@0xfe02f000 port 0xfe02f200 irq 22
> [    [-1.114539]-]    {+1.156720]+} ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    [-1.116021]-]    {+1.160278]+} ata3.00: ATAPI: TSSTcorp CDDVDW SH-S223C, SB02, max UDMA/100
> [    [-1.117899]-]    {+1.167987]+} ata3.00: configured for UDMA/100
> {+[   32.680886] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[   32.685196] ata3.00: configured for UDMA/100+}
> {+[   63.400878] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[   63.404812] ata3.00: configured for UDMA/100+}
> {+[   94.120872] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[   94.124848] ata3.00: configured for UDMA/100+}
> {+[  124.374199] ata3.00: limiting speed to UDMA/66:PIO4+}
> {+[  124.844199] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[  124.848292] ata3.00: configured for UDMA/66+}
> {+[  157.270854] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[  157.275119] ata3.00: configured for UDMA/66+}
> {+[  187.517509] ata3.00: limiting speed to UDMA/33:PIO4+}
> {+[  187.987519] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[  187.991643] ata3.00: configured for UDMA/33+}
> {+[  219.944164] ata3.00: limiting speed to PIO4+}
> {+[  220.414187] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)+}
> {+[  220.418605] ata3.00: configured for PIO4+}
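
The roughly 30 s rhythm in the log above can be checked mechanically. The following is a minimal Python sketch, assuming dmesg-style lines with a `[seconds]` timestamp; the function name `reset_gaps` and the regular expression are illustrative and not part of any tool mentioned in this report. The sample lines are copied from the log in this comment.

```python
import re

# Matches e.g. "[   32.685196] ata3.00: configured for UDMA/100".
# A leading syslog level such as "<6>" is ignored because search() is used.
CONFIGURED = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>ata\d+\.\d+): configured for (?P<mode>\S+)"
)


def reset_gaps(lines, dev="ata3.00"):
    """Return (gap_seconds, mode) for each repeated 'configured for' event."""
    events = []
    for line in lines:
        match = CONFIGURED.search(line)
        if match and match.group("dev") == dev:
            events.append((float(match.group("ts")), match.group("mode")))
    return [
        (round(cur_ts - prev_ts, 1), mode)
        for (prev_ts, _), (cur_ts, mode) in zip(events, events[1:])
    ]


sample = [
    "[    1.167987] ata3.00: configured for UDMA/100",
    "[   32.685196] ata3.00: configured for UDMA/100",
    "[   63.404812] ata3.00: configured for UDMA/100",
    "[   94.124848] ata3.00: configured for UDMA/100",
    "[  124.848292] ata3.00: configured for UDMA/66",
]
for gap, mode in reset_gaps(sample):
    print(f"{gap:6.1f} s -> {mode}")
```

A healthy boot configures the device once, so the function returns an empty list; a list of gaps near 30 s points at the repeated timeout described above.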
Comment 23 Ian Powell 2023-09-22 09:30:17 UTC
(In reply to Jiri Slaby from comment #21)
> (In reply to Ian Powell from comment #20)
> > ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > <6>[   32.685196] ata3.00: configured for UDMA/100
> > <6>[   63.400878] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > <6>[   63.404812] ata3.00: configured for UDMA/100
> > <6>[   94.120872] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > <6>[   94.124848] ata3.00: configured for UDMA/100
> > <4>[  124.374199] ata3.00: limiting speed to UDMA/66:PIO4
> > <6>[  124.844199] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > <6>[  124.848292] ata3.00: configured for UDMA/66
> > <6>[  157.270854] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > <6>[  157.275119] ata3.00: configured for UDMA/66
> > <4>[  187.517509] ata3.00: limiting speed to UDMA/33:PIO4
> 
> Ha! So apparently you hit some 30 s timeout in the ATA layer.
> 
> Joey/Lee, any ideas what could break in 6.5 WRT this?

So it looks like I decided "it's hung" too soon; I should have waited a few more minutes.  I used my trusted "Num Lock" test to see whether the system had hung.
Hopefully it's easily fixed.
Comment 24 Joey Lee 2023-09-22 10:05:34 UTC
Between v6.4 and v6.5 there are 18 patches whose names relate to libata. This one relates to speed:

From 12980c1f2f8a926dd634e27c700014b3246a99ec Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal@kernel.org>
Date: Mon, 5 Jun 2023 08:16:32 +0900
Subject: [PATCH 04817/13561] ata: libata-eh: Use ata_ncq_enabled() in
 ata_eh_speed_down()

In ata_eh_speed_down(), instead of hard-coding the test on the device
flags to detect if NCQ is supported and enabled, use ata_ncq_enabled().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>

I'm not sure this patch is the root cause. Bisecting may be needed.
Comment 25 Joey Lee 2023-09-22 10:26:45 UTC
(In reply to Joey Lee from comment #24)
> Between v6.4..v6.5, it has 18 patches' name relates to libata. This one
> relates to speed:
> 
> From 12980c1f2f8a926dd634e27c700014b3246a99ec Mon Sep 17 00:00:00 2001
> From: Damien Le Moal <dlemoal@kernel.org>
> Date: Mon, 5 Jun 2023 08:16:32 +0900
> Subject: [PATCH 04817/13561] ata: libata-eh: Use ata_ncq_enabled() in
>  ata_eh_speed_down()
> 
> In ata_eh_speed_down(), instead of hard-coding the test on the device
> flags to detect if NCQ is supported and enabled, use ata_ncq_enabled().
> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> 
> Not sure that the patch is root cause. Maybe need bisecting.

Hm... openSUSE Tumbleweed has CONFIG_SATA_HOST=y enabled, so

ata_ncq_enabled(dev) is equivalent to ((dev->flags & (ATA_DFLAG_PIO | ATA_DFLAG_NCQ | ATA_DFLAG_NCQ_OFF)) == ATA_DFLAG_NCQ)

It looks like this patch should not change the behavior.

No idea. I think we still need to bisect v6.4..v6.5.
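
The equivalence argument can be illustrated with a small Python model. It rests on assumptions: the helper is taken to return false when CONFIG_SATA_HOST is disabled and otherwise to compare the masked flags against ATA_DFLAG_NCQ, the open-coded test is taken to be that same comparison, and the bit positions are placeholders rather than the kernel's values.

```python
from itertools import product

# Placeholder bit positions (assumption); the kernel's real values are
# defined in include/linux/libata.h and differ from these.
ATA_DFLAG_NCQ = 1 << 0
ATA_DFLAG_PIO = 1 << 1
ATA_DFLAG_NCQ_OFF = 1 << 2
NCQ_MASK = ATA_DFLAG_PIO | ATA_DFLAG_NCQ | ATA_DFLAG_NCQ_OFF


def open_coded_test(flags):
    """Assumed form of the hard-coded flag test that the patch replaced."""
    return (flags & NCQ_MASK) == ATA_DFLAG_NCQ


def ata_ncq_enabled(flags, sata_host=True):
    """Model of the helper: always false without CONFIG_SATA_HOST."""
    if not sata_host:
        return False
    return (flags & NCQ_MASK) == ATA_DFLAG_NCQ


def all_flag_sets():
    """Yield every combination of the three flags."""
    for bits in product((0, 1), repeat=3):
        yield (bits[0] * ATA_DFLAG_NCQ
               | bits[1] * ATA_DFLAG_PIO
               | bits[2] * ATA_DFLAG_NCQ_OFF)


same = all(open_coded_test(f) == ata_ncq_enabled(f) for f in all_flag_sets())
print("identical with CONFIG_SATA_HOST=y:", same)
```

In this model the two tests could only disagree in a build without CONFIG_SATA_HOST, which is not the Tumbleweed configuration.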
Comment 26 Ian Powell 2023-10-02 07:28:13 UTC
Has the problem been found yet?
Comment 27 Ian Powell 2023-10-13 10:02:04 UTC
I've just tried this with kernel 6.6 and the problem still exists
Comment 28 Joey Lee 2023-10-26 15:47:18 UTC
(In reply to Ian Powell from comment #27)
> I've just tried this with kernel 6.6 and the problem still exists

Could you help bisect the upstream kernel to find the offending patch between v6.4 and v6.5?
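
For readers unfamiliar with bisecting, the idea is a binary search over the commit range: build and boot a kernel from the middle, mark it good or bad, and repeat on the remaining half. The Python sketch below models only that search, assuming a linear history; real kernel history has merges, and `git bisect` handles that. The commit count of 13561 is taken from the patch numbering in comment 24 as a rough figure, and the culprit position 4817 is purely hypothetical.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (first_bad_commit, builds_tested).
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1  # one kernel build plus one test boot
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], steps


# Index 0 stands for v6.4, index 13561 for v6.5 (assumed commit count).
commits = list(range(13562))
HYPOTHETICAL_CULPRIT = 4817  # illustration only, not a finding
first_bad, steps = bisect_first_bad(commits, lambda c: c >= HYPOTHETICAL_CULPRIT)
print(f"first bad commit index: {first_bad}, kernels tested: {steps}")
```

For a range of this size the search needs at most 14 test kernels, each checked by watching whether ata3 is reconfigured repeatedly during boot.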
Comment 29 Ian Powell 2023-10-28 13:23:12 UTC
(In reply to Joey Lee from comment #28)
> (In reply to Ian Powell from comment #27)
> > I've just tried this with kernel 6.6 and the problem still exists
> 
> Is it possible that you can help to bisect upstream kernel for finding the
> issue patch between v6.4..v6.5 ?

I wouldn't know where to start, as I'm not a developer or sysadmin, just a user. The problem appears to have gone away with kernel 6.5.8-1, which I've just tried.