Bug 1227153

Summary: Leap 15.6 Linux 6.4.0.150600.23.7 boot hanger; former Linux ~.21 works
Product: [openSUSE] openSUSE Distribution
Reporter: Peter Thoms <p.thoms>
Component: Kernel
Assignee: openSUSE Kernel Bugs <kernel-bugs>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: meissner, mrmazda, msuchanek, p.thoms, tiwai
Version: Leap 15.6
Flags: tiwai: needinfo? (p.thoms)
Target Milestone: ---
Hardware: x86-64
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: sudo inxi -Fxz
/var/log/warn
Boot hanger between "Rescue Mode" and "end of listing"
Screenshot
Screenshot
"3" instead of nomodeset: screenshot
log of first start today
first start today
dsmg kdump, only man start posible

Description Peter Thoms 2024-06-28 10:02:00 UTC
Hello,

After updating to Linux 6.4.0.150600.23.7, my PC hangs during boot.
The cursor stays at position 1.
Comment 1 Felix Miata 2024-06-29 04:02:07 UTC
When this boot completes, does Alt-F3 bring you to another login prompt? If yes and you login as root, what results from 'systemctl restart xdm'?

Note Bugzilla is not intended to be a support forum. A report like this should be directed to https://forums.opensuse.org/c/english/install-boot-login/18 or the mailing list archived at https://lists.opensuse.org/archives/list/support@lists.opensuse.org/. Also details are needed, such as PC specifications, your GPU in particular, and details of the update process you employed.
Comment 2 Peter Thoms 2024-06-29 04:29:51 UTC
Hello Felix.
Alt+F3 does nothing.
Comment 3 Peter Thoms 2024-06-29 13:47:10 UTC
Created attachment 875779 [details]
sudo inxi -Fxz
Comment 4 Peter Thoms 2024-06-30 06:47:21 UTC
Hello,

Recovery mode works.
The boot hangs shortly after I enter "exit" in recovery mode.
Comment 5 Peter Thoms 2024-06-30 09:09:30 UTC
Created attachment 875783 [details]
/var/log/warn
Comment 6 Peter Thoms 2024-06-30 10:46:27 UTC
Created attachment 875784 [details]
Boot hanger between "Rescue Mode" and "end of listing"
Comment 7 Peter Thoms 2024-06-30 12:02:51 UTC
Created attachment 875785 [details]
Screenshot

Exiting rescue mode: the boot runs into the hang.
Comment 8 Peter Thoms 2024-06-30 16:56:34 UTC
Created attachment 875786 [details]
Screenshot

Exiting rescue mode: the boot runs into the hang.
Comment 9 Takashi Iwai 2024-07-08 11:31:44 UTC
When you boot with the "nomodeset" boot option, does the new kernel boot?
Comment 10 Peter Thoms 2024-07-08 11:51:07 UTC
Booting with nomodeset:
I get a movable mouse pointer on a black screen, and the boot hangs as well.
Comment 11 Takashi Iwai 2024-07-08 12:13:46 UTC
OK, then please try to boot without the GUI, i.e. in runlevel 3 (append "3" to the boot options).  Can you log in there?
Comment 12 Peter Thoms 2024-07-08 12:34:56 UTC
Via the recovery mode of Linux ~23.7?
Please give me the exact command line.
Comment 13 Peter Thoms 2024-07-08 14:11:23 UTC
Ahh:
I added "pass=3", instead of "nomodeset"

The login screen started.
After I entered my password, the password placeholder characters turned light grey and the screen froze.
Comment 14 Takashi Iwai 2024-07-08 14:31:53 UTC
After it hangs with the "3" boot option, can you try to reboot with the following key combos?
  Alt-Sysrq-S
  Alt-Sysrq-B
The Sysrq key is often mapped to the "Print" (or "Druck") key.
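
Whether these key combos do anything depends on the kernel.sysrq setting, which can be read from /proc/sys/kernel/sysrq. A minimal sketch, assuming the bit values from the kernel's sysrq documentation (16 for sync, 128 for reboot/poweroff); the function name is hypothetical and only illustrates how the mask is interpreted:

```python
# Bit values as listed in the kernel's sysrq documentation.
SYSRQ_BITS = {
    "s": 16,   # sync filesystems
    "b": 128,  # reboot/poweroff
}

def sysrq_key_allowed(mask, key):
    """Return True if the given sysrq mask permits the key ('s' or 'b')."""
    if mask == 0:       # 0 disables sysrq completely
        return False
    if mask == 1:       # 1 enables all sysrq functions
        return True
    return bool(mask & SYSRQ_BITS[key])

# Example with an illustrative mask value (not taken from this report).
for key in ("s", "b"):
    print(key, sysrq_key_allowed(184, key))
```

On a running system the mask could be read with `int(open("/proc/sys/kernel/sysrq").read())` and passed to the function.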

And after rebooting from that, please check the kernel messages of the previous boot, usually recorded in /var/log/messages.  There can be some leftover stack traces, and that would be more interesting.
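
A minimal sketch of how such a log could be searched for leftover crash output. The marker list and the function name are assumptions chosen for illustration, not a complete list of what the kernel can print:

```python
import re

# Common markers of kernel crash output (an illustrative, incomplete list).
CRASH_MARKERS = re.compile(
    r"BUG:|Oops|Call Trace:|Kernel panic|general protection fault"
)

def find_crash_lines(log_text):
    """Return every line of the log text that contains a crash marker."""
    return [line for line in log_text.splitlines() if CRASH_MARKERS.search(line)]

def scan_log_file(path="/var/log/messages"):
    """Read a log file (usually requires root) and return the suspicious lines."""
    with open(path, errors="replace") as f:
        return find_crash_lines(f.read())
```

Calling `scan_log_file()` as root and printing the result gives a quick overview; the lines around each hit in the file then show the full stack trace, if one was recorded.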
Comment 15 Michal Suchanek 2024-07-08 14:34:00 UTC
(In reply to Peter Thoms from comment #13)
> Ahh:
> I added "pass=3", instead of "nomodeset"

It should not be "pass=3", only "3".
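
To double-check which options actually reached the kernel, the command line of the running system can be inspected (it is available in /proc/cmdline). A minimal sketch; the function name is hypothetical, and it relies on systemd treating a bare digit from 1 to 5 as a runlevel request:

```python
def describe_boot_options(cmdline):
    """Summarize the boot options discussed in this thread."""
    tokens = cmdline.split()
    # A bare digit 1-5 is interpreted by systemd as a legacy runlevel.
    runlevels = [t for t in tokens if t in {"1", "2", "3", "4", "5"}]
    return {
        "runlevel": runlevels[-1] if runlevels else None,
        "nomodeset": "nomodeset" in tokens,
        # "pass=3" is not a runlevel request; list it so the typo is visible.
        "pass_typos": [t for t in tokens if t.startswith("pass=")],
    }

# Example with a made-up command line (paths are placeholders).
print(describe_boot_options("root=/dev/sda2 splash=silent 3"))
```

With `open("/proc/cmdline").read()` as input, the result shows whether "3" or "nomodeset" was really in effect for the current boot.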
Comment 16 Peter Thoms 2024-07-08 14:46:43 UTC
Created attachment 875943 [details]
"3" instead of nomodeset: screenshot

"3" instead of nomodeset: screenshot of behavior
Comment 17 Takashi Iwai 2024-07-08 15:27:02 UTC
(In reply to Peter Thoms from comment #16)
> Created attachment 875943 [details]
> "3" instead of nomodeset: screenshot
> 
> "3" instead of nomodeset: screenshot of behavior

The screenshot shows part of the crash messages, and now we need the complete crash messages.  Could you try to get them from the previous kernel boot log after the reset?
Comment 18 Peter Thoms 2024-07-08 16:04:21 UTC
I am very sorry.

I looked through all the messages under the folder /var/logs/.. but cannot find anything that we see in the last screenshot.

Please give me further advice, and I will try again by tomorrow.
Comment 19 Takashi Iwai 2024-07-08 16:30:38 UTC
Doesn't your /var/log/messages file contain any trace of kernel crashes from the previous sessions at all?
Comment 20 Peter Thoms 2024-07-08 17:29:43 UTC
No, maybe only the part before the crash. I need more detailed timestamps to tell the boots apart.
Tomorrow I will try again and pay closer attention to the timestamps.
Comment 21 Peter Thoms 2024-07-09 07:49:02 UTC
Created attachment 875956 [details]
log of first start today

Log file of the first start today, with option "3"
Comment 22 Peter Thoms 2024-07-09 07:49:45 UTC
Created attachment 875957 [details]
first start today

Log file of the first start today, with option "3"
Comment 23 Peter Thoms 2024-07-09 07:52:09 UTC
I tried to get the messages of the first start via a live CD (Knoppix), but /var was shown as empty.
Comment 24 Takashi Iwai 2024-07-09 08:13:22 UTC
If /var/log/messages doesn't contain the kernel crash messages, you'd need to set up kdump to catch the crash instead.
Try to install via yast2-kdump.  It might be better to provide more RAM size than suggested there, to be sure.

After setting up kdump, you can verify whether the crash dump works via the alt-sysrq-C key combo.  Boot with the previous kernel (that should still work), test the key combo and check whether you got the crash dump at /var/crash/*.
If this looks working, boot with the new kernel.  The crash dump should be triggered automatically when the kernel crashes.

If the crash dump worked for the new kernel (at crashing), please upload the dmesg output found in the corresponding /var/crash/* directory.
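
A minimal sketch for locating the most recent dmesg file below /var/crash. It assumes that each dump is stored in its own subdirectory and that the file name starts with "dmesg"; both the layout and the function name are assumptions and may differ on a given system:

```python
from pathlib import Path

def latest_crash_dmesg(crash_root="/var/crash"):
    """Return the newest dmesg file one level below crash_root, or None."""
    root = Path(crash_root)
    if not root.is_dir():
        return None
    # Assumed layout: <crash_root>/<one directory per dump>/dmesg*
    candidates = [p for p in root.glob("*/dmesg*") if p.is_file()]
    if not candidates:
        return None
    # Pick the file with the most recent modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)

print(latest_crash_dmesg())
```

The printed path (or None, if no dump exists yet) points to the file to attach here.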
Comment 25 Peter Thoms 2024-07-09 09:04:23 UTC
Created attachment 875958 [details]
dsmg kdump, only man start posible

kdump of the crash, manual start.
Linux was started via GRUB, without editing the GRUB menu.
Comment 26 Takashi Iwai 2024-07-09 09:56:56 UTC
Thanks!  Now it shows a clear crash stack trace:

[    9.434527] e1000e 0000:00:19.0 eth0: NIC Link is Up 100 Mbps Half Duplex, Flow Control: Rx/Tx
[    9.434545] BUG: scheduling while atomic: kworker/2:0/30/0x00000002
(snip)
[    9.434638] CPU: 2 PID: 30 Comm: kworker/2:0 Kdump: loaded Tainted: G                   n 6.4.0-150600.23.7-default #1 SLE15-SP6 128952646fcb1614c051ed5f88ec9aef64f90f32
[    9.434641] Hardware name: FUJITSU ESPRIMO P520/D3220-A1, BIOS V4.6.5.4 R1.46.0 for D3220-A1x 08/29/2018
[    9.434642] Workqueue: events e1000_watchdog_task [e1000e]
[    9.434662] Call Trace:
[    9.434664]  <TASK>
[    9.434666]  dump_stack_lvl+0x57/0x80
[    9.434670]  __schedule_bug+0x56/0x70
[    9.434674]  __schedule+0x1146/0x1540
[    9.434678]  ? wakeup_preempt+0x29/0x60
[    9.434681]  ? ttwu_do_activate+0x5d/0x1e0
[    9.434683]  ? try_to_wake_up+0x408/0x5e0
[    9.434685]  schedule+0x24/0xb0
[    9.434688]  schedule_hrtimeout_range_clock+0xa8/0x120
[    9.434691]  ? __pfx_hrtimer_wakeup+0x10/0x10
[    9.434696]  usleep_range_state+0x5b/0x90
[    9.434698]  e1000e_read_phy_reg_mdic.part.3+0x7e/0x240 [e1000e 867f7757ca0e3d2299c7f92537e261b634b24512]
[    9.434713]  e1000e_update_stats+0x4c4/0x6e0 [e1000e 867f7757ca0e3d2299c7f92537e261b634b24512]
[    9.434725]  e1000_watchdog_task+0x157/0x890 [e1000e 867f7757ca0e3d2299c7f92537e261b634b24512]
[    9.434739]  ? vfree+0x17b/0x2d0
[    9.434742]  process_one_work+0x226/0x440
[    9.434745]  worker_thread+0x2a/0x3b0
[    9.434748]  ? __pfx_worker_thread+0x10/0x10
[    9.434750]  kthread+0xe1/0x120
[    9.434752]  ? __pfx_kthread+0x10/0x10
[    9.434753]  ret_from_fork+0x2c/0x50
[    9.434756]  </TASK>

And this looks like the bug that was recently fixed (in the upstream commit 387f295cb2150ed164905b648d76dfcbd3621778).  The fix has already been backported to the SLE15-SP6 git branch.

Could you test the latest kernel in the OBS Kernel:SLE15-SP6 repo?
  http://download.opensuse.org/repositories/Kernel:/SLE15-SP6/pool/
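
For reference, "scheduling while atomic" means that code tried to sleep in a context where sleeping is not allowed; in the trace above the sleep (usleep_range_state) is reached from e1000e functions. A minimal sketch of how the module involved can be picked out of such a trace mechanically; the regular expression and function names are assumptions written for this dmesg format only:

```python
import re
from collections import Counter

# Matches frames such as:
#   [    9.434713]  e1000e_update_stats+0x4c4/0x6e0 [e1000e 867f...]
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME = re.compile(
    r"\]\s+(?P<uncertain>\? )?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)[^\]]*\])?"
)

def parse_frames(trace):
    """Return (symbol, module or None, reliable) for each stack frame line."""
    frames = []
    for line in trace.splitlines():
        m = FRAME.search(line)
        if m:
            frames.append((m["symbol"], m["module"], m["uncertain"] is None))
    return frames

def modules_in_trace(trace):
    """Count reliable frames per kernel module."""
    return Counter(mod for _, mod, ok in parse_frames(trace) if mod and ok)
```

Feeding the saved dmesg text to `modules_in_trace()` lists the modules that own reliable frames, which helps to decide which driver's fixes to look for.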
Comment 27 Peter Thoms 2024-07-09 10:00:39 UTC
If I get the command lines to use, I can try.
Comment 28 Peter Thoms 2024-07-16 15:42:57 UTC
It works,

thank you very much!

Command line:
zypper addrepo http://download.opensuse.org/repositories/Kernel:/SLE15-SP6/pool/
zypper refresh
zypper install kernel-default-6.4.0-150600.251.1.g567c8c9.x86_64


Peter