Bug 1214013 - Shutdown did not finish (PC was not powered off, no obvious error logged)
Summary: Shutdown did not finish (PC was not powered off, no obvious error logged)
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Kernel (show other bugs)
Version: Leap 15.5
Hardware: Other Other
: P5 - None : Major (vote)
Target Milestone: ---
Assignee: openSUSE Kernel Bugs
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-05 22:34 UTC by Ulrich Windl
Modified: 2024-05-27 08:46 UTC (History)
2 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Screenshot when shutdown should power off the PC (94.92 KB, image/jpeg)
2023-08-05 22:34 UTC, Ulrich Windl
Details
Screenshot of tty10 when shutdown "is finished" (PDF) (872.78 KB, application/pdf)
2023-08-17 09:11 UTC, Ulrich Windl
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ulrich Windl 2023-08-05 22:34:41 UTC
Created attachment 868645 [details]
Screenshot when shutdown should power off the PC

Maybe the issue is related to an MD-RAID being verified (see Bug 1214004), but the system did not power off.

The last message in journal were:
Aug 05 01:47:01 pc.site systemd[1]: Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Aug 05 01:47:01 pc.site systemd[1]: Stopping Device-mapper event daemon...
Aug 05 01:47:01 pc.site dmeventd[1125]: dmeventd detected break while being idle for 0 second(s), exiting.
Aug 05 01:47:01 pc.site dmeventd[1125]: dmeventd shutting down.
Aug 05 01:47:01 pc.site systemd[1]: dm-event.service: Deactivated successfully.
Aug 05 01:47:01 pc.site systemd[1]: Stopped Device-mapper event daemon.
Aug 05 01:47:01 pc.site systemd[1]: Reached target System Shutdown.
Aug 05 01:47:01 pc.site systemd[1]: Reached target Late Shutdown Services.
Aug 05 01:47:01 pc.site systemd[1]: systemd-poweroff.service: Deactivated successfully.
Aug 05 01:47:01 pc.site systemd[1]: Finished System Power Off.
Aug 05 01:47:01 pc.site systemd[1]: Reached target System Power Off.
Aug 05 01:47:01 pc.site systemd[1]: Shutting down.
Aug 05 01:47:01 pc.site systemd-shutdown[1]: Syncing filesystems and block devices.
Aug 05 01:47:03 pc.site systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Aug 05 01:47:03 pc.site haveged[1265]: haveged: Stopping due to signal 15
Aug 05 01:47:03 pc.site haveged[1265]: haveged starting up
Aug 05 01:47:03 pc.site systemd-journald[1132]: Journal stopped

I'm also adding a screenshot showing the last messages (that would last until I pressed the power button).

Kernel was 5.14.21-150500.55.12-default

The only other thing that looked unusual was this:
...
Aug 05 01:46:59 pc.site systemd[1]: Stopped Record System Boot/Shutdown in UTMP.
Aug 05 01:46:59 pc.site systemd[1]: Stopping Security Auditing Service...
Aug 05 01:46:59 pc.site systemd-cryptsetup[3123]: Device cr_md-uuid-fe1588c4:a2140388:5b4117ba:4e4339b9-part5 is still in use.
Aug 05 01:46:59 pc.site systemd-cryptsetup[3123]: Failed to deactivate: Device or resource busy
Aug 05 01:46:59 pc.site systemd[1]: systemd-cryptsetup@cr_md\x2duuid\x2dfe1588c4:a2140388:5b4117ba:4e4339b9\x2dpart5.service: Control process exited, code=exited, status=1>
Aug 05 01:46:59 pc.site systemd[1]: systemd-cryptsetup@cr_md\x2duuid\x2dfe1588c4:a2140388:5b4117ba:4e4339b9\x2dpart5.service: Failed with result 'exit-code'.
Aug 05 01:46:59 pc.site systemd[1]: Stopped Cryptography Setup for cr_md-uuid-fe1588c4:a2140388:5b4117ba:4e4339b9-part5.
Aug 05 01:46:59 pc.site auditd[1679]: The audit daemon is exiting.
Aug 05 01:46:59 pc.site kernel: kauditd_printk_skb: 50 callbacks suppressed
Aug 05 01:46:59 pc.site kernel: audit: type=1305 audit(1691192819.741:562): op=set audit_pid=0 old=1679 auid=4294967295 ses=4294967295 subj=unconfined res=1
Aug 05 01:46:59 pc.site auditctl[3127]: enabled 0
Aug 05 01:46:59 pc.site auditctl[3127]: failure 1
Aug 05 01:46:59 pc.site auditctl[3127]: pid 0
Aug 05 01:46:59 pc.site auditctl[3127]: rate_limit 0
Aug 05 01:46:59 pc.site auditctl[3127]: backlog_limit 64
Aug 05 01:46:59 pc.site auditctl[3127]: lost 0
Aug 05 01:46:59 pc.site auditctl[3127]: backlog 1
Aug 05 01:46:59 pc.site auditctl[3127]: backlog_wait_time 15000
Aug 05 01:46:59 pc.site auditctl[3127]: backlog_wait_time_actual 0
Aug 05 01:46:59 pc.site auditctl[3127]: No rules
Aug 05 01:46:59 pc.site kernel: audit: type=1305 audit(1691192819.853:563): op=set audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=unconfined res=1
Aug 05 01:46:59 pc.site systemd[1]: auditd.service: Deactivated successfully.
Aug 05 01:46:59 pc.site systemd[1]: Stopped Security Auditing Service.
Aug 05 01:46:59 pc.site systemd[1]: systemd-tmpfiles-setup.service: Deactivated successfully.
Aug 05 01:46:59 pc.site systemd[1]: Stopped Create Volatile Files and Directories.
Aug 05 01:47:00 pc.site systemd[1]: dracut-shutdown.service: Deactivated successfully.
Aug 05 01:47:00 pc.site systemd[1]: Stopped Restore /run/initramfs on shutdown.
Aug 05 01:47:00 pc.site systemd[1]: Stopped target Local File Systems.
Aug 05 01:47:00 pc.site systemd[1]: Unmounting /home...
...

The LUKS device cr_md-uuid-fe1588c4:a2140388:5b4117ba:4e4339b9-part5 contains the PV for the system VG, so it makes no sense trying to deactivate that before unmounting the system filesystems.

Layout is like this:
# pvs
  PV                                                               VG     Fmt  Attr PSize   PFree
  /dev/mapper/cr_md-uuid-fe1588c4:a2140388:5b4117ba:4e4339b9-part5 system lvm2 a--  199,98g 3,88g
# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  system   1   6   0 wz--n- 199,98g 3,88g
# lvs
  LV       VG     Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
  android  system Vwi-aotz--  50,00g thinpool        55,19                                  
  floppies system Vwi-a-tz--   1,00g thinpool        15,18                                  
  home     system -wi-ao----  50,00g                                                        
  root     system -wi-ao----  32,00g                                                        
  swap     system -wi-ao----   4,00g                                                        
  thinpool system twi-aotz-- 110,00g                 25,23  22,76                           

(System was upgraded from Leap 15.4 where shutdown worked)
Comment 1 Takashi Iwai 2023-08-07 14:26:09 UTC
Is the issue reproducible reliably?  If yes, can you verify whether the bug persists when you boot with Leap 15.4 kernel instead?
Comment 2 Ulrich Windl 2023-08-08 05:50:46 UTC
(In reply to Takashi Iwai from comment #1)
> Is the issue reproducible reliably?  If yes, can you verify whether the bug
> persists when you boot with Leap 15.4 kernel instead?

Actually I'm unsure whether at that (shutdown/power-off) point it's systemd handing over to dracut (and then to kernel) to power off, or talking to the kernel directly.
Somehow I think it's not the kernel, but I can try...
So far the issue happened two times (ot of two attempt to power off); that is 100% 8-(
Comment 3 Ulrich Windl 2023-08-10 20:40:06 UTC
Trying with
5.14.21-150400.24.74-default
instead of
5.14.21-150500.55.12-default
power turned off!
Comment 4 Ulrich Windl 2023-08-11 06:28:42 UTC
Also, when back at the official kernel, restart also failed. I saw a "panic-like message" on ttyy10 when nothing seemed to happen at the end of shutdown/restart.
I also noticed that the harddisk light (i.e.: the controller's light on the chassis) was permanently on in that state as if something would be written to (or read from) disk. But as it was not flickering in any way, it could also mean that something had crashed.
I made a photo of the screen, but it's a bit out of focus...
Comment 5 Ulrich Windl 2023-08-17 09:11:17 UTC
Created attachment 868853 [details]
Screenshot of tty10 when shutdown "is finished" (PDF)

The current (newer) kernel did not fix the issue.  However on tty10 I can see a "panic-like" message, unsure where it comes from exactly (I cannot scroll back on tty10).
(I apologize for sending the screenshot as PDF, but the image from the mobile phone was really large, and I have to better tools here to scale it down)
Comment 6 Ulrich Windl 2023-08-17 09:13:28 UTC
(In reply to Ulrich Windl from comment #5)
> (...) and I have to better tools here (...)

Read "(...) _no_ better tools here (...)", sorry.
Comment 7 Takashi Iwai 2023-08-18 14:23:43 UTC
(In reply to Ulrich Windl from comment #5)
> Created attachment 868853 [details]
> Screenshot of tty10 when shutdown "is finished" (PDF)
> 
> The current (newer) kernel did not fix the issue.  However on tty10 I can
> see a "panic-like" message, unsure where it comes from exactly (I cannot
> scroll back on tty10).

Thanks.  It looks like some network stuff.  Does it change the behavior if you turn down the network manually before reboot?
Comment 8 Ulrich Windl 2023-08-25 05:49:37 UTC
I disabled WLAN before reboot, but the effect was the same: The harddisk LED was illuminated (constantly), and the system did not finish restart.
This time I was not able to switch to any virtual console, but the keyboard was still alive (I could toggle num-lock).
Comment 9 Takashi Iwai 2023-08-25 07:33:33 UTC
Hm, then the trace was a red herring?  You can try to turn off the whole network devices, not only WiFi but Ethernet if present, and unload the driver modules manually before the shutdown.  If the shutdown hang still happens, the problem could be something else.
Comment 10 Ulrich Windl 2024-05-24 21:14:28 UTC
Some update has fixed the problem, so it does not happen any more.
Comment 11 Takashi Iwai 2024-05-27 08:46:19 UTC
Good to hear, let's close for now.