Bug 1220177

Summary: [Silent Error] transactional-update fails to rebuild kdump initrd due to kdump packaging issue
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Pavin Joseph <me>
Component: Basesystem
Assignee: Ignaz Forster <iforster>
Status: NEW ---
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P2 - High
CC: iforster, jbohac, kukuk
Version: Current
Target Milestone: ---
Hardware: x86-64
OS: openSUSE Tumbleweed
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Pavin Joseph 2024-02-22 07:28:36 UTC
Overview:

transactional-update (TU) silently fails to rebuild kdump initrd because /var/lib/kdump is not writable in the snapshot.
I'm running TU in a normal read-write Tumbleweed installation to perform dup.
System and packages are fully updated as of Feb 22 2024. No failed systemd units or priority 3 errors in the journal.

Tumbleweed version: 20240220
transactional-update version: 4.5.0-2.3
tukit version: 4.5.0-2.3

Steps to Reproduce:
sudo transactional-update <kdump | dup>

Logs below are from calling dup when a new kernel was installed, which requires rebuilding the initrd (success), the bootloader (success), and the kdump initrd (silent fail).

Actual results:
TU fails to regenerate kdump initrd as /var/lib/kdump is not writable in the snapshot.

Logs:
Feb 22 03:14:07 suse-pc transactional-update[12262]: Trying to rebuild kdump initrd
Feb 22 03:14:07 suse-pc transactional-update[24392]: 2024-02-22 03:14:07 tukit 4.5.0 started
Feb 22 03:14:07 suse-pc transactional-update[24392]: 2024-02-22 03:14:07 Options: --discard call 274 /sbin/mkdumprd
Feb 22 03:14:07 suse-pc transactional-update[24392]: 2024-02-22 03:14:07 Executing `/sbin/mkdumprd`:
Feb 22 03:14:07 suse-pc transactional-update[24392]: /var/lib/kdump not writable, not regenerating initrd.
Feb 22 03:14:07 suse-pc transactional-update[24392]: 2024-02-22 03:14:07 Application returned with exit status 0.
Feb 22 03:14:07 suse-pc transactional-update[24392]: 2024-02-22 03:14:07 Transaction completed.
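
Because mkdumprd exits with status 0 here, the skipped rebuild is only visible in the message text. A minimal sketch of a wrapper that treats that message as a failure; run_checked is a hypothetical helper name, not part of tukit or kdump, and the matched string is taken from the log above:

```shell
#!/bin/sh
# Sketch only: run_checked is a hypothetical helper, not part of tukit or kdump.
# It runs a command, echoes its output, and returns non-zero if the command
# failed OR printed the "not regenerating initrd" message seen in the log.
run_checked() {
    # Capture stdout and stderr together and remember the real exit status.
    output=$("$@" 2>&1) && status=0 || status=$?
    printf '%s\n' "$output"
    [ "$status" -eq 0 ] || return "$status"
    case $output in
        *"not regenerating initrd"*) return 1 ;;
    esac
    return 0
}

# Example (needs root on a real system):
#   run_checked /sbin/mkdumprd || echo "kdump initrd was NOT rebuilt" >&2
```

Matching on message text is fragile; a non-zero exit status from mkdumprd itself would be the more robust signal.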

There are also other warnings about /var/lib/... contents that were changed in the snapshot but are not propagated to the new system.

Logs:
Feb 22 03:14:18 suse-pc transactional-update[12262]: Warning: The following files were changed in the snapshot, but are shadowed by
Feb 22 03:14:18 suse-pc transactional-update[12262]: other mounts and will not be visible to the system:
Feb 22 03:14:18 suse-pc transactional-update[12262]: /.snapshots/274/snapshot/var/lib/YaST2/hooks/README.md
Feb 22 03:14:18 suse-pc transactional-update[12262]: /.snapshots/274/snapshot/var/lib/systemd/catalog/database
Feb 22 03:14:18 suse-pc transactional-update[12262]: /.snapshots/274/snapshot/var/lib/systemd/rpm/container-machines_subvol

Expected results:
TU successfully regenerates the kdump initrd because /usr/sbin/mkdumprd does not rely on the /var/lib/kdump/ directory for updates. I was told by the TU developer in the GitHub issue (https://github.com/openSUSE/transactional-update/issues/119#issuecomment-1958826104) that this is a packaging issue, probably related to kdump.
Comment 1 Thorsten Kukuk 2024-02-22 08:17:22 UTC
/var is not part of the snapshot, so allowing packages to install something there means the update is no longer atomic or transactional, and running services can see that an update is in progress. Those are exactly the cases that transactional-update was created to avoid.
Comment 2 Ignaz Forster 2024-03-07 14:54:49 UTC
Also see https://github.com/openSUSE/transactional-update/issues/119 for most of the discussion.
Comment 3 Jiri Bohac 2024-03-07 15:52:56 UTC
quoting the github discussion...

Laenion writes:
> That's not how it's supposed to work: The kdump initrd should be regenerated during boot right before it is loaded, so you only need to reboot once. If that wouldn't be working that would indeed be a bug...

I second this - this is how it's supposed to work. The only reason to regenerate the initrd prior to the next boot is
to make kdump-early.service work. This just allows the dump to be functional earlier during the boot. On a TU system this
is expected to fail, and the initrd is expected to be generated during the start of kdump.service.

pavinjosdev writes:
> Would using kexec reboot cause any issues with this process? I set it up as per the Suse docs.

laenion writes:
> Yes, that would indeed result in the behavior you see.

Why is this expected to fail with kexec?
Comment 4 Pavin Joseph 2024-03-08 04:27:23 UTC
(In reply to Jiri Bohac from comment #3)
> laenion writes:
> > Yes, that would indeed result in the behavior you see.
> 
> Why is this expected to fail with kexec?

Jiri, the issue I was having was twofold:
1. kdump initrd fails to be regenerated by transactional-update (TU)
2. kexec reboots into the old kernel

I believe Ignaz (laenion) was referring to the latter of the two when he made that comment. That is, as the kernel update was performed within the snapshot, kexec reboot on the currently running system would still use the old kernel to kexec into.

He reopened the GitHub issue because TU has a kexec reboot config option that is currently not working as expected.

The workaround I found was to ask TU to apply the new snapshot to the currently running system so it can see the new kernel, and then update kdump initrd (for the new kernel) and kexec reboot (using the new kernel).
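
The workaround can be sketched as the sequence below. The exact commands are an assumption, since they are not listed in this comment: transactional-update apply for applying the snapshot, /sbin/mkdumprd as seen in the logs, and systemctl kexec for the kexec reboot. The run and apply_and_kexec names and the DRY_RUN variable are hypothetical; by default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch only. Commands are printed unless DRY_RUN=0 is set, because the
# real sequence needs root and ends by rebooting the machine.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

apply_and_kexec() {
    # 1. Make the new snapshot (and its kernel) visible to the running system.
    run transactional-update apply || return 1
    # 2. Rebuild the kdump initrd for the new kernel (same binary as in the logs).
    run /sbin/mkdumprd || return 1
    # 3. kexec into the new kernel; assumes kexec reboot is already set up.
    run systemctl kexec
}
```

Calling apply_and_kexec with the default settings lists the three steps without changing anything; setting DRY_RUN=0 as root would execute them.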