Bugzilla – Bug 1222399
snapper rollback fail from sles15 sp6 beta to sles15 sp5
Last modified: 2024-05-07 11:46:11 UTC
Created attachment 874109 [details]
log for sp6 upgrade and snapper rollback

I am testing SLE-15-SP6-PublicBeta-202403 and have run into a snapper rollback problem. I am not sure whether this is the right time to raise a ticket with SUSE.

Here are my test steps:
1. Install my SLES15 SP5 OEM ISO image in the lab.
2. Take a snapshot before upgrading to SP6.
3. Upgrade to SLES15 SP6 PublicBeta-202403 with plain zypper (zypper dup).
4. Reboot the node.
5. Collect a supportconfig file.
6. Roll back to the SP5 snapshot.
7. Reboot the node. The node fails to boot with the error below.

A rollback within the same SP does not have any problem. On my SP5 node, I create a snapshot, make some changes, then roll back to the SP5 snapshot. The rollback succeeds after the node reboots. After I upgrade to SP6, I create a snapshot, make some changes, then roll back to the SP6 snapshot. That rollback also succeeds after the node reboots.

The problem only occurs when I roll back to the SP5 snapshot after performing the SP6 upgrade.
Created attachment 874110 [details] screen shot of boot up failure
Created attachment 874111 [details] supportconfig from the system after sp6 upgrade
I have tried the SP6 RC1 version, which was released on 3-28. It has the same problem.
That error message means nothing to me and is not from snapper. I also cannot see what should have failed during the actual rollback. Looks more like a bootloader problem.
Hi Simon,

The issue seems to stem from relocator.mod, part of a memory manager that relocates the kernel into a designated and isolated memory region. This is particularly significant for legacy BIOS booting, where control over the handover process is essential to ensure that these objects are loaded into memory locations that do not conflict with other vital system areas, such as the bootloader itself, regions reserved or used by the firmware, or other reserved hardware areas.

It's important to emphasize "legacy BIOS" because relocator.mod is an external module loaded from the btrfs filesystem due to space constraints in the disk's MBR.

The error appears to be caused by an incompatibility between the core image and the modules. Following a rollback and reboot, the modules on the filesystem are reverted to grub 2.06 (from service pack 5), while the core.img embedded in the disk's MBR remains at version 2.12 (from service pack 6). Like the Linux kernel, grub does not ensure ABI compatibility between major releases. A plausible explanation is that grub_mm_region_t, which relocator.mod operates on, has an altered definition in version 2.12 [1]. This change leads to fields being read incorrectly, which crashes grub. It also explains why the issue is not reproducible when the service pack does not change.

To address potential grub ABI breakage after rolling back the root filesystem, it's essential to exempt the directory containing the installed grub modules from the rollback process. This ensures that the grub modules are not reverted inadvertently when the root subvolume is rolled back.

Could you please verify the layout of your btrfs subvolumes by checking with `btrfs sub list /`? Here are the expected paths:

> ID 264 gen 594550 top level 256 path `@/boot/grub2/x86_64-efi`
> ID 265 gen 180107 top level 256 path `@/boot/grub2/i386-pc`

Also, please confirm the following entries in your `/etc/fstab`:

> UUID=.... /boot/grub2/x86_64-efi btrfs subvol=/@/boot/grub2/x86_64-efi 0 0
> UUID=.... /boot/grub2/i386-pc btrfs subvol=/@/boot/grub2/i386-pc 0 0

This configuration should prevent grub version mismatches and maintain stability.

[1] https://git.savannah.gnu.org/cgit/grub.git/commit/?id=052e6068be
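The `/etc/fstab` check requested above could be scripted roughly as follows. This is only a sketch, not part of any SUSE tooling: the helper name `missing_grub_mounts` is made up for illustration, and it assumes the usual whitespace-separated fstab fields (device, mount point, type, options).

```python
# Sketch: report which grub module directories have no dedicated btrfs
# subvolume entry in fstab. The helper name is hypothetical; the expected
# mount points are the two named in the comment above.

EXPECTED_MOUNT_POINTS = ("/boot/grub2/i386-pc", "/boot/grub2/x86_64-efi")


def missing_grub_mounts(fstab_text, expected=EXPECTED_MOUNT_POINTS):
    """Return the expected mount points that lack a btrfs subvol= entry."""
    found = set()
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        has_subvol = any(opt.startswith("subvol=") for opt in options.split(","))
        if fs_type == "btrfs" and has_subvol:
            found.add(mount_point)
    return [mp for mp in expected if mp not in found]
```

Calling `missing_grub_mounts(open("/etc/fstab").read())` on the affected system would list the grub module directories that are still part of the root subvolume and would therefore be reverted by a rollback.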
(In reply to simon wang from comment #2)
> Created attachment 874111 [details]
> supportconfig from the system after sp6 upgrade

Hi Simon,

Apparently the btrfs subvolumes are not laid out properly for snapper rollback [1] [2]. See my previous comment: the subvolumes for the grub modules are missing!

Are you using a customized image for testing? IMHO this should not happen with an official SLES image.

[1]
> #==[ Command ]======================================#
> # /sbin/btrfs subvolume list /
> ID 256 gen 10 top level 5 path @
> ID 257 gen 131 top level 256 path @/.snapshots
> ID 258 gen 190 top level 257 path @/.snapshots/1/snapshot
> ID 259 gen 190 top level 256 path @/var/log
> ID 260 gen 182 top level 256 path @/var/tmp
> ID 268 gen 128 top level 257 path @/.snapshots/2/snapshot

[2]
> #==[ Configuration File ]===========================#
> # /etc/fstab
> UUID=6689b774-8a7e-40bf-927f-e93f721c4b64 / btrfs defaults,acl,fatal_errors=panic,compress=no,commit=5 0 1
> UUID=6689b774-8a7e-40bf-927f-e93f721c4b64 /.snapshots btrfs defaults,acl,fatal_errors=panic,compress=no,commit=5,subvol=@/.snapshots 0 0
> UUID=6689b774-8a7e-40bf-927f-e93f721c4b64 /var/log btrfs defaults,acl,fatal_errors=panic,compress=no,commit=5,subvol=@/var/log 0 0
> UUID=6689b774-8a7e-40bf-927f-e93f721c4b64 /var/tmp btrfs defaults,acl,fatal_errors=panic,compress=no,commit=5,subvol=@/var/tmp 0 0

Thanks.
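For illustration, the comparison made here between the supportconfig listing and the expected grub module subvolumes can be sketched in a few lines. The helper names are hypothetical; the listing is copied from the supportconfig output quoted in this comment, and the expected paths are the two named in the previous comment.

```python
# Sketch: find expected grub module subvolumes missing from the output of
# `btrfs subvolume list /`. Helper names are hypothetical.

EXPECTED_SUBVOLUMES = ("@/boot/grub2/i386-pc", "@/boot/grub2/x86_64-efi")

# Listing quoted from the supportconfig in this comment.
SUPPORTCONFIG_LISTING = """\
ID 256 gen 10 top level 5 path @
ID 257 gen 131 top level 256 path @/.snapshots
ID 258 gen 190 top level 257 path @/.snapshots/1/snapshot
ID 259 gen 190 top level 256 path @/var/log
ID 260 gen 182 top level 256 path @/var/tmp
ID 268 gen 128 top level 257 path @/.snapshots/2/snapshot
"""


def listed_subvolume_paths(listing):
    """Extract the subvolume path from each line of the listing."""
    paths = []
    for line in listing.splitlines():
        # Each line ends with "path <subvolume path>".
        _, sep, path = line.partition(" path ")
        if sep:
            paths.append(path.strip())
    return paths


def missing_grub_subvolumes(listing, expected=EXPECTED_SUBVOLUMES):
    """Return the expected subvolumes that do not appear in the listing."""
    present = set(listed_subvolume_paths(listing))
    return [sub for sub in expected if sub not in present]


print(missing_grub_subvolumes(SUPPORTCONFIG_LISTING))
# prints ['@/boot/grub2/i386-pc', '@/boot/grub2/x86_64-efi']
```

On a legacy BIOS system such as this one, the i386-pc entry is the one that matters for the failure described in this bug.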
I added the /boot/grub2/i386-pc subvolume to my OEM image. This time the rollback from SP6 to the SP5 snapshot works without any problem. Thanks.

I have a question about the grub 2.12 (SP6) content in /boot/grub2/i386-pc. After the rollback, would there be any problem if I perform a "zypper update" on the SP5 system? If grub 2.06 (SP5) gets updated, would it replace the contents of /boot/grub2/i386-pc? Since the contents of /boot/grub2/i386-pc would then be grub 2.06, will the next reboot fail?
(In reply to simon wang from comment #8)
> I add the /boot/grub2/i386-pc subvolumn in my oem image.
> This time we don't have any problem of the rollback from SP6 to SP5 snapshot.
> Thanks.

Great. Thanks for the update.

> Here I have a question related to the grup 2.12(SP6) in /boot/grub2/i386-pc.
> After the rollback, would there be any problem If I perform a "zypper
> update" for the SP5 system?

No, you should not run into any problem. However, it is recommended to verify that the i386-pc subvolume is mounted properly on your SP5 system. This is important because you may roll back to an SP5 snapshot that is too old and predates your fix in /etc/fstab. Use this command to check:

> cat /proc/self/mounts | grep i386-pc

> If the grub 2.06(SP5) got updated, would it replace the contents in
> /boot/grub2/i386-pc?

Yes, if there is a grub2 update, the content will be replaced by the new package. Along with the boot directory update, the MBR will also be updated, so the ABI stays in sync.

> Since the content of /boot/grub2/i386-pc are now grub
> 2.06, will the next reboot fail then?

No, it won't. It is just a plain reboot using grub 2.06, the same version you used in SP5.

Thanks.
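If the mount check needs to run as part of an automated test, the `grep` above could be replaced by something along these lines. This is a minimal sketch: the function name and its default argument are illustrative assumptions, and it matches the mount point field exactly rather than searching for a substring.

```python
# Sketch: programmatic variant of `cat /proc/self/mounts | grep i386-pc`.
# The function name and default mount point are illustrative assumptions.

def is_mounted(mounts_text, mount_point="/boot/grub2/i386-pc"):
    """Return True if mount_point appears as a mount target."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field of each /proc/self/mounts line is the mount point.
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False
```

On the rolled-back system it would be called as `is_mounted(open("/proc/self/mounts").read())`. A result of `False` would suggest that the snapshot predates the /etc/fstab fix.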
The issue is resolved. I believe we can close this ticket. Thanks.