Bugzilla – Bug 1174788
Bootloader configuration breaks when PackageKit update is interrupted in the middle
Last modified: 2021-04-23 16:20:40 UTC
Created attachment 840272 [details]
Scary error screen I saw

After yesterday's Tumbleweed update, my laptop was rendered unbootable. It displayed a scary Windows-style blue error screen saying that the system was broken. There is no Windows partition on this machine, so it must be some kind of backup EFI error screen or something.

I was able to boot into the EFI shell of a recovery USB flash disk I had lying around and then boot into my openSUSE TW OS by entering the following in the EFI shell:

    fs0:
    cd EFI
    cd opensuse
    grubX64.efi

I then saw the normal openSUSE kernel chooser boot manager and was able to boot normally. So this bootloader works just fine. However, for some reason it is no longer used automatically after the update.
Created attachment 840273 [details] What I did to boot it manually
Created attachment 840274 [details] `efibootmgr` output (annotated for clarity)
It might be that something removed the opensuse boot entry. To add it back (from the recovery USB or from a booted TW):

    efibootmgr --create --disk /dev/sda --label opensuse --part 1 --loader '\EFI\opensuse\grubx64.efi'

--disk /dev/sda
    Replace /dev/sda with the actual device node (probably /dev/nvme0n1 or something like that).
--part 1
    The EFI partition, usually the first partition on the disk (in my case /dev/sda1); this is the partition containing the loader (grubx64.efi).
--loader '\EFI\opensuse\grubx64.efi'
    Must be written like that, with the backslashes and the single quotes (I can't remember whether double quotes work).

See efibootmgr --help for more details.

From the screenshot I noticed that the fw dir has been modified recently. It could well be that a firmware update caused the opensuse entry to get removed (I've had that happen before, and also sometimes when I reset the firmware to defaults using the button/jumper on my PC motherboard).
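Before re-creating the entry it may be worth checking whether it is really missing. The following is a minimal sketch, not a tested recovery tool: `suggest_opensuse_entry` is a hypothetical helper name, the disk and partition arguments are assumptions you must adapt, and the function only prints the command from the comment above rather than running it.

```shell
#!/bin/sh
# Sketch only: suggest_opensuse_entry is a hypothetical helper, not part of efibootmgr.
# It reads `efibootmgr` output on stdin and PRINTS a create command when no
# entry labelled "opensuse" is found. It never modifies any EFI variable.
#   $1 = disk device node (assumption: the disk that holds the EFI partition)
#   $2 = EFI partition number (defaults to 1)
suggest_opensuse_entry() {
    disk=$1
    part=${2:-1}
    # Entry lines look like "Boot0001* opensuse"; the "*" marks an active entry.
    if grep -Eq '^Boot[0-9A-Fa-f]{4}[* ] *opensuse'; then
        echo "opensuse entry present"
    else
        # Single quotes in the printed command protect the backslashes in the loader path.
        printf '%s\n' "efibootmgr --create --disk $disk --part $part --label opensuse --loader '\\EFI\\opensuse\\grubx64.efi'"
    fi
}
```

Typical use would be `efibootmgr | suggest_opensuse_entry /dev/sda 1`, then reviewing the printed command before running it as root.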
Or, after you boot your TW installation, you could try running the following (IIRC that worked too, but since I can't find any documentation for this command, the USE AT YOUR OWN RISK disclaimer is in effect here :)):

    /sbin/update-bootloader --reinit

FWIW, this command is used in the post-install rpm scriptlets of the grub2-x86_64-efi package; run `rpm -q --scripts grub2-x86_64-efi` to see the whole storyboard.
Many thanks, visiting the YaST bootloader page and closing it again re-added the correct EFI entry. There was indeed a firmware upgrade recently, so maybe that caused it? In which case, whose fault would that be? The firmware vendor? fwupd itself?
My "guess" would be the firmware vendor; but I am not an expert on laptop firmware updates or fwupd. :)
This just happened again after another system update, but there was no firmware update involved. So I think we can rule that out.
(In reply to Nathaniel Graham from comment #7)
> This just happened again after another system update, but there was no
> firmware update involved. So I think we can rule that out.

That's weird, because those updates only add or modify entries; I'm not aware of any code that removes entries.

Does the issue reappear after "update-bootloader --reinit" (it's safe to run; if it screws something up, we have a bigger problem)? If not, does it reappear after "zypper in --force grub2-x86_64-efi shim"?
It seems that "Secure Boot support" was disabled during installation. I suppose Secure Boot is also disabled in the firmware; otherwise grubx64.efi would not be loaded. It's strange to me that "Linux-Firmware-Updater" still exists while the boot entry for grubx64.efi is gone. Could you post "/var/log/pbl.log"? That file may provide some hints. I wonder whether the firmware removed or ignored a boot entry it doesn't like. Lenovo did that before...
(In reply to Fabian Vogt from comment #8)
> Does the issue reappear after "update-bootloader --reinit" (it's safe to
> run, if it screws something up we have a bigger problem)?

No, the problem does not re-appear after running this command.

> If not, does it
> reappear after "zypper in --force grub2-x86_64-efi shim"?

Nope, not this one either. :p
Created attachment 841183 [details] pbl.log file
Per pbl.log, "update-bootloader --reinit" was run at "2020-08-27 17:56:56", and that is the last log entry before comment #7. However, the issue seems fixed after running the command again. Would you mind posting the output of "efibootmgr" again? I just want to make sure the boot entry was restored.
$ sudo efibootmgr
[sudo] password for root:
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0001,001F,0019,001B,001C,001D,001E,0020,001A,0000,0021,0022,0023,0024,0002
Boot0000* Windows Boot Manager
Boot0001* opensuse
Boot0002* Linux-Firmware-Updater
Boot0010 Setup
Boot0011 Boot Menu
Boot0012 Diagnostic Splash Screen
Boot0013 Lenovo Diagnostics
Boot0014 Regulatory Information
Boot0015 ThinkShield secure wipe
Boot0016 Startup Interrupt Menu
Boot0017 Rescue and Recovery
Boot0018 MEBx Hot Key
Boot0019* USB CD
Boot001A* USB FDD
Boot001B* NVMe0
Boot001C* NVMe1
Boot001D* ATA HDD0
Boot001E* ATA HDD1
Boot001F* USB HDD
Boot0020* PXE BOOT
Boot0021* HTTPS BOOT
Boot0022* LENOVO CLOUD
Boot0023 Other CD
Boot0024 Other HDD
Boot0025* IDER BOOT CDROM
Boot0026* IDER BOOT Floppy
Boot0027* ATA HDD
Boot0028* ATAPI CD
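For anyone comparing their own output against the listing above, the check of interest is whether the first number in BootOrder maps to the opensuse entry. Here is a small sketch of that lookup; `first_boot_label` is a hypothetical helper name, and it assumes the line layout shown in this report (newer efibootmgr versions may append a tab-separated device path, which the sketch strips).

```shell
#!/bin/sh
# Sketch only: first_boot_label is a hypothetical helper.
# Reads `efibootmgr` output on stdin and prints the label of the entry that
# comes first in BootOrder (prints nothing if that entry does not exist).
first_boot_label() {
    awk '
        /^BootOrder:/ {
            # "BootOrder: 0001,001F,..." -> keep the first entry number
            split($2, order, ",")
            first = order[1]
        }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            num = substr($1, 5, 4)          # "Boot0001*" -> "0001"
            label = $0
            sub(/^[^ ]+ +/, "", label)      # drop the "Boot0001* " prefix
            sub(/\t.*/, "", label)          # drop a trailing device path, if any
            labels[num] = label
        }
        END { if (first in labels) print labels[first] }
    '
}
```

With the output above, `sudo efibootmgr | first_boot_label` should report the opensuse entry; on a broken system it would report something else or nothing at all.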
OK, the boot entry is generated correctly now, though it's still unclear why it went missing before...
Hit it again after I attempted to update my system using Discover. Discover showed me a message saying that the PackageKit daemon had crashed. I quit Discover and finished the update with zypper (`sudo zypper dup`), and after rebooting I hit the issue again.
I have now had this happen several times, and I believe I may have managed to identify the proximate cause: it happens when big PackageKit updates are interrupted in the middle. I deliberately did one with `pkcon update` on the command line, killed it in the middle, and then started it again, and got the problem to happen. Since Discover uses PackageKit, whenever the update gets interrupted in the middle or Discover crashes during an update, this will happen. There is probably a certain package or update action that breaks in this situation, but I have not been able to identify exactly which one it is.
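To help narrow down when the entry disappears, one option is to save `efibootmgr` output before and after the interrupted update and compare the two. The sketch below is only an illustration of that comparison: `boot_labels` and `lost_entries` are hypothetical helper names, and the file names in the usage comments are placeholders.

```shell
#!/bin/sh
# Sketch only: boot_labels and lost_entries are hypothetical helpers.

# Print the labels of all boot entries found in `efibootmgr` output on stdin,
# one per line, sorted.
boot_labels() {
    # Strip the "BootXXXX* " prefix, then any tab-separated device path.
    sed -n 's/^Boot[0-9A-Fa-f]\{4\}[* ] *//p' | cut -f1 | sort
}

# Print the labels present in snapshot file $1 but missing from snapshot file $2.
lost_entries() {
    before=$(mktemp)
    after=$(mktemp)
    boot_labels < "$1" > "$before"
    boot_labels < "$2" > "$after"
    comm -23 "$before" "$after"
    rm -f "$before" "$after"
}

# Possible use while reproducing the problem (placeholder file names):
#   efibootmgr > before.txt
#   pkcon update          # interrupt it partway through, then run it again
#   efibootmgr > after.txt
#   lost_entries before.txt after.txt
```

If the second snapshot is taken at several points (right after the interruption, after the rerun, after reboot), the step at which "opensuse" first shows up in the output of `lost_entries` would point at the responsible action.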