Bug 1174788 - Bootloader configuration breaks when PackageKit update is interrupted in the middle
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Bootloader
Version: Current
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: Gary Ching-Pang Lin
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-08-01 00:51 UTC by Nathaniel Graham
Modified: 2021-04-23 16:20 UTC
CC List: 6 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Scary error screen I saw (1.96 MB, image/jpeg)
2020-08-01 00:51 UTC, Nathaniel Graham
What I did to boot it manually (2.39 MB, image/jpeg)
2020-08-01 00:52 UTC, Nathaniel Graham
`efibootmgr` output (annotated for clarity) (900 bytes, text/x-log)
2020-08-01 00:52 UTC, Nathaniel Graham
pbl.log file (75.78 KB, text/x-log)
2020-08-28 21:45 UTC, Nathaniel Graham

Description Nathaniel Graham 2020-08-01 00:51:57 UTC
Created attachment 840272
Scary error screen I saw

After yesterday's Tumbleweed update, my laptop was rendered unbootable. It displayed a scary Windows-style blue error screen saying that the system was broken. There is no Windows partition on this machine, so it must be some kind of backup EFI error screen or something.

I was able to boot into the EFI shell of a recovery USB flash disk I had lying around and then boot into my openSUSE TW OS by doing the following in the EFI shell:

fs0:
cd EFI
cd opensuse
grubX64.efi

I then saw the normal openSUSE kernel chooser boot manager and was able to boot normally, so this bootloader works just fine. However, for some reason it is no longer used automatically after the update.
Comment 1 Nathaniel Graham 2020-08-01 00:52:24 UTC
Created attachment 840273
What I did to boot it manually
Comment 2 Nathaniel Graham 2020-08-01 00:52:54 UTC
Created attachment 840274
`efibootmgr` output (annotated for clarity)
Comment 3 Ahmad Samir 2020-08-01 11:13:13 UTC
It might be that something removed the opensuse boot entry. To add it back (from the recovery usb or from a booted TW):
efibootmgr --create --disk /dev/sda --label opensuse --part 1 --loader '\EFI\opensuse\grubx64.efi'

--disk /dev/sda
replace /dev/sda with the actual device node (probably /dev/nvme0n1 or something like that).

--part 1
the EFI partition, usually the first partition on the disk (in my case /dev/sda1); this is the partition containing the loader (grubx64.efi).

--loader '\EFI\opensuse\grubx64.efi'
must be written exactly like that, with the backslashes and the single quotes (I can't remember whether double quotes work or not).

See efibootmgr --help for more details.

From the screenshot I noticed that the fw dir has been modified recently; it could well be that a firmware update caused the opensuse entry to get removed (I've had that happen before, and also sometimes when I reset the firmware to defaults using the button/jumper on my PC motherboard).
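
A minimal sketch of how the --disk and --part values relate to the partition device node, assuming the usual Linux naming (/dev/sda1 for SATA/USB disks, /dev/nvme0n1p1 for NVMe). `esp_boot_cmd` is a hypothetical helper name, not part of efibootmgr; it only prints the command so it can be reviewed before being run as root.

```shell
# esp_boot_cmd PARTITION_DEVICE  (hypothetical helper, for illustration only)
# Prints the efibootmgr command from this comment for the given EFI partition.
esp_boot_cmd() {
    local part_dev="$1" disk part
    case $part_dev in
        /dev/nvme*n*p[0-9]*|/dev/mmcblk*p[0-9]*)
            # NVMe/eMMC naming: /dev/nvme0n1p1 -> disk /dev/nvme0n1, partition 1
            disk=${part_dev%p[0-9]*}
            part=${part_dev##*p}
            ;;
        /dev/sd[a-z]*[0-9])
            # SCSI/SATA/USB naming: /dev/sda1 -> disk /dev/sda, partition 1
            disk=${part_dev%%[0-9]*}
            part=${part_dev#"$disk"}
            ;;
        *)
            echo "unrecognized partition device: $part_dev" >&2
            return 1
            ;;
    esac
    # The loader path is passed as a printf argument so the backslashes stay literal.
    printf "efibootmgr --create --disk %s --label opensuse --part %s --loader '%s'\n" \
        "$disk" "$part" '\EFI\opensuse\grubx64.efi'
}
```

For example, `esp_boot_cmd /dev/nvme0n1p1` prints the command with `--disk /dev/nvme0n1` and `--part 1`. The label and loader path are the ones used above; adjust them if your setup differs.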
Comment 4 Ahmad Samir 2020-08-01 11:26:46 UTC
Or after you boot your TW installation, you could try running (IIRC that worked too, but since I can't find any documentation for this command, the use AT YOUR OWN RISK disclaimer is in effect here :)):
/sbin/update-bootloader --reinit

FWIW, this command is used in the post-install rpm scriptlets for the grub2-x86_64-efi package; run `rpm -q --scripts grub2-x86_64-efi` to see the whole storyboard.
Comment 5 Nathaniel Graham 2020-08-01 20:12:11 UTC
Many thanks, visiting the YaST bootloader page and closing it again re-added the correct EFI entry.

There was indeed a firmware upgrade recently, so maybe that caused it? In which case, whose fault would that be? The firmware vendor? fwupd itself?
Comment 6 Ahmad Samir 2020-08-01 20:21:15 UTC
My "guess" would be the firmware vendor; but I am not an expert on laptop firmware updates or fwupd. :)
Comment 7 Nathaniel Graham 2020-08-28 00:12:11 UTC
This just happened again after another system update, but there was no firmware update involved. So I think we can rule that out.
Comment 8 Fabian Vogt 2020-08-28 08:08:33 UTC
(In reply to Nathaniel Graham from comment #7)
> This just happened again after another system update, but there was no
> firmware update involved. So I think we can rule that out.

That's weird, because those updates only add or modify entries; I'm not aware of any code that removes entries.

Does the issue reappear after "update-bootloader --reinit" (it's safe to run, if it screws something up we have a bigger problem)? If not, does it reappear after "zypper in --force grub2-x86_64-efi shim"?
Comment 9 Gary Ching-Pang Lin 2020-08-28 08:43:15 UTC
It seems that "Secure Boot support" was disabled during installation. I suppose that Secure Boot is disabled in the firmware; otherwise grubx64.efi wouldn't be loaded.

It's strange to me that "Linux-Firmware-Updater" still exists while the boot entry for grubx64.efi is gone.

Could you post "/var/log/pbl.log"? That file may provide some hints.

I wonder if the firmware removed/ignored a boot entry it doesn't like. Lenovo has done that before...
Comment 10 Nathaniel Graham 2020-08-28 21:43:42 UTC
(In reply to Fabian Vogt from comment #8)
> Does the issue reappear after "update-bootloader --reinit" (it's safe to
> run, if it screws something up we have a bigger problem)?
No, the problem does not re-appear after running this command.

> If not, does it
> reappear after "zypper in --force grub2-x86_64-efi shim"?
Nope, not this one either. :p
Comment 11 Nathaniel Graham 2020-08-28 21:45:23 UTC
Created attachment 841183
pbl.log file
Comment 12 Gary Ching-Pang Lin 2020-08-31 03:32:32 UTC
Per pbl.log, "update-bootloader --reinit" was run at "2020-08-27 17:56:56", and that is the last log entry before comment#7. However, the issue seems to be fixed after running the command again.

Would you mind posting the output of "efibootmgr" again? I just want to check whether the boot entry was restored.
Comment 13 Nathaniel Graham 2020-08-31 04:04:06 UTC
$ sudo efibootmgr
[sudo] password for root: 
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0001,001F,0019,001B,001C,001D,001E,0020,001A,0000,0021,0022,0023,0024,0002
Boot0000* Windows Boot Manager
Boot0001* opensuse
Boot0002* Linux-Firmware-Updater
Boot0010  Setup
Boot0011  Boot Menu
Boot0012  Diagnostic Splash Screen
Boot0013  Lenovo Diagnostics
Boot0014  Regulatory Information
Boot0015  ThinkShield secure wipe
Boot0016  Startup Interrupt Menu
Boot0017  Rescue and Recovery
Boot0018  MEBx Hot Key
Boot0019* USB CD
Boot001A* USB FDD
Boot001B* NVMe0
Boot001C* NVMe1
Boot001D* ATA HDD0
Boot001E* ATA HDD1
Boot001F* USB HDD
Boot0020* PXE BOOT
Boot0021* HTTPS BOOT
Boot0022* LENOVO CLOUD
Boot0023  Other CD
Boot0024  Other HDD
Boot0025* IDER BOOT CDROM
Boot0026* IDER BOOT Floppy
Boot0027* ATA HDD
Boot0028* ATAPI CD
Comment 14 Gary Ching-Pang Lin 2020-08-31 05:01:42 UTC
OK, the boot entry has been generated correctly now, though it's still unclear why it disappeared before...
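
A rough sketch of the same check done mechanically, assuming efibootmgr output in the format shown in comment 13 (newer versions may append a device path after a tab, which is ignored here). `boot_entry_status` is a hypothetical helper name, not an existing tool.

```shell
# boot_entry_status LABEL  (hypothetical helper, for illustration only)
# Reads `efibootmgr` output on stdin and prints one of:
#   first    - an entry with this label exists and is first in BootOrder
#   present  - the entry exists but is not first in BootOrder
#   missing  - no entry with this label
boot_entry_status() {
    awk -F'\t' -v label="$1" '
        /^BootOrder:/ {
            sub(/^BootOrder: */, "")        # keep only the comma-separated list
            split($0, order, ",")
            first = order[1]
        }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            name = $1                                  # text before any tab
            sub(/^Boot[0-9A-Fa-f]+\*? +/, "", name)    # drop the "Boot0001* " prefix
            if (name == label) { found = 1; num = substr($1, 5, 4) }
        }
        END {
            if (!found)            print "missing"
            else if (num == first) print "first"
            else                   print "present"
        }
    '
}
```

Run as `sudo efibootmgr | boot_entry_status opensuse`. With the output from comment 13 this prints `first`; when the bug hits, the expectation would be `missing`.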
Comment 15 Nathaniel Graham 2020-10-04 19:21:28 UTC
Hit it again after I attempted to update my system using Discover. Discover showed me a message that the PackageKit daemon had crashed. I quit Discover and finished the update with zypper (`sudo zypper dup`), and after rebooting I hit the issue again.
Comment 16 Nathaniel Graham 2021-04-23 16:20:40 UTC
I have now had this happen several times, and I believe I may have managed to identify the proximate cause: it happens when big PackageKit updates are interrupted in the middle. I deliberately did one with `pkcon update` on the command line, killed it in the middle, started it again, and got the problem to happen. Since Discover uses PackageKit, whenever the update gets interrupted in the middle or Discover crashes during an update, this will happen.

There is probably a certain package or update action that breaks in this situation, but I have not been able to identify exactly which one it is.
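
One way to narrow this down might be to save the efibootmgr output before and after each update attempt and compare the two. The following is only a sketch under that assumption; `lost_boot_entries` is a hypothetical helper name, and it compares entries by label, so an entry that is re-created under a different Boot number is not reported.

```shell
# lost_boot_entries BEFORE_FILE AFTER_FILE  (hypothetical helper, for illustration only)
# Each file holds saved `efibootmgr` output. Prints the labels of boot entries
# that are in BEFORE_FILE but no longer in AFTER_FILE, one per line.
lost_boot_entries() {
    awk -F'\t' '
        FNR == 1 { fileno++ }               # count which input file we are in
        # Only look at "BootXXXX" entry lines; skip BootCurrent/BootOrder/Timeout.
        !/^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ { next }
        {
            label = $1                                 # drop any device path after a tab
            sub(/^Boot[0-9A-Fa-f]+\*? +/, "", label)   # drop the "Boot0001* " prefix
        }
        fileno == 1 { before[label] = 1; next }
        { after[label] = 1 }
        END {
            for (l in before) if (!(l in after)) print l
        }
    ' "$1" "$2" | sort
}
```

Usage would be something like `sudo efibootmgr > before.txt`, then the interrupted `pkcon update`, then `sudo efibootmgr > after.txt` and `lost_boot_entries before.txt after.txt`. Combined with the timestamps in /var/log/pbl.log, that might show which update step drops the opensuse entry.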