Bug 1193684 - MokManager trouble when upgrading from non-SB enabled system (15.0 aarch64)
Status: CONFIRMED
Classification: openSUSE
Product: openSUSE Distribution
Component: Installation
Version: Leap 15.4
Hardware: Other Other
Importance: P5 - None : Minor
Target Milestone: ---
Assigned To: YaST Team
Jiri Srain
https://openqa.opensuse.org/tests/207...
Depends on:
Blocks:
Reported: 2021-12-13 15:18 UTC by Fabian Vogt
Modified: 2021-12-20 10:17 UTC

See Also:
Found By: openQA
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---


Description Fabian Vogt 2021-12-13 15:18:06 UTC
This test uses the KDE Live medium to upgrade a 15.0 system to 15.4.
While the root cause is known and a fix is in progress (boo#1187515), I have
some questions about the observed failure:

The 15.0 system does not have secure boot enabled, but during the upgrade to
15.4, SB apparently gets enabled: shim is installed into the ESP and there is a
MOK request to enroll the openSUSE certificate. Is this intentional?

After triggering the reboot, the firmware boots into the Live CD again, most
likely because openQA sets bootindex=0 on the device. The boot fails because of
bug#1187515: KIWI didn't copy MokManager.efi. However, I wonder whether it's
expected here that it does not proceed into the upgraded system. Is it
necessary to do a hard system reset if MokManager.efi was not found, instead of
proceeding with the boot order as usual?

Failed to start MokManager: Not Found
Something has gone seriously wrong: import_mok_state() failed: Not Found
DXE ResetSystem2: ResetType Shutdown, Call Depth = 1.

## Observation

openQA test in scenario opensuse-15.4-KDE-Live-aarch64-kde_live_upgrade_leap_15.0@aarch64 fails in
[grub_test](https://openqa.opensuse.org/tests/2077067/modules/grub_test/steps/4)

## Test suite description
Uses the live installer on the kde live media for upgrading the system.

+QEMURAM=2048 is necessary as the HDD_1 is only available for aarch64/64-bit and more RAM is necessary.


## Reproducible

Fails since (at least) Build [4.9](https://openqa.opensuse.org/tests/2010746)


## Expected result

Last good: (unknown) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=aarch64&distri=opensuse&flavor=KDE-Live&machine=aarch64&test=kde_live_upgrade_leap_15.0&version=15.4)
Comment 1 Michael Chang 2021-12-14 07:33:12 UTC
It seems that during offline migration via installation media, YaST re-proposes or re-initializes the settings from scratch for some reason. Maybe we should first check with the YaST team whether they know that the secure boot setting can be altered during the migration process?

Hi YaST Team,

Could you please check whether the secure boot setting can be altered during offline migration?

As for the second question: to my understanding, the openQA test case has long used "Boot from Hard Disk" from the media to boot into the upgraded system. I asked in another bug report why it couldn't just reboot into the upgraded system, but it seems there are some difficulties in doing so.

I think grub-install or shim-install has placed its boot entry ahead of the cdrom device in the boot order, but given that only the cdrom has bootindex=0 attached, the firmware would not enumerate other devices without a bootindex when booting unattended. For this reason I don't think a hard system reset would boot into the updated system; instead it would be trapped in the cdrom over and over again ...
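
The bootindex reasoning above can be illustrated with a toy model. This is only a sketch, not QEMU or firmware code: the `Device` class and `pick_boot_device` function are hypothetical names, and the rule that an unattended boot only considers devices carrying a bootindex is taken from the comment above as an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    name: str
    bootindex: Optional[int] = None  # None means no bootindex was assigned


def pick_boot_device(devices: List[Device]) -> Optional[Device]:
    """Return the device an unattended boot would try first (toy model).

    Assumption: devices without a bootindex are never enumerated, and the
    lowest bootindex wins.
    """
    indexed = [d for d in devices if d.bootindex is not None]
    if not indexed:
        return None
    return min(indexed, key=lambda d: d.bootindex)


# Setup as described: only the cdrom carries bootindex=0.
vm = [Device("cdrom", bootindex=0), Device("hdd")]
first = pick_boot_device(vm)
print(first.name)
```

In this model every reset selects the cdrom again, regardless of what grub-install or shim-install wrote to the boot order, which matches the "trapped in the cdrom" outcome described above.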
Comment 2 Stefan Hundhammer 2021-12-15 09:18:09 UTC
jreidinger should know.

Josef?
Comment 3 Fabian Vogt 2021-12-16 10:14:05 UTC
(In reply to Michael Chang from comment #1)
> It seems to be offline migration via installation media that YaST would
> re-proposed or re-initialized settings from scratch for some reason. Maybe
> we should check with YaST team first if they know secure boot settings would
> be altered in the migration process ?
> 
> @ Hi YaST Team,
> 
> Would you please help to check that secure boot setting could be altered
> during offline migration ?

It's actually visible in the openQA test as well that YaST saves the wrong setting:

https://openqa.opensuse.org/tests/2076948#step/disable_grub_timeout/12 shows "Secure boot: disabled"
https://openqa.opensuse.org/tests/2076948#step/disable_grub_timeout/16 shows the checkbox for Secure boot remains unchecked
https://openqa.opensuse.org/tests/2076948#step/disable_grub_timeout/21 shows that the 

This is using the regular DVD upgrade, which behaves the same way as the live upgrade linked in the initial report.

Reassigning to the YaST team and raising severity.

> For the second question. To my understanding, openQA testcase has been
> conducted to use "Boot from Hard Disk" from the media to booting into
> upgraded system for a long time. I have raised the question in other bug
> report why it couldn't just reboot into upgraded system but it seems there
> are some difficulties to do so.
>
> I think grub-install or shim-install has made their boot order higher than
> cdrom device, but given that only cdrom has bootindex=0 attached, the
> firmware would not enumerate other device without bootindex if booting
> unattended. For this reason I don't think hard system reset would work to
> boot into updated system and instead would be trapped in cdrom over again ...

Yep, and if the HDD had a bootindex with higher priority, the upgrade couldn't be started at all... There doesn't seem to be a "once" option with bootindex like for "-boot".

One question remains: does shim really have to force-reset the platform if MokManager was not found? It could just print a warning and continue, which would be much more user friendly.
Comment 4 Fabian Vogt 2021-12-16 10:14:39 UTC
(In reply to Fabian Vogt from comment #3)
> (In reply to Michael Chang from comment #1)
> > It seems to be offline migration via installation media that YaST would
> > re-proposed or re-initialized settings from scratch for some reason. Maybe
> > we should check with YaST team first if they know secure boot settings would
> > be altered in the migration process ?
> > 
> > @ Hi YaST Team,
> > 
> > Would you please help to check that secure boot setting could be altered
> > during offline migration ?
> 
> It's actually visible in the openQA test as well that YaST saves the wrong
> setting:
> 
> https://openqa.opensuse.org/tests/2076948#step/disable_grub_timeout/12 shows
> "Secure boot: disabled"
> https://openqa.opensuse.org/tests/2076948#step/disable_grub_timeout/16 shows
> the checkbox for Secure boot remains unchecked
> https://openqa.opensuse.org/tests/2076948#step/disable_grub_timeout/21 shows
> that the 
... overview suddenly has "Secure boot: enabled" set, after leaving the bootloader module.
Comment 5 Josef Reidinger 2021-12-16 22:24:17 UTC
Yep, having it set to enabled looks wrong, especially when the checkbox is disabled. Sadly the debug logs from openQA have already rotated past the proposal, so I cannot see in the logs why it happens. We need to reproduce it and check why.
Comment 6 Josef Reidinger 2021-12-16 22:27:54 UTC
And a few more questions:

1. Is it seen only on aarch64 and nothing else?
2. Does it happen only on the live DVD, or do you also see it with a non-live medium?
Comment 7 Michael Chang 2021-12-17 04:29:06 UTC
(In reply to Fabian Vogt from comment #3)
> (In reply to Michael Chang from comment #1)

[snip]

> Yep, and if the HDD had a bootindex with higher priority, the upgrade
> couldn't be started at all... There doesn't seem to be a "once" option with
> bootindex like for "-boot".

I'm curious why -boot once=d cannot be used here, and why this behavior is not observed in regular installs but only in openQA.

> One question remains. does shim really have to force-reset the platform if
> MokManager was not found? It could just print a warning and continue, which
> would be much more user friendly.

I agree with you that normally it could just print a warning and continue to the next entry in the boot order, as this is what most people expect to happen after synchronous boot errors in general. However, shim is subject to many security reviews by Microsoft and other Linux distributions, so any proposed change to error handling, for security violations in particular, needs to go upstream first imho (if it is worth it at all).

  RT->ResetSystem(EfiResetShutdown, EFI_SECURITY_VIOLATION,

We can argue that a missing MokManager is not a security violation under secure boot, or that shim should not reset the system in reaction to it. I'm not a security expert qualified to make such an adjustment; there are security implications all around for those who audit the code.

CC Joey who is currently in charge of shim.
Comment 8 Fabian Vogt 2021-12-17 15:14:34 UTC
(In reply to Josef Reidinger from comment #6)
> and few more questions:
> 
> 1. is it seen only on arch and nothing else?

In x86 tests, secure boot is on by default (in the system, not necessarily the test VM), so this bug wouldn't be visible in openQA. On other platforms there's no secure boot option.

> 2. Does it happen only on liveDVD or do you see it also using non live
> medium?

Comment 3 has links to a plain DVD upgrade test where this happens.
Comment 9 Josef Reidinger 2021-12-17 23:01:48 UTC
(In reply to Fabian Vogt from comment #8)
> (In reply to Josef Reidinger from comment #6)
> > and few more questions:
> > 
> > 1. is it seen only on arch and nothing else?
> 
> In x86 tests, secure boot is on by default (in the system, not necessarily
> the test VM), so this bug wouldn't be visible in openQA. On other platforms
> there's no secure boot option.
> 
> > 2. Does it happen only on liveDVD or do you see it also using non live
> > medium?
> 
> Comment 3 has links to a plain DVD upgrade test where this happens.

Thanks, so I think it is related to the fact that we did not support secure boot for arm in 15.0 at all. Back then we had a specific grub parameter, and I think we now support shim. I remember that we already had some issues with it in the past, such as a change of the software dependencies needed to install the bootloader.
What is a bug for sure is showing secure boot as not enabled and then enabling it. I worry it is a different interpretation of the missing parameter in /etc/sysconfig/bootloader, and that should be fixed (not sure if we should always enable secure boot if it was previously not supported).
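
To illustrate how a missing parameter could be interpreted two different ways, here is a minimal sketch. It is not YaST code: the helper names are hypothetical, the key name `SECURE_BOOT` is an assumption (the comment above does not name the parameter), and the two defaults are chosen only to reproduce the symptom described in this report.

```python
from typing import Dict, Optional


def parse_sysconfig(text: str) -> Dict[str, str]:
    """Parse simple KEY="value" lines, skipping comments and blank lines."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def secure_boot_setting(config: Dict[str, str]) -> Optional[bool]:
    """Return True/False if the key is present, None if it is missing."""
    raw = config.get("SECURE_BOOT")  # assumed key name
    if raw is None:
        return None
    return raw.lower() == "yes"


# A file written by an old release that did not know the key (assumption).
old_system = parse_sysconfig('LOADER_TYPE="grub2-efi"\n')
setting = secure_boot_setting(old_system)

# Two code paths that collapse "missing" differently end up disagreeing:
shown_in_ui = bool(setting)                              # missing -> disabled
written_on_save = True if setting is None else setting   # missing -> enabled
print(shown_in_ui, written_on_save)
```

Keeping the value as a tri-state (`True`, `False`, `None`) until a single place decides the default is one way to avoid the mismatch between what is displayed and what is saved.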
Comment 11 Stefan Hundhammer 2021-12-20 10:17:31 UTC
Moved to our Trello task queue for a future sprint.