Bug 1219807 - Fail FDE predictions on MicroOS
Status: CONFIRMED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Security
Version: Current
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: Alberto Planas Dominguez
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-02-12 08:19 UTC by Alberto Planas Dominguez
Modified: 2024-03-23 08:13 UTC
CC List: 2 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Script to start VM (2.11 KB, application/x-shellscript)
2024-02-12 17:47 UTC, Andrei Borzenkov
rdsosreport from failure to configure root (144.42 KB, text/plain)
2024-03-23 08:13 UTC, Andrei Borzenkov

Description Alberto Planas Dominguez 2024-02-12 08:19:50 UTC
After some significant change, the LUKS2 password is requested on every boot. The JSON file in /boot/efi/EFI/systemd has a recent timestamp, and running "sdbootutil update-predictions" fixes the issue.
Comment 1 Alberto Planas Dominguez 2024-02-12 08:29:39 UTC
@Andrei, just to confirm, your image comes from https://build.opensuse.org/package/show/devel:microos:images/openSUSE-MicroOS ?
Comment 2 Andrei Borzenkov 2024-02-12 08:35:25 UTC
I used the link from the initial announcement:

http://download.opensuse.org/tumbleweed/appliances/openSUSE-MicroOS.x86_64-kvm-and-xen-sdboot.qcow2

Since then, it was being updated from the repositories defined in this image.
Comment 3 Andrei Borzenkov 2024-02-12 17:47:56 UTC
Created attachment 872677
Script to start VM

Attached is the script I use to start the VM. I use ovmf-x86_64-ms-4m-code.bin from the openSUSE qemu-ovmf package; I believe it should be qemu-ovmf-x86_64-202308-Virt.1699.263.34.noarch.rpm. The host is Ubuntu 22.04, up to date (with HWE kernel). LD_LIBRARY_PATH is a leftover from the local virgl library; I do not think it is relevant here.
Comment 4 Alberto Planas Dominguez 2024-02-12 21:58:37 UTC
I am trying to reproduce the issue, but I am sure that it is in the VM itself and not in the environment.

The main candidates are the kernel, the initrd, or the kernel cmdline (that is, PCR4 and PCR9).
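
To illustrate why a change in any of these components breaks unsealing: a PCR can only be extended, so its final value depends on every measurement and on their order. The following is a minimal Python sketch, assuming a SHA-256 bank and placeholder byte strings standing in for the real boot components (real firmware measures an Authenticode hash of the EFI binary, not a plain SHA-256 of the file). The names `extend` and `replay` are illustrative only.

```python
import hashlib

PCR_SIZE = 32  # digest size of the SHA-256 bank


def extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM extend operation: new value = SHA256(old value || digest)."""
    return hashlib.sha256(pcr + digest).digest()


def replay(measurements: list[bytes]) -> bytes:
    """Replay a list of measured blobs into a single simulated PCR."""
    pcr = bytes(PCR_SIZE)  # the PCR starts as all zeros
    for data in measurements:
        pcr = extend(pcr, hashlib.sha256(data).digest())
    return pcr


# Placeholder contents; only the fact that they differ matters here.
predicted = replay([b"sd-boot", b"old kernel image"])
actual = replay([b"sd-boot", b"new kernel image"])
print(predicted == actual)  # False: the sealed policy no longer matches
```

If the prediction was computed against the old kernel and the system boots the new one, the values diverge and the TPM refuses to release the key, which would match the password prompt seen here.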
Comment 5 Alberto Planas Dominguez 2024-02-19 13:08:03 UTC
@Andrei, do you have secure boot enabled?
Comment 6 Andrei Borzenkov 2024-02-19 13:12:11 UTC
(In reply to Alberto Planas Dominguez from comment #5)
> @Andrei, do you have secure boot enabled?

Yes, I do.
Comment 7 Alberto Planas Dominguez 2024-02-21 17:40:48 UTC
I reproduced it today. It seems that pcr-oracle fails to measure the kernel when secure boot is enabled, because shim does not publish the DevicePath with the kernel path information.

That suggests an issue in shim.
Comment 8 Alberto Planas Dominguez 2024-02-21 17:53:03 UTC
(In reply to Alberto Planas Dominguez from comment #7)

> That suggests an issue in shim.

... or not ... the kernel is selected by sd-boot, so the bug should be there (found by Ludwig)
Comment 9 Andrei Borzenkov 2024-02-24 08:48:59 UTC
After the latest update `sdbootutil update-predictions` stopped working entirely. It expects `/etc/sysconfig/fde-tools` to exist and this file does not exist, so `sdbootutil update-predictions` does nothing. I have no idea which package is supposed to provide this file. I presume it is `fde-tools`, but if `sdbootutil` needs this package, it is not listed in RPM requirements and is not installed.

sdbootutil-1+git20240215.cb7e392-1.1.x86_64
Comment 10 Alberto Planas Dominguez 2024-02-26 05:05:35 UTC
(In reply to Andrei Borzenkov from comment #9)

> It expects `/etc/sysconfig/fde-tools` to exist

This file should be created by disk-encryption-tool, but it is true that there is no migration mechanism in place.

For now do this:

echo "FDE_SEAL_PCR_LIST=0,2,4,7,9" > /etc/sysconfig/fde-tools


This file is required for unifying the pcr-oracle signed policies with the pcrlock ones. The list of PCRs is now in a single file. This may change in the future, though.
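
For illustration, the single setting in that file is a comma-separated list of PCR indices. Below is a minimal Python sketch of reading it, assuming plain KEY=value lines. The function name `parse_pcr_list` is hypothetical, and this is not how sdbootutil itself reads the file.

```python
def parse_pcr_list(text: str, key: str = "FDE_SEAL_PCR_LIST") -> list[int]:
    """Return the PCR indices found in sysconfig-style KEY=value text."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            value = value.strip().strip('"')  # tolerate optional quoting
            return [int(item) for item in value.split(",") if item.strip()]
    raise KeyError(f"{key} is not set")


print(parse_pcr_list("FDE_SEAL_PCR_LIST=0,2,4,7,9"))  # [0, 2, 4, 7, 9]
```

A migration step could use a check like this to decide whether the file is missing or incomplete before writing a default.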
Comment 11 Andrei Borzenkov 2024-02-26 07:54:25 UTC
(In reply to Alberto Planas Dominguez from comment #10)
> 
> This file should be created by disk-encryption-tool, but it is true that
> there is no migration mechanism in place.
> 

Any plans to implement it?

> For now do this:
> 

I'd rather leave it as is to test migration. Assuming it is planned at all.
Comment 12 Alberto Planas Dominguez 2024-02-26 08:53:45 UTC
(In reply to Andrei Borzenkov from comment #11)
> (In reply to Alberto Planas Dominguez from comment #10)
> > 
> > This file should be created by disk-encryption-tool, but it is true that
> > there is no migration mechanism in place.
> > 
> 
> Any plans to implement it?

So far, none from my side. The tools are still converging, and it is kind of OK to break some promises while we are searching for a better solution.

For example, the file is in some sense a step back from the current solution, so I am inclined to change the file again when I devise something better (maybe registering the PCRs in the LUKS2 header also when pcrlock is used).

> > For now do this:
> > 
> 
> I'd rather leave it as is to test migration. Assuming it is planned at all.

OK. I will prepare something then.
Comment 13 Alberto Planas Dominguez 2024-02-26 15:16:56 UTC
A workaround for the original issue is in https://github.com/okirch/pcr-oracle/pull/51

I am releasing a new package with the fix.
Comment 14 Andrei Borzenkov 2024-02-26 18:48:11 UTC
(In reply to Alberto Planas Dominguez from comment #13)
> A workaround of the original issue in
> https://github.com/okirch/pcr-oracle/pull/51
> 

The comments are misleading. It is not a "secure boot" problem but a shim problem. The EV_EFI_BOOT_SERVICES_APPLICATION event is logged by shim_verify() when it verifies a binary; although the firmware passes the device path to the security function, the shim protocol only accepts a memory buffer and is not aware of the image location.

Call chain is sd-boot -> shim_validate (via firmware security override) -> shim_verify.

As far as I can tell, there is nothing that identifies this event as related to the kernel load. shim can be built with OVERRIDE_SECURITY_POLICY, in which case it will log every executable loaded (including sd-boot itself) using such an EV_EFI_BOOT_SERVICES_APPLICATION event with an empty device path.

Maybe it should be discussed with the shim developers.
Comment 15 Alberto Planas Dominguez 2024-02-27 05:41:37 UTC
(In reply to Andrei Borzenkov from comment #14)
> (In reply to Alberto Planas Dominguez from comment #13)
> > A workaround of the original issue in
> > https://github.com/okirch/pcr-oracle/pull/51
> > 
> 
> The comments are misleading. It is not "secure boot" problem, but shim
> problem. 

Yes you are right.

> As far as I can tell, there is nothing that identifies this event as related
> to kernel load. 

There is none. The only clue is the hash itself, which could be mapped to the old kernel.

> shim can be built with OVERRIDE_SECURITY_POLICY in which
> case it will log every executable loaded (including sd-boot itself) using
> such EV_EFI_BOOT_SERVICES_APPLICATION with empty device path.

Yes. This will invalidate the workaround.
Comment 16 Alberto Planas Dominguez 2024-02-27 07:44:40 UTC
(In reply to Alberto Planas Dominguez from comment #15)
> (In reply to Andrei Borzenkov from comment #14)
>
> > The comments are misleading. It is not "secure boot" problem, but shim
> > problem. 
> 
> Yes you are right.

I updated the commit here:

https://github.com/okirch/pcr-oracle/pull/51/commits/211502ec5cac7e252f8af251ee34872f7adae9ca

If you think that this description is more accurate I will create a bug report in the shim project based on this.
Comment 17 Andrei Borzenkov 2024-02-27 08:12:53 UTC
(In reply to Alberto Planas Dominguez from comment #16)
> (In reply to Alberto Planas Dominguez from comment #15)
> > (In reply to Andrei Borzenkov from comment #14)
> >
> > > The comments are misleading. It is not "secure boot" problem, but shim
> > > problem. 
> > 
> > Yes you are right.
> 
> I updated the commit here:
> 

Yes, I believe it is OK.

Re grub2 - root unlock from within grub2 happens much earlier, before the kernel is loaded; later, SUSE grub2 forwards the encryption key to the initrd directly, so it does not rely on TPM2 measurements to unlock root in the initrd. See

https://build.opensuse.org/package/view_file/openSUSE:Factory/grub2/0009-Add-crypttab_entry-to-obviate-the-need-to-input-pass.patch?expand=1
Comment 18 Alberto Planas Dominguez 2024-02-27 08:39:37 UTC
(In reply to Andrei Borzenkov from comment #17)

> Yes, I believe it is OK.

Great. I will create the issue then. Thanks for the review!
 
> Re grub2

Right, that is how fde works with grub2, but the comment was more about the grub2-shim interaction when loading the kernel.

The pcr-oracle workaround works under the assumption that there is only one pcr4 extension of type boot services application that has this issue (the kernel).  This can be invalidated if grub2 is following a different protocol.
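
That assumption can be made explicit. The following is a rough Python sketch, not the pcr-oracle implementation: the `Event` record, the `find_unattributed_image` helper, and the sample path are hypothetical, and the event log is represented as already-parsed entries. It looks for PCR4 events of type EV_EFI_BOOT_SERVICES_APPLICATION with an empty device path and refuses to guess unless there is exactly one.

```python
from dataclasses import dataclass

# Event type value as defined in the TCG PC Client firmware profile.
EV_EFI_BOOT_SERVICES_APPLICATION = 0x80000003


@dataclass
class Event:
    pcr: int
    event_type: int
    device_path: str  # empty when the event was logged from a memory buffer
    digest: bytes


def find_unattributed_image(log: list[Event]) -> Event:
    """Return the single PCR4 image event that lacks a device path."""
    candidates = [
        e for e in log
        if e.pcr == 4
        and e.event_type == EV_EFI_BOOT_SERVICES_APPLICATION
        and not e.device_path
    ]
    if len(candidates) != 1:
        # More than one such event: it is unclear which one is the kernel.
        raise ValueError(f"expected exactly 1 event, found {len(candidates)}")
    return candidates[0]


# Illustrative log: one image with a path, one without (assumed: the kernel).
log = [
    Event(4, EV_EFI_BOOT_SERVICES_APPLICATION, "/EFI/systemd/shim.efi", b"\x01" * 32),
    Event(4, EV_EFI_BOOT_SERVICES_APPLICATION, "", b"\x02" * 32),
]
kernel_event = find_unattributed_image(log)
```

With a shim built with OVERRIDE_SECURITY_POLICY, or a boot loader that follows a different protocol, more than one event would match and the helper raises instead of picking one.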
Comment 19 Andrei Borzenkov 2024-02-27 08:52:22 UTC
(In reply to Alberto Planas Dominguez from comment #18)
> 
> Right, that is how fde works with grub2, but the comment was more about the
> grub2-shim interaction when loading the kernel.
> 
> The pcr-oracle workaround works under the assumption that there is only one
> pcr4 extension of type boot services application that has this issue (the
> kernel).  This can be invalidated if grub2 is following a different protocol.

Currently grub only uses shim to verify a file of type "kernel" or another EFI binary when chainloading it. So, in the normal case of shim -> grub -> kernel, there should be only one such event for the kernel.
Comment 20 Alberto Planas Dominguez 2024-02-27 09:23:14 UTC
https://github.com/rhboot/shim/issues/642
Comment 21 Andrei Borzenkov 2024-03-03 15:26:42 UTC
For the record: usually the first reboot after a new kernel fails completely. It does not even ask for the encrypted root password; it simply stops in initrd with failed cryptsetup. I reboot with Ctrl-Alt-Del, and this time it asks for the password. This just happened again.
Comment 22 Alberto Planas Dominguez 2024-03-04 15:26:20 UTC
(In reply to Andrei Borzenkov from comment #21)

> it simply stops in initrd with failed cryptsetup

I confirm this in an old installation, but I am not able to reproduce this in new images.

Are you able to extract any log output, maybe by breaking the initrd load with rd.break=pre-mount or something like that?

I had the feeling that this was a race condition, but I am not sure if this happens in the cryptsetup generator or before.
Comment 23 Andrei Borzenkov 2024-03-23 08:13:29 UTC
Created attachment 873748
rdsosreport from failure to configure root

(In reply to Alberto Planas Dominguez from comment #22)
> 
> Are you able to extract any log output, maybe breaking the initrd load with
> rd.break=pre-mount, so something like this?
> 

I managed to collect rdsosreport when it failed after another update.