Bug 1213664 - NVIDIA drivers in kernels earlier than the latest installed version will have the wrong signature after rollback on Tumbleweed with enabled lockdown
Status: CONFIRMED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: X11 3rd Party Driver
Version: Current
Hardware: Other Other
Importance: P3 - Medium : Normal
Target Milestone: ---
Assignee: Stefan Dirsch
QA Contact: Stefan Dirsch
Reported: 2023-07-26 07:35 UTC by Andrei Borzenkov
Modified: 2024-03-28 03:36 UTC
Description Andrei Borzenkov 2023-07-26 07:35:26 UTC
Now that lockdown has been enabled on Tumbleweed, the following happens.

1. When the driver is (re)compiled at installation, the NVIDIA driver package generates a random key and stores it in /var. There is only one key for a given version of the NVIDIA driver KMP package.

2. The driver is compiled with every kernel update. When this happens, the old certificate is removed from the MOK database and the newly generated certificate is installed instead. This means that only one certificate, the one for the latest kernel, is present in /var and in the MOK database.

Now, if the user attempts to boot an earlier kernel version, the drivers for that kernel will carry a signature that is no longer present in MOK. This can also happen as part of a rollback.

The drivers must then be reinstalled to generate a valid signature for this specific kernel, which automatically invalidates the signature for every other kernel version (because reinstalling again removes any certificate currently present in MOK). Besides, building the driver for a kernel version that is not the latest is not trivial either.
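
The sequence described above can be illustrated with a toy model. This is only a sketch of the behaviour as reported in this description, not the packaging scripts themselves: the class, the certificate labels and the kernel names are all hypothetical, and no real keys or MOK entries are touched.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class System:
    """Toy model: a single enrolled certificate, one signed driver build per kernel."""
    mok: set = field(default_factory=set)        # certificates currently enrolled in MOK
    signer: dict = field(default_factory=dict)   # kernel name -> certificate that signed its driver
    _serial: count = field(default_factory=count)

    def build_driver(self, kernel: str) -> None:
        # Every (re)build creates a new key, drops the previous certificate
        # from MOK and enrolls the new one, as described in steps 1 and 2.
        cert = f"cert-{next(self._serial)}"
        self.mok = {cert}
        self.signer[kernel] = cert

    def driver_loads(self, kernel: str) -> bool:
        # With lockdown, the module loads only if its signer is still enrolled.
        return self.signer.get(kernel) in self.mok


system = System()
system.build_driver("old-kernel")
system.build_driver("new-kernel")          # a kernel update triggers a rebuild
after_update = (system.driver_loads("new-kernel"), system.driver_loads("old-kernel"))
print(after_update)                        # (True, False)

system.build_driver("old-kernel")          # reinstall for the older kernel
after_reinstall = (system.driver_loads("new-kernel"), system.driver_loads("old-kernel"))
print(after_reinstall)                     # (False, True)
```

In this model, whichever kernel was built last is the only one whose driver still loads, which is the situation a rollback runs into.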

The problem does not exist on Leap, where the driver is compiled just once and the same binary is reused for all future kernel updates.
Comment 1 Stefan Dirsch 2023-07-26 07:45:03 UTC
Indeed. This is true. That would basically mean, that we could never remove any of these certificates. I'm wondering how many certificates can be stored. Is there a limit? Over the years you may get hundreds of these with Tumbleweed ...
Comment 2 Andrei Borzenkov 2023-07-26 08:14:26 UTC
(In reply to Stefan Dirsch from comment #1)
> Indeed. This is true. That would basically mean, that we could never remove
> any of these certificates.

It depends on whether booting an older kernel is considered a routine task or a one-off emergency action. In the latter case, documenting what needs to be done (and maybe providing a script to automate the needed steps) could be enough.

> I'm wondering how many certificates can be
> stored. Is there a limit?

They are stored in EFI NVRAM, so we are at the mercy of the hardware manufacturer. Beyond the total available space, there could be arbitrary bugs or implementation restrictions related to maximum variable size, fragmentation of the available space, etc. We already had a case of literally bricking users' systems by writing to EFI NVRAM too aggressively. The less you touch it, the better. "Every kernel update" is already too much for my taste.
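
To get a rough idea of how much space the MOK-related variables currently occupy without writing anything, one could read the sizes that the kernel exposes through efivarfs. The following is a minimal read-only sketch. It assumes efivarfs is mounted at /sys/firmware/efi/efivars and that the variables of interest have names starting with "Mok"; both the function name and that prefix are assumptions made for illustration, and the reported sizes say nothing about firmware-specific limits or fragmentation.

```python
from pathlib import Path


def efi_variable_sizes(efivars_dir: str = "/sys/firmware/efi/efivars",
                       prefix: str = "Mok") -> dict:
    """Return {file name: size in bytes} for EFI variables whose name starts with prefix."""
    root = Path(efivars_dir)
    if not root.is_dir():
        # Not booted via UEFI, or efivarfs is not mounted.
        return {}
    sizes = {}
    for entry in sorted(root.iterdir()):
        if entry.name.startswith(prefix) and entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    return sizes


if __name__ == "__main__":
    for name, size in efi_variable_sizes().items():
        print(f"{size:8d}  {name}")
```

The function only lists and stats files, so it does not add to the NVRAM writes this comment warns about.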
Comment 3 Stefan Dirsch 2023-07-26 08:50:08 UTC
Thanks. Going back to a previous kernel or rollback for me is an emergency task. But that's just my personal opinion!

In theory with nvidia no longer being loadable you should have still gotten a graphical desktop running on simpledrm (KMS on UEFI fb) with likely reduced resolution, no 3D support, etc. Can you confirm this? I believe for an emergency task this is good enough.
Comment 4 Stefan Dirsch 2023-07-27 12:20:09 UTC
(In reply to Stefan Dirsch from comment #3)
> In theory with nvidia no longer being loadable you should have still gotten
> a graphical desktop running on simpledrm (KMS on UEFI fb) with likely
> reduced resolution, no 3D support, etc. Can you confirm this? I believe for
> an emergency task this is good enough.

--> NEEDINFO
Comment 5 Andrei Borzenkov 2023-07-28 17:47:17 UTC
(In reply to Stefan Dirsch from comment #3)
> Can you confirm this?

I do not have bare metal to test on. On a VM with the NVIDIA driver installed, blacklisting virtio_gpu results in a text-mode boot because the NVIDIA installation adds "nosimplefb=1". Removing this kernel parameter boots into a full KDE/X11 GUI using the simpledrmfb driver.
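
Whether a given boot carries that parameter can be checked by parsing the kernel command line. This is a small sketch assuming the usual /proc/cmdline location; the helper names and the sample command line are made up for illustration, and quoted values containing spaces are not handled.

```python
def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Return the command line of the running kernel."""
    with open(path) as handle:
        return handle.read()


# Hypothetical command line resembling the situation in this comment.
sample = "root=/dev/vda2 quiet nosimplefb=1"
params = kernel_params(sample)
print(params.get("nosimplefb"))   # 1
print("nomodeset" in params)      # False
```

On a real system, `kernel_params(read_cmdline())` gives the parameters of the currently booted kernel.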
Comment 6 Stefan Dirsch 2023-07-28 18:36:37 UTC
You're right. We need to disable simpledrm when installing nvidia drivers. But X fbdev driver should still work on UEFI FB.
Comment 7 Andrei Borzenkov 2023-07-28 18:41:00 UTC
(In reply to Stefan Dirsch from comment #6)
> But X fbdev driver should still work on UEFI FB.

Display manager will wait for logind to announce seat availability. logind will not announce seat availability until it finds DRM device or system is booted with "nomodeset".
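
The rule stated here can be written down as a quick check. The sketch below only restates the condition from this comment (a DRM device is present, or the system was booted with "nomodeset"); it is not logind's actual implementation. It assumes DRM devices appear as cardN entries under /sys/class/drm, and the function names are hypothetical.

```python
import re
from pathlib import Path


def drm_cards(drm_dir: str = "/sys/class/drm") -> list:
    """List DRM device entries such as card0, skipping connectors and render nodes."""
    root = Path(drm_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if re.fullmatch(r"card\d+", p.name))


def graphical_seat_expected(cards: list, cmdline: str) -> bool:
    """Apply the rule from the comment: a DRM device exists, or nomodeset was given."""
    return bool(cards) or "nomodeset" in cmdline.split()


print(graphical_seat_expected([], "quiet nosimplefb=1"))   # False
print(graphical_seat_expected([], "quiet nomodeset"))      # True
print(graphical_seat_expected(["card0"], "quiet"))         # True
```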
Comment 8 Stefan Dirsch 2023-07-28 18:54:58 UTC
(In reply to Andrei Borzenkov from comment #7)
> (In reply to Stefan Dirsch from comment #6)
> > But X fbdev driver should still work on UEFI FB.
> 
> Display manager will wait for logind to announce seat availability. logind
> will not announce seat availability until it finds DRM device or system is
> booted with "nomodeset".

This behaviour looks surprising to me. But yeah, to force using UEFI fb/fbdev X driver I always used "nomodeset".
Comment 9 Andrei Borzenkov 2023-07-29 06:32:09 UTC
(In reply to Stefan Dirsch from comment #8)
> 
> This behaviour looks surprising to me. 

That ship sailed long ago. Unless someone implements support for changing drivers underneath a running GUI, there is not much that can be done here.

Anyway, my point is that booting into a full GUI without working NVIDIA drivers is not trivial, to put it mildly, and we have not even mentioned suse-prime yet ...

Maybe generating the signing key once and storing it permanently is the lesser evil after all. Even better would be a framework allowing access to securely stored keys (such as an encrypted filesystem, TPM, smart card, or simply offline on a USB stick).
Comment 10 Stefan Dirsch 2023-07-29 09:05:41 UTC
Yeah, we could define a framework, just define a directory. If it exists, the private key is stored there (the user accepts that the system becomes insecure) and can be reused. But we discussed all of this before on the opensuse-factory ML, if I recall correctly.
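
The opt-in directory idea could look roughly like the following. This is a sketch of the selection logic only, under stated assumptions: the file names are hypothetical, `generate` stands in for whatever really creates the key pair, and `fake_generate` writes dummy files rather than real keys.

```python
from pathlib import Path
from typing import Callable, Tuple

KeyPair = Tuple[Path, Path]  # (private key, certificate)


def select_signing_key(persistent_dir: Path, scratch_dir: Path,
                       generate: Callable[[Path], KeyPair]) -> Tuple[Path, Path, bool]:
    """Return (key, certificate, reused) following the opt-in directory rule."""
    key = persistent_dir / "signing.key"     # hypothetical file names
    cert = persistent_dir / "signing.crt"
    if persistent_dir.is_dir():
        # The user opted in by creating the directory: reuse a stored key pair,
        # or create it once so that later builds can reuse it.
        if key.is_file() and cert.is_file():
            return key, cert, True
        new_key, new_cert = generate(persistent_dir)
        return new_key, new_cert, False
    # No opt-in: a fresh key pair for every build (the behaviour in this report).
    new_key, new_cert = generate(scratch_dir)
    return new_key, new_cert, False


def fake_generate(target: Path) -> KeyPair:
    """Placeholder for real key generation; writes dummy files only."""
    key, cert = target / "signing.key", target / "signing.crt"
    key.write_text("dummy private key\n")
    cert.write_text("dummy certificate\n")
    return key, cert
```

With a reused key pair the certificate would stay the same across rebuilds, so the drivers built for older kernels would keep a signer that is still enrolled.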

Then we can document this all in release notes and on our wiki page, which almost nobody will read ...
Comment 11 Andrei Borzenkov 2023-07-29 12:42:53 UTC
(In reply to Stefan Dirsch from comment #10)
> Yeah, we could define a framework, just define a directory. 

Maybe we should not reinvent the wheel ...

https://github.com/dell/dkms/blob/master/README.md#module-signing
Comment 12 Stefan Dirsch 2024-03-28 03:36:47 UTC
I lack the time to implement this. In case you want to come up with a concrete proposal, the package sources are available.

https://build.opensuse.org/package/show/X11:Drivers:Video:Redesign/nvidia-driver-G06

https://github.com/openSUSE/nvidia-driver-G06/