Bug 1224404 - Missing framebuffer during boot on aarch64
Status: VERIFIED FIXED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Basesystem
Version: Current
Hardware: Other Other
Priority: P5 - None; Severity: Normal
Target Milestone: ---
Assignee: dracut maintainers
QA Contact: E-mail List
URL: https://openqa.opensuse.org/tests/389...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-17 08:51 UTC by Fabian Vogt
Modified: 2024-06-04 07:25 UTC (History)

See Also:
Found By: openQA
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Fabian Vogt 2024-05-17 08:51:26 UTC
+++ This bug was initially created as a clone of Bug #1219180 +++

Unless the initrd is built with the platform native DRM driver (e.g. virtio_gpu), there is no graphical output during boot. This also means that entering a passphrase for unlocking the root fs is not possible.

On x86 it uses the EFI framebuffer successfully, but on aarch64 dmesg has no trace of efifb/simplefb/simpledrm.
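
The dmesg check described above can be sketched as a small helper. This is only an illustration: the driver list is taken from this report, the function name is hypothetical, and the sample log lines in the usage note are made up, not copied from the failing system.

```python
import re

# Driver names mentioned in this report; the list is illustrative, not exhaustive.
FB_DRIVERS = ("efifb", "simplefb", "simpledrm", "virtio_gpu")


def framebuffer_drivers_in_log(log_text, drivers=FB_DRIVERS):
    """Return the driver names that appear as whole words in kernel log text."""
    found = []
    for name in drivers:
        # Accept both "virtio_gpu" and "virtio-gpu" style spellings.
        pattern = r"\b" + name.replace("_", "[_-]") + r"\b"
        if re.search(pattern, log_text):
            found.append(name)
    return found
```

For example, `framebuffer_drivers_in_log("[ 0.41] efifb: probing for efifb\n")` returns `["efifb"]`, while a log without any of these names returns an empty list, which is the situation reported here for aarch64. On a live system the text would come from the output of `dmesg`.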

The original bug report has some more details, but the summary is that efifb or other builtin drivers don't work with virtio-gpu-pci. To get any display output at all, the virtio-gpu module is needed. From a dracut PoV, the crypt module should probably pull in drm if virtio-gpu is detected?

## Observation

openQA test in scenario microos-Tumbleweed-MicroOS-Image-sdboot-aarch64-microos-wizard@aarch64 fails in
[ansible](https://openqa.opensuse.org/tests/3891664/modules/ansible/steps/31)

## Test suite description
Like MicroOS, but use neither combustion nor ignition for the initial configuration, so jeos-firstboot runs.


## Reproducible

Fails since (at least) Build [20231129](https://openqa.opensuse.org/tests/3771195)


## Expected result

Last good: [20231127](https://openqa.opensuse.org/tests/3763820) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=aarch64&distri=microos&flavor=MicroOS-Image-sdboot&machine=aarch64&test=microos-wizard&version=Tumbleweed)
Comment 1 Antonio Feijoo 2024-05-21 15:16:50 UTC
(In reply to Fabian Vogt from comment #0)
> +++ This bug was initially created as a clone of Bug #1219180 +++
> 
> Unless the initrd is built with the platform native DRM driver (e.g.
> virtio_gpu), there is no graphical output during boot. This also means that
> entering a passphrase for unlocking the root fs is not possible.

Thanks for finding this. It took me some time to reproduce it: with Leap as the host, the boot hangs forever (I don't know since when; I have an old VM with snapshot 20231122 and it works), but I could reproduce it with Tumbleweed.

FTR:

> qemu-system-aarch64 -M virt -cpu cortex-a72 -m 2048 -serial stdio -device virtio-gpu-pci -device qemu-xhci -device usb-kbd -bios /usr/share/qemu/aavmf-aarch64-code.bin -drive if=none,file=$PWD/MicroOS.aarch64-16.0.0-kvm-and-xen-sdboot-Snapshot20240517.qcow2,id=hd0 -device virtio-blk-device,drive=hd0

(In reply to Fabian Vogt from comment #0)
> The original bug report has some more details, but the summary is that efifb
> or other builtin drivers don't work with virtio-gpu-pci. To get any display
> output at all, the virtio-gpu module is needed. From a dracut PoV, the crypt
> module should probably pull in drm if virtio-gpu is detected?

I would expand this requirement to when a dracut module needs systemd-ask-password, not only to the crypt module.

But there is a problem with the second part. The current upstream admins do not like to pull things in based on which kernel modules are loaded on the running system. See:
- https://github.com/dracutdevs/dracut/pull/2412#discussion_r1239313722
- https://github.com/dracut-ng/dracut-ng/issues/236
Comment 2 Antonio Feijoo 2024-05-22 05:59:47 UTC
(In reply to Antonio Feijoo from comment #1)
> (In reply to Fabian Vogt from comment #0)
> But, there is a problem with the second part. Nowadays, current upstream
> admins do not like to pull things based on which kernel modules are loaded
> on the running system. See:
> - https://github.com/dracutdevs/dracut/pull/2412#discussion_r1239313722
> - https://github.com/dracut-ng/dracut-ng/issues/236

Related to this, would it make sense to add drm unconditionally on aarch64? Before joining SUSE I used to work with custom boards for satellite transceivers that had no display; the only way to interact with them was via serial port or network. In that case including drm would be useless, but harmless I guess?
Comment 3 Fabian Vogt 2024-05-22 06:56:23 UTC
(In reply to Antonio Feijoo from comment #2)
> (In reply to Antonio Feijoo from comment #1)
> > (In reply to Fabian Vogt from comment #0)
> > But, there is a problem with the second part. Nowadays, current upstream
> > admins do not like to pull things based on which kernel modules are loaded
> > on the running system. See:
> > - https://github.com/dracutdevs/dracut/pull/2412#discussion_r1239313722
> > - https://github.com/dracut-ng/dracut-ng/issues/236

Yeah, reading /proc/modules is wrong. Making decisions based on plugged hardware through modalias would be fine though.
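
A rough sketch of what a modalias-based check could look like, as opposed to reading /proc/modules (which only lists modules that happen to be loaded). This is not the code from any upstream PR. It assumes that virtio devices expose modalias strings of the form `virtio:d<device id>v<vendor id>` under `/sys/bus/virtio/devices/`, and that the virtio GPU device ID is 16 (0x10); the function names are hypothetical.

```python
import fnmatch
import glob

# Assumption: virtio GPU devices report device ID 16 (0x10) in their modalias.
VIRTIO_GPU_ALIAS = "virtio:d00000010v*"


def matches_alias(modaliases, alias_pattern=VIRTIO_GPU_ALIAS):
    """Return True if any modalias string matches the glob-style alias pattern."""
    return any(fnmatch.fnmatchcase(m.strip(), alias_pattern) for m in modaliases)


def read_modaliases(pattern="/sys/bus/virtio/devices/*/modalias"):
    """Collect modalias strings from sysfs; unreadable entries are skipped."""
    values = []
    for path in glob.glob(pattern):
        try:
            with open(path, encoding="ascii", errors="replace") as handle:
                values.append(handle.read().strip())
        except OSError:
            continue
    return values
```

`matches_alias(read_modaliases())` would then answer whether a virtio GPU is plugged in, independent of whether its driver is currently loaded.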

> Related to this, would it make sense to always add drm in aarch64
> unconditionally? Before joining SUSE I used to work with custom boards to be
> used within satellite transceivers without any display, the only way to
> interact with them was via serial port or network, so in that case including
> drm would be useless, but harmless I guess?

Depends on the size IMO. On x86 at least DRM pulls in a huge mess^W mass of kernel modules and firmware. nvidia alone needs ~60MiB...
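
To get a rough feeling for the size in question, the on-disk size of a directory tree can be summed up as sketched below. The helper names are hypothetical. Pointing it at a GPU driver directory such as `/usr/lib/modules/<release>/kernel/drivers/gpu` is an assumption about the layout, and the result would only be an upper bound, not what dracut's drm module actually includes.

```python
import os


def tree_size_bytes(root):
    """Sum the sizes of regular files below root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count link targets
            total += os.path.getsize(path)
    return total


def to_mib(size_bytes):
    """Convert a byte count to MiB."""
    return size_bytes / (1024 * 1024)
```

A missing directory simply yields 0, so the same call can be used on systems where a given driver tree is not installed.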
Comment 4 Antonio Feijoo 2024-05-24 09:40:46 UTC
(In reply to Fabian Vogt from comment #3)
> Yeah, reading /proc/modules is wrong. Making decisions based on plugged
> hardware through modalias would be fine though.

Upstream PR: https://github.com/dracut-ng/dracut-ng/pull/317

It'd need to be adapted to the current version we have in our openSUSE fork, adding a similar check to the crypt module.
Comment 5 Antonio Feijoo 2024-05-27 07:05:08 UTC
(In reply to Antonio Feijoo from comment #4)
> Upstream PR: https://github.com/dracut-ng/dracut-ng/pull/317
> 
> It'd need to be adapted to the current version we have in our openSUSE fork,
> adding a similar check to the crypt module.

Already accepted. Would we need to backport this to any codestream other than Factory?
Comment 6 Fabian Vogt 2024-05-27 09:09:41 UTC
(In reply to Antonio Feijoo from comment #5)
> (In reply to Antonio Feijoo from comment #4)
> > Upstream PR: https://github.com/dracut-ng/dracut-ng/pull/317
> > 
> > It'd need to be adapted to the current version we have in our openSUSE fork,
> > adding a similar check to the crypt module.
> 
> Already accepted. Would we need to backport this to another codestream other
> than Factory?

From a QA perspective, no. From a user perspective, in theory yes, but as no one has been affected so far, I'd say TW is enough for now.
Comment 7 Antonio Feijoo 2024-06-04 07:12:40 UTC
(In reply to Fabian Vogt from comment #0)
> Always latest result in this scenario:
> [latest](https://openqa.opensuse.org/tests/latest?arch=aarch64&distri=microos&flavor=MicroOS-Image-sdboot&machine=aarch64&test=microos-wizard&version=Tumbleweed)

The fix is included since snapshot 20240531; the openQA test no longer fails at ansible/steps/31.
Comment 8 Fabian Vogt 2024-06-04 07:25:32 UTC
(In reply to Antonio Feijoo from comment #7)
> (In reply to Fabian Vogt from comment #0)
> > Always latest result in this scenario:
> > [latest](https://openqa.opensuse.org/tests/latest?arch=aarch64&distri=microos&flavor=MicroOS-Image-sdboot&machine=aarch64&test=microos-wizard&version=Tumbleweed)
> 
> Fix included since snapshot 20240531, the openQA test no longer fails at
> ansible/steps/31.

\o/