Bug 1203641 - [sle15sp5][Build 21.1]openQA test fails in first_boot, os fails to boot after installation on hyperv-2019 UEFI setup
Status: NEW
Classification: openSUSE
Product: PUBLIC SUSE Linux Enterprise Server 15 SP5
Component: Kernel
Version: unspecified
Hardware: x86-64 SLES 15
Priority: P2 - High
Severity: Normal
Target Milestone: ---
Assigned To: Kernel Bugs
https://openqa.suse.de/tests/9525060/...
Depends on:
Blocks:
Reported: 2022-09-22 02:52 UTC by Richard Fan
Modified: 2022-11-22 03:48 UTC (History)

See Also:
Found By: openQA
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---
mfilka: needinfo? (kernel-bugs)


Attachments
y2logs for hyperv (11.31 MB, application/gzip)
2022-09-28 05:36 UTC, Richard Fan

Description Richard Fan 2022-09-22 02:52:17 UTC
## Observation

openQA test in scenario sle-15-SP5-Online-x86_64-default@svirt-hyperv-uefi fails in
[first_boot](https://openqa.suse.de/tests/9525060/modules/first_boot/steps/4)

## Test suite description
Maintainer: QE Core

The standard scenario where we mainly just follow installation suggestions without any adjustments.


## Reproducible

Fails since (at least) Build [21.1](https://openqa.suse.de/tests/9518301)


## Expected result

Last good: [19.1](https://openqa.suse.de/tests/9431186) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Online&machine=svirt-hyperv-uefi&test=default&version=15-SP5)
Comment 1 Richard Fan 2022-09-22 02:54:16 UTC
Here are the serial console logs:
 https://openqa.suse.de/tests/9525060/logfile?filename=serial0.txt

From https://openqa.suse.de/tests/9525060#step/first_boot/1 we can see that the system enters emergency mode.
Comment 2 Stefan Hundhammer 2022-09-26 08:24:04 UTC
In the video, from 3:18 on the console says:

  You are in emergency mode. After logging in, type 
  "journalctl -xb" to view system logs, "systemctl reboot"
  to reboot, "systemctl default" or "exit" to boot into
  default mode.

  Give root password for maintenance
  (or press Control-D to continue):


This is just after a (crude) Plymouth screen with 3 green dots, the last of them highlighted.

IIRC that means that the kernel did boot, but systemd got stuck while initializing the newly booted system.
Comment 3 Stefan Hundhammer 2022-09-26 08:32:06 UTC
I also see a dozen or so of these messages in serial0.txt (however significant they may be):

  Lockdown: init: /dev/mem,kmem,port is restricted;
  see man kernel_lockdown.7

Unfortunately, there are no y2logs yet at this stage, so I guess you will have to do some manual debugging to find out what exactly went wrong.

Maybe more logs, such as the y2logs or the journal, can be salvaged by logging in to that systemd emergency mode. At the very least, finding out which filesystems are already mounted will probably be helpful ("mount", "df", "lsblk" or whatever is actually available in that state).
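
A minimal sketch of what that manual collection could look like from the emergency shell. The function name `collect_emergency_info` and the default output directory are hypothetical, it assumes that directory is writable in that state, and any tool that is missing is simply noted instead of run:

```shell
# Hypothetical helper: save basic boot diagnostics from the emergency shell.
# The function name and the default output directory are assumptions.
collect_emergency_info() {
    outdir="${1:-/tmp/emergency-info}"
    mkdir -p "$outdir" || return 1

    # run_to FILE CMD...: run CMD if it exists, store stdout+stderr in FILE.
    run_to() {
        file="$1"; shift
        if command -v "$1" >/dev/null 2>&1; then
            "$@" >"$outdir/$file" 2>&1 || true
        else
            echo "$1: not available" >"$outdir/$file"
        fi
    }

    run_to journal.txt journalctl -xb --no-pager    # log of the current boot
    run_to failed.txt  systemctl --failed --no-pager # units that failed
    run_to mount.txt   mount                         # mounted filesystems
    run_to lsblk.txt   lsblk -f                      # block devices and filesystems
    run_to df.txt      df -h                         # filesystem usage
}
```

Calling `collect_emergency_info /some/writable/dir` leaves one text file per command in that directory, which can then be copied off the VM and attached here.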
Comment 4 Richard Fan 2022-09-26 09:04:12 UTC
(In reply to Stefan Hundhammer from comment #3)
> I also see some dozen of those messages in serial0.txt (however significant
> they may be):
> 
>   Lockdown: init: /dev/mem,kmem,port is restricted;
>   see man kernel_lockdown.7
> 
> Unfortunately, there are no y2logs yet at this stage, so I guess you will
> have to do some manual debugging to find out what exactly went wrong.
> 
> Maybe more logs like the y2logs or the journal can be salvaged by logging in
> in that systemd emergency mode. At least finding out what filesystems are
> already mounted will probably be helpful ("mount", "df", "lsblk" or whatever
> is actually available in that state).

I will try to collect the y2logs
Comment 5 Richard Fan 2022-09-26 12:00:57 UTC
The issue did not occur when I re-ran the test:
https://openqa.suse.de/tests/9602611#

Let me run it a few more times to see whether it reproduces reliably.
# for i in {1..5}; do openqa-clone-job --from http://openqa.suse.de --host http://openqa.suse.de 9602611 _GROUP_ID=0 --skip-download --skip-chained-deps BUILD=rfan1_uefi_hyperv TEST=rfan1_uefi_hyperv RETRY=0; done
Created job #9603519: sle-15-SP5-Online-x86_64-Build24.1-default@svirt-hyperv-uefi -> http://openqa.suse.de/t9603519
Created job #9603520: sle-15-SP5-Online-x86_64-Build24.1-default@svirt-hyperv-uefi -> http://openqa.suse.de/t9603520
Created job #9603521: sle-15-SP5-Online-x86_64-Build24.1-default@svirt-hyperv-uefi -> http://openqa.suse.de/t9603521
Created job #9603522: sle-15-SP5-Online-x86_64-Build24.1-default@svirt-hyperv-uefi -> http://openqa.suse.de/t9603522
Created job #9603523: sle-15-SP5-Online-x86_64-Build24.1-default@svirt-hyperv-uefi -> http://openqa.suse.de/t9603523
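
To follow up on several cloned jobs at once, the job IDs can be pulled out of that output. This is only a sketch; the function name `created_job_ids` is made up, and it assumes the output keeps the "Created job #<id>:" form shown above:

```shell
# Hypothetical helper: read openqa-clone-job output on stdin and print
# one job ID per line for every "Created job #<id>: ..." line.
created_job_ids() {
    sed -n 's/^Created job #\([0-9][0-9]*\):.*/\1/p'
}
```

Piping the clone loop into `created_job_ids` yields a plain list of IDs, which could then be passed to something like `openqa-cli api --host http://openqa.suse.de jobs/<id>` to check each result (assuming openqa-cli is installed and configured).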
Comment 6 Richard Fan 2022-09-28 05:35:43 UTC
https://openqa.suse.de/tests/9621050#

I can see that some systemd services failed there.

https://openqa.suse.de/tests/9621050#step/first_boot/10
Comment 7 Richard Fan 2022-09-28 05:36:15 UTC
Created attachment 861792
y2logs for hyperv
Comment 8 Richard Fan 2022-09-28 05:37:54 UTC
BTW, I am not sure whether the issue is related to
https://bugzilla.suse.com/show_bug.cgi?id=1202731
Comment 9 Michal Filka 2022-09-29 06:41:57 UTC
@kernel team:
There are a number of

Lockdown: init: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7

records at the end of serial0.txt. Could they be a consequence of a problem during bootstrap? Can we find out what causes them?
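
To put a number on that, something along these lines could be run against a downloaded copy of serial0.txt. It is a sketch only: the function name `lockdown_summary` is made up, and it relies on the `-m` and `-B` options of GNU grep:

```shell
# Hypothetical helper: count the lockdown records in a serial log and show
# what was logged just before the first one.
lockdown_summary() {
    log="${1:-serial0.txt}"
    pattern='Lockdown: init: /dev/mem,kmem,port is restricted'

    # Number of matching lines (grep -c prints 0 and exits 1 on no match).
    count=$(grep -c -F -- "$pattern" "$log" || true)
    echo "lockdown records: $count"

    # The first match plus up to five lines of context before it.
    grep -m 1 -B 5 -F -- "$pattern" "$log" || true
}
```

The context lines before the first record may hint at which process triggered the restricted access.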