Bug 1173116

Summary: zypper dup from 15.1 doesn't boot
Product: [openSUSE] openSUSE Distribution
Component: Kernel
Version: Leap 15.2
Hardware: Other
OS: Other
Status: RESOLVED FIXED
Severity: Normal
Priority: P3 - Medium
Target Milestone: ---
Reporter: Ludwig Nussel <lnussel>
Assignee: openSUSE Kernel Bugs <kernel-bugs>
QA Contact: E-mail List <qa-bugs>
CC: lubos.kocman, tiwai
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Ludwig Nussel 2020-06-18 12:31:14 UTC
15.1 system (came from 42.3->15.0->15.1) with cryptlvm, separate /boot on ext4.
The first boot after zypper dup fails. The system freezes (toggling numlock has no effect, ctrl-alt-del doesn't do anything) right after it shows that the initrd is being loaded.
Forced power cycle, removed "splash=silent quiet". System booted just fine, prompted for passphrase in text mode.

Ran mkinitrd and rebooted, all fine now. No lockup, plymouth prompt in graphical mode. wtf?
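
The workaround described above (booting once without "splash=silent quiet") amounts to dropping two tokens from the kernel command line. A minimal sketch of that edit follows. The function name `strip_boot_params` and the example command line are hypothetical illustrations, not taken from the affected system, and the sketch only manipulates a string; it does not touch the bootloader configuration.

```python
def strip_boot_params(cmdline, drop=("splash=silent", "quiet")):
    """Return the kernel command line with the given parameters removed.

    Parameters are compared as whole whitespace-separated tokens, so
    "quiet" does not match a token such as "loglevel=quiet-ish".
    """
    return " ".join(tok for tok in cmdline.split() if tok not in drop)


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/vmlinuz root=/dev/mapper/system-root splash=silent quiet"
print(strip_boot_params(example))
```

The same effect is obtained interactively by editing the entry in the grub menu before booting, which is what was done here.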
Comment 1 Ludwig Nussel 2020-06-19 08:43:12 UTC
This is very strange. It seems to happen only on a warm start. I tried to add earlyprintk=efi, but as soon as I replace quiet with debug the system boots fine. I turned off the grub menu in favor of the countdown; then the last thing on screen is "EFI stub: UEFI Secure Boot is enabled", printed over the initrd loading message.
Comment 2 Lubos Kocman 2020-06-25 12:37:59 UTC
Any proposal for the assignee, Ludwig? This seems like an issue between the kernel and the bootloader. Let's start with the kernel, since a complete freeze is involved.
Comment 3 Takashi Iwai 2020-06-25 13:21:30 UTC
Does your machine have a Skylake CPU?  It might be a side effect of the buggy Intel ucode.  The update brought such problems for certain CPU models.

The correction (a revert of the ucode updates for those CPUs) seems to be still on its way.
Comment 4 Ludwig Nussel 2020-06-25 13:43:47 UTC
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       39 bits physical, 48 bits virtual
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               78
Model name:          Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
Stepping:            3
CPU MHz:             771.700
CPU max MHz:         3400.0000
CPU min MHz:         400.0000
BogoMIPS:            5599.85
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            4096K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
Comment 7 Takashi Iwai 2020-06-25 14:35:54 UTC
(In reply to Ludwig Nussel from comment #4)
> CPU family:          6
> Model:               78
> Model name:          Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
> Stepping:            3

This is indeed the CPU model that hits the buggy firmware.
Try downgrading ucode-intel to the 20191115 version from Leap 15.1 until the fixed (reverted) package is released for Leap 15.2.

If you still see the same problem even after the ucode revert, please reopen.
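
To compare another machine against the CPU identified above, the family, model and stepping fields can be pulled out of `lscpu` output. The following is a minimal sketch, assuming the `Key: value` layout shown in comment #4. The helper names `parse_lscpu` and `cpu_signature` are hypothetical, and the sample text is copied from the quoted lines. The hex triplet it prints follows the family-model-stepping convention used for naming Intel microcode files.

```python
def parse_lscpu(text):
    """Parse 'Key: value' lines of lscpu output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def cpu_signature(fields):
    """Return (family, model, stepping) as integers."""
    return (
        int(fields["CPU family"]),
        int(fields["Model"]),
        int(fields["Stepping"]),
    )


# Sample copied from the lscpu output quoted in this report.
SAMPLE = """\
CPU family:          6
Model:               78
Model name:          Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
Stepping:            3
"""

sig = cpu_signature(parse_lscpu(SAMPLE))
print("family-model-stepping: %02x-%02x-%02x" % sig)
```

On a live system the input could come from running `lscpu` and capturing its output instead of the embedded sample.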
Comment 8 Ludwig Nussel 2020-06-25 15:15:00 UTC
I grabbed ucode-intel-20200616-3.3.1.x86_64 from IBS. Seems to work fine, thanks!