Bugzilla – Bug 1006417
System doesn't boot with new kernel 4.8.3 (32-bit)
Last modified: 2016-11-09 10:51:25 UTC
After running "zypper dup", which updated the kernel to version 4.8.3-1, my three i686 computers (different hardware) no longer boot. They restart again and again. The last message on screen is "Loading initial ramdisk ..."; after that the computer reboots itself. The kernel flavors are 2x pae and 1x default.
Does the issue happen even if you blacklist nouveau as in bug 1006420?
Yes, blacklisting nouveau has no influence on this bug. I have this bug (1006417) on my i686 computers only. (Bug 1006420 occurs on my 64-bit computer only.)
I can confirm this on a Pentium III (Katmai). Kernel 4.7.6-1 is booting without problem.
Could you guys remove the "quiet" kernel option in grub and add "debug" instead? Approximately where does it reboot? You can also add "boot_delay=100" (or =10) to slow down the output.
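A minimal sketch of the command-line change suggested above, assuming the kernel options are available as one space-separated string (for example the "linux" line of a grub menu entry). The function name `debug_cmdline` and the sample options are hypothetical and only illustrate the edit; nothing here touches grub itself.

```python
def debug_cmdline(cmdline, boot_delay=100):
    """Return cmdline without 'quiet', with 'debug' and 'boot_delay=N' added."""
    # Drop "quiet" and any earlier boot_delay so options are not duplicated.
    args = [a for a in cmdline.split()
            if a != "quiet" and not a.startswith("boot_delay=")]
    if "debug" not in args:
        args.append("debug")
    args.append(f"boot_delay={boot_delay}")
    return " ".join(args)


# Hypothetical example options, not taken from any reporter's system.
print(debug_cmdline("root=/dev/sda2 splash=silent quiet"))
```

The same edit can be made by hand by pressing "e" on the grub menu entry and changing the line that starts with "linux".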
It's still the same. The first message is "Loading Linux 4.8.3-1-pae", the second is "Loading initial ramdisk ...", and immediately after that the computer reboots itself.
The same here, now with kernel 4.8.4 on i686 (Pentium III Katmai) and kernel 4.8.3 on i586 Pentium-S. The debug and boot_delay parameters don't help. Immediately after "Loading initial ramdisk ..." the system is rebooting without any other messages/output.
Maybe it's an illegal instruction in an early phase. Pentium III (Katmai) doesn't have "sse2". Pentium-S doesn't have "mmx", "sse" or "sse2".
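To compare machines against this hypothesis, the CPU feature flags can be read from /proc/cpuinfo on a kernel that still boots. The sketch below is an illustration only: the helper name `missing_cpu_flags` is hypothetical, and it assumes the usual x86 layout with a "flags" line.

```python
def missing_cpu_flags(cpuinfo_text, wanted=("mmx", "sse", "sse2")):
    """Return the wanted flags that the first 'flags' line does not list."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return [f for f in wanted if f not in present]
    # No flags line found: report every wanted flag as missing.
    return list(wanted)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print(missing_cpu_flags(fh.read()))
    except OSError:
        print("no readable /proc/cpuinfo on this system")
```

An empty list means the CPU advertises all three extensions, in which case a missing instruction set is an unlikely explanation for that machine.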
(In reply to Ralph Gauer from comment #7)
> Maybe it's an illegal instruction in an early phase.
> Pentium III (Katmai) doesn't have "sse2".
> Pentium-S doesn't have "mmx", "sse" or "sse2".

I'm having this problem in a 32bit Tumbleweed VM (in VirtualBox) too, the host is an AMD Athlon64 which definitely has "mmx", "sse" and "sse2"...

The 4.7.6 kernel boots fine here as well.
*** Bug 1004949 has been marked as a duplicate of this bug. ***
I checked the installation on KVM, and the latest TW 32bit image could be installed / run fine. So it's likely depending on BIOS / CPU / whatever. Boris, do you know of any change in x86 code that may affect 32bit boot?
Just to be sure: did anyone try "dis_ucode_ldr" boot option?
(In reply to Takashi Iwai from comment #11)
> Just to be sure: did anyone try "dis_ucode_ldr" boot option?

Yes, that was the first thing I tried. But it didn't have any (positive) effect.

I tried it again to be sure (I could have made a typo), and no, it doesn't help here.
(In reply to Takashi Iwai from comment #10)
> Boris, do you know of any change in x86 code that may affect 32bit boot?

Hmm, nothing rings a bell. The only thing I can think of is bisection. Maybe people can try 4.8.1, 4.8.2 and this way gradually narrow it down.

Also, can people upload dmesg from a working kernel?

Thanks.
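The narrowing-down suggested above can be sketched as a binary search over an ordered list of kernel versions. This is an illustration under stated assumptions, not a record of what was tested: `first_bad` is a hypothetical helper, `boots` stands in for a manual boot test, and the set of "good" versions below is made up.

```python
def first_bad(versions, boots):
    """Binary-search an ordered version list for the first one that fails to boot.

    Assumes versions[0] boots, versions[-1] does not, and that the breakage
    persists in every later version once it has been introduced.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


candidates = ["4.7.6", "4.8.0", "4.8.1", "4.8.2", "4.8.3"]
# Stand-in for real boot tests; which versions boot here is invented.
pretend_good = {"4.7.6", "4.8.0"}
print(first_bad(candidates, lambda v: v in pretend_good))
```

With a kernel git tree, "git bisect" automates the same search at commit granularity.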
Created attachment 699697 [details] dmesg of working kernel 4.7.6-1-default on Pentium III (Katmai)
Created attachment 699699 [details] dmesg of working kernel 4.7.6-1-default on Pentium-S
Created attachment 699700 [details] dmesg of working kernel 4.7.6-1-default on Athlon64 inside Virtualbox

I tried booting kernel-vanilla-4.8.4 from the Tumbleweed repo, with the same problem.

I also tried booting *without* an initrd (by removing the initrd line from the boot menu entry), and it behaved the same. So I suppose we can rule out a problem with the initrd.

A dmesg from me too is attached, kernel 4.7.6-1-default inside VirtualBox with an Athlon64 3000+ on the host side.
(In reply to Takashi Iwai from comment #9)
> *** Bug 1004949 has been marked as a duplicate of this bug. ***

Kernel 4.8.4 32-bit with Tumbleweed in a Virtualbox VM does not boot, even if the IDE controller is removed and only the SATA controller with the hard drive is left.
Hm. The latest 32bit Krypton LiveCD (based on Tumbleweed) with kernel 4.8.4-pae boots fine on this host and also as a guest in vmware, but "crashes" (i.e. reboots immediately after the kernel/initrd is loaded) when running inside VirtualBox (all on the same host). So it may indeed be specific to the BIOS or to certain hardware other than the CPU...
Can you reproduce with qemu/kvm instead?
(In reply to Wolfgang Bauer from comment #18)
> but "crashes" (i.e.
> reboots immediately after the kernel/inintrd is loaded) when running inside
> VirtualBox (all on the same host).

I noticed that if I enable IO-APIC in the VM settings (under "System") the 4.8.4 kernel works; if it's disabled, it immediately reboots after the kernel is loaded.
Maybe this helps in finding the problem?

Also, when booting the LiveCD I noticed that this message is briefly displayed after the kernel is loaded, before the system reboots:
"Probing EDD (edd=off to disable)... OK"
(I don't see that message when booting the Tumbleweed installation.)

edd=off doesn't help though (it only makes this message disappear).
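The IO-APIC setting described above can also be toggled from the command line with VBoxManage instead of the "System" settings page. A minimal sketch, assuming VBoxManage is on the PATH and the VM is powered off; the helper `ioapic_command` and the VM name "tumbleweed-32" are hypothetical.

```python
def ioapic_command(vm_name, enabled=True):
    """Build the VBoxManage argument list that switches a VM's I/O APIC on or off."""
    return ["VBoxManage", "modifyvm", vm_name,
            "--ioapic", "on" if enabled else "off"]


# "tumbleweed-32" is a made-up VM name used only for illustration.
print(" ".join(ioapic_command("tumbleweed-32")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to apply the change.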
(In reply to Borislav Petkov from comment #19)
> Can you reproduce with qemu/kvm instead?

Sorry, I can't try that.
My CPU has no virtualization support...
(In reply to Wolfgang Bauer from comment #21)
> My CPU has no virtualization support...

You don't absolutely need hw virtualization support to install a guest in qemu.
(In reply to Wolfgang Bauer from comment #20)
> I noticed that if I enable IO-APIC in the VM settings (under "System") the
> 4.8.4 kernel works, if it's disabled it immediately reboots after the kernel
> is loaded.
> Maybe this helps in finding the problem?

Yap, that rings a bell. I'm willing to put some money on this:

ff8560512b8d ("x86/boot/smp: Don't try to poke disabled/non-existent APIC")

which is already queued for stable.

Wanna apply it on top of your kernel, rebuild and retest? I can help out along the way if you'd like :)
(In reply to Borislav Petkov from comment #23)
> Yap, that rings a bell. I'm willing to put some money on this:
>
> ff8560512b8d ("x86/boot/smp: Don't try to poke disabled/non-existent APIC")
>
> which is already queud for stable.
>
> Wanna apply it ontop of your kernel, rebuild and retest?

This one I suppose:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ff8560512b8d4b7ca3ef4fd69166634ac30b2525

I will try it out and report back.
(In reply to Wolfgang Bauer from comment #24)
> This one I suppose:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/
> ?id=ff8560512b8d4b7ca3ef4fd69166634ac30b2525
>
> I will try it out and report back.

Exactly and thanks!
(In reply to Wolfgang Bauer from comment #24)
> This one I suppose:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/
> ?id=ff8560512b8d4b7ca3ef4fd69166634ac30b2525
>
> I will try it out and report back.

Unfortunately this doesn't help with VirtualBox.

Here are my packages, if somebody wants to try that patch on real hardware:
http://download.opensuse.org/repositories/home:/wolfi323:/branches:/Kernel:/stable/standard/i586/
Yeah, this got reported on lkml too and we're debugging:

https://lkml.kernel.org/r/ca86ac5f-ce41-4773-98c5-9f55bac43503@default

I'll let you know once a fix is found. Thanks for giving it a try anyway.
(In reply to Wolfgang Bauer from comment #26)
> Here are my packages, if somebody wants to try that patch on real hardware:
> http://download.opensuse.org/repositories/home:/wolfi323:/branches:/Kernel:/
> stable/standard/i586/

Yes, other people on the bug with real hardware, please try this package.

Thanks.
I'm trying it right now (old computer processor VIA CN700). And it's fixed. The computer is booting.
Good, btw, this fix 85533a1ae7b6 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC") is in 4.8.5 which you can get from here: http://kernel.opensuse.org/packages/stable HTH.
(In reply to Borislav Petkov from comment #27)
> Yeah, this got reported on lkml too and we're debugging:
>
> https://lkml.kernel.org/r/ca86ac5f-ce41-4773-98c5-9f55bac43503@default
>
> I'll let you know once a fix is found.

Thank you.
But it's not a problem for me anyway, especially as I have a workaround now (enabling IO-APIC). ;-)

I will still give the patch posted in that mailing list thread a try though:
https://lkml.org/lkml/2016/10/28/581
(In reply to Wolfgang Bauer from comment #31)
> I will still give the patch posted that mailinglist thread a try though:
> https://lkml.org/lkml/2016/10/28/581

Sure, and please do report whether it worked or not...

Thanks.
Yes, it works. I applied the patch to kernel 4.8.5 from Kernel:stable, and the system now boots successfully regardless of whether IO-APIC is enabled or not.
Ok, fix is queued: http://git.kernel.org/tip/1e90a13d0c3dc94512af1ccb2b6563e8297838fa Closing.
*** Bug 1007746 has been marked as a duplicate of this bug. ***
Is there a Tumbleweed ISO file to download for an i586 CPU, with a Linux kernel that contains the fix from comment #34? I think openSUSE-Tumbleweed-DVD-i586-Snapshot20161031-Media.iso does not contain this fix.
(In reply to Egon Niessner from comment #36)
> Is there an Tumbleweed iso file to download for an i586 CPU,

Is it really a 32-bit-only CPU, or could you theoretically upgrade to 64-bit?
(In reply to Borislav Petkov from comment #37)
> (In reply to Egon Niessner from comment #36)
> > Is there an Tumbleweed iso file to download for an i586 CPU,
>
> Is it really a 32-bit-only CPU or you could theoretically upgrade to
> 64-bit?

Yes, it is a 32 bit AMD Athlon(tm) CPU.

Regards
Egon
Ok, so I'm being told that 4.8.6, which contains the fix, is in the queue. I'd keep checking the Changes* files here http://download.opensuse.org/tumbleweed/iso/ for the new kernel version to appear. HTH.
*** Bug 1006632 has been marked as a duplicate of this bug. ***
FYI, kernel 4.8.6 is included in today's new Tumbleweed snapshot, so it should be on the latest TW iso files too. And I can confirm that it boots successfully here (inside VirtualBox).
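A small sketch for checking whether an installed kernel is at least the 4.8.6 version reported above as booting again. The helper names are hypothetical, and the check assumes release strings of the form printed by "uname -r". Note that kernels before 4.8 (such as the working 4.7.6) also return False here, because the function only compares version numbers.

```python
import re

FIXED_IN = (4, 8, 6)  # per this report: the Tumbleweed kernel that boots again


def kernel_version(release):
    """Extract (major, minor, patch) from a release string such as '4.8.3-1-pae'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level is treated as 0.
    return tuple(int(g or 0) for g in m.groups())


def has_fix(release):
    """True if the release is numerically at or above FIXED_IN."""
    return kernel_version(release) >= FIXED_IN


for rel in ("4.8.3-1-pae", "4.8.4", "4.8.6-2.1"):
    print(rel, has_fix(rel))
```

On a running system the release string is available from `platform.release()`.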
I tried to boot from a DVD containing openSUSE-Tumbleweed-DVD-i586-Snapshot20161105-Media.iso (it contains kernel-default-4.8.6-2.1.i586).

During boot of the rescue system in safe mode (and with every other kernel setting that can be selected before booting), the rescue system crashes after loading the initrd. The last message after the backtrace is
"DWARF2 unwinder stuck at resume_userspace + 0xe/0x13 leftover inexact backtrace"

So Tumbleweed is not bootable on a real single-core i586 system. (On a Pentium 4 system with two 32-bit cores, this DVD can be booted.)
(In reply to Egon Niessner from comment #42)
> During boot of the rescue system in save mode (and all selectable kernel
> settings
> before booting),
> the rescue system crashes after loading the initrd
> with the last message after backtrace
> "DWARF2 unwinder stuck at resume_userspace + 0xe/0x13
> leftover inexact backtrace"

I suppose you should file a new bug report about this though. The original problem was different (a crash immediately after "Loading initial ramdisk..." without any further messages) and has been confirmed as fixed by the reporter too.

Btw, it's not a general problem with single core systems. Mine is single core too (though 64bit actually), and the reporter's probably as well (I don't think a VIA CN700 is dual core)... ;-)
I created a new bug report, 1009246, for this problem.

Regards
Egon