Bug 1098009 - kernel-default-4.4.136-56.1 causes userspace #GP on IBRS systems
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Distribution
Component: Kernel
Version: Leap 42.3
Hardware: x86-64 openSUSE 42.3
Importance: P5 - None, Critical, with 5 votes
Target Milestone: ---
Assigned To: E-mail List
Depends on:
Blocks:
Reported: 2018-06-18 09:26 UTC by David Kronlid
Modified: 2018-07-03 19:34 UTC (History)

See Also:
Found By: Community User
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Error message first boot (5.48 MB, image/jpeg)
2018-06-18 09:26 UTC, David Kronlid
Error message second boot (5.84 MB, image/jpeg)
2018-06-18 09:28 UTC, David Kronlid
hwinfo (1.48 MB, text/plain)
2018-06-18 10:40 UTC, David Kronlid
Skylake KVM virtual machine (15.53 KB, image/png)
2018-06-18 10:48 UTC, David Kronlid
full boot log from serial console (6.58 KB, application/x-xz)
2018-06-18 11:57 UTC, Ruediger Meier
The kernel from http://beta.suse.com/private/jkosina/bsc1098009/ caused another error on physical machine (4.24 MB, image/jpeg)
2018-06-18 13:49 UTC, David Kronlid

Description David Kronlid 2018-06-18 09:26:08 UTC
Created attachment 774323 [details]
Error message first boot

Patch openSUSE-2018-650 with kernel-default-4.4.136-56.1 results in an unusable system.

See picture of message during reboot.

Computer has:
CPU: https://ark.intel.com/products/88196/Intel-Core-i7-6700-Processor-8M-Cache-up-to-4_00-GHz
MB: https://msi.com/Motherboard/Z170A-GAMING-M5/Specification (with latest BIOS/UEFI)

Luckily grub2 still works and snapper rollback made it possible to revert to a previous state before patch.
Comment 1 David Kronlid 2018-06-18 09:28:35 UTC
Created attachment 774324 [details]
Error message second boot
Comment 2 Andreas Stieger 2018-06-18 09:37:30 UTC
(In reply to David Kronlid from comment #0)
> Luckily grub2 still works and snapper rollback made it possible to revert to
> a previous state before patch.

As the previous kernels are usually retained (multiversion), you could have just booted one of those.
Comment 3 Takashi Iwai 2018-06-18 10:02:35 UTC
Could you give hwinfo output running on the good kernel (4.4.132-*)?  Please attach the output to Bugzilla.

This doesn't seem to appear on every system, so it must be specific to the hardware.
Comment 4 David Kronlid 2018-06-18 10:31:21 UTC
I can confirm it is related to the hardware; it seems to be related to the CPU.

I tried running openSUSE 42.3 in a KVM virtual machine.
If I set the CPU to KVM64 or core2duo, it works fine.
If I set it to use the host CPU (Skylake-Client-IBRS), it gives the same error message about "DWARF2 unwinder stuck".
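To check whether a given machine or guest falls into the group described above, one option is to look for an IBRS-related flag in /proc/cpuinfo. The following is a minimal sketch, not something taken from this report: the flag names in ASSUMED_IBRS_FLAGS are an assumption, since they differ between kernel versions and vendor backports, and the helper names are made up for illustration.

```python
# Flag names are an assumption: they vary between kernel versions and backports.
ASSUMED_IBRS_FLAGS = {"ibrs", "spec_ctrl"}


def cpu_flags(cpuinfo_text):
    """Return the flags of the first 'flags' line found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def advertises_ibrs(cpuinfo_text):
    """True if any of the assumed IBRS flag names is present."""
    return bool(cpu_flags(cpuinfo_text) & ASSUMED_IBRS_FLAGS)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("IBRS advertised:", advertises_ibrs(handle.read()))
    except OSError as exc:
        print("could not read /proc/cpuinfo:", exc)
```

Running it inside the guest once with the host CPU model and once with KVM64 should show whether the flag follows the CPU model selection.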
Comment 5 David Kronlid 2018-06-18 10:40:12 UTC
Created attachment 774335 [details]
hwinfo
Comment 6 Takashi Iwai 2018-06-18 10:42:21 UTC
Thanks, that's damn good information!

Boris, Jiri, something got screwed up with the recent update.
Comment 7 David Kronlid 2018-06-18 10:48:18 UTC
Created attachment 774338 [details]
Skylake KVM virtual machine
Comment 9 Marcus Meissner 2018-06-18 11:13:07 UTC
I removed the update from the channel again.
Comment 10 Ruediger Meier 2018-06-18 11:53:48 UTC
I have the same problem. The vanilla kernel (4.4.136-56.1.x86_64) seems to work.
Comment 11 Ruediger Meier 2018-06-18 11:57:48 UTC
Created attachment 774346 [details]
full boot log from serial console
Comment 12 Jiri Kosina 2018-06-18 12:41:47 UTC
{ENABLE|DISABLE}_IBRS is clobbering %rax while it should not. I am building a testing kernel with a candidate fix.
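Why a clobbered %rax matters can be shown with a toy register model. WRMSR takes the MSR number in ECX and the value in EDX:EAX, so any macro that writes IA32_SPEC_CTRL (MSR 0x48, bit 0 = IBRS) has to load those registers and must restore them if the surrounding code still needs their contents. The sketch below is only an illustration in Python; it is not the actual fix, whose content is not shown in this report, and the function names are hypothetical.

```python
MSR_IA32_SPEC_CTRL = 0x48  # architectural MSR number
SPEC_CTRL_IBRS = 1 << 0    # bit 0 enables IBRS


def wrmsr(regs, msrs):
    """Model WRMSR: write EDX:EAX to the MSR selected by ECX."""
    value = ((regs["rdx"] & 0xFFFFFFFF) << 32) | (regs["rax"] & 0xFFFFFFFF)
    msrs[regs["rcx"]] = value


def enable_ibrs_clobbering(regs, msrs):
    """Hypothetical buggy macro: loads its operands and leaves them behind."""
    regs["rcx"] = MSR_IA32_SPEC_CTRL
    regs["rdx"] = 0
    regs["rax"] = SPEC_CTRL_IBRS
    wrmsr(regs, msrs)


def enable_ibrs_preserving(regs, msrs):
    """Hypothetical fixed macro: saves and restores every register it touches."""
    saved = {name: regs[name] for name in ("rax", "rcx", "rdx")}
    enable_ibrs_clobbering(regs, msrs)
    regs.update(saved)


if __name__ == "__main__":
    regs = {"rax": 0xDEAD, "rcx": 1, "rdx": 2}
    msrs = {}
    enable_ibrs_clobbering(regs, msrs)
    print("after clobbering variant:", regs)
```

In the clobbering variant the caller's value in rax is gone after the macro runs; the preserving variant performs the same MSR write and leaves all three registers as it found them.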
Comment 13 Jiri Kosina 2018-06-18 12:47:26 UTC
David, Ariel, Ruediger, could you please test with kernel (*) from

   http://beta.suse.com/private/jkosina/bsc1098009/

and report back?

(*) please understand that it's my private testing build, not an officially SUSE
    released kernel yet

Thanks!
Comment 14 Takashi Iwai 2018-06-18 13:11:08 UTC
(In reply to Jiri Kosina from comment #13)
> David, Ariel, Ruediger, could you please test with kernel (*) from
> 
>    http://beta.suse.com/private/jkosina/bsc1098009/
> 
> and report back?
> 
> (*) please understand that it's my private testing build, not an officially
> SUSE
>     released kernel yet

Also note that this is an SLE12-SP3 kernel, not a Leap 42.3 one, so it has a slightly different kernel config.  It should still suffice for checking kernel-default.rpm, just to see whether it boots or not.

I tested the test kernel on a KVM with -cpu Skylake-Client-IBRS, and it looks promising.  The boot is OK and X comes up.
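For anyone who wants to repeat this kind of test, the sketch below builds a QEMU command line for a chosen CPU model. It is a rough outline under assumptions: the disk image name leap423.qcow2, the memory size and the helper name qemu_argv are placeholders, and the image is assumed to already have the kernel under test installed.

```python
import shlex


def qemu_argv(disk_image, cpu_model="Skylake-Client-IBRS", memory_mb=2048):
    """Build an argument list that boots disk_image under KVM with cpu_model."""
    return [
        "qemu-system-x86_64",
        "-enable-kvm",
        "-cpu", cpu_model,
        "-m", str(memory_mb),
        "-drive", f"file={disk_image},format=qcow2",
        # Expose the guest's first serial port on the terminal so that a boot
        # log can be captured if the guest kernel is told to use it.
        "-serial", "stdio",
    ]


if __name__ == "__main__":
    # Print one command per CPU model so the two boots can be compared.
    for model in ("Skylake-Client-IBRS", "kvm64"):
        print(shlex.join(qemu_argv("leap423.qcow2", model)))
```

Booting the same image with both models separates a CPU-dependent failure from a general one, which is how the problem was narrowed down in comment 4.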
Comment 15 David Kronlid 2018-06-18 13:19:01 UTC
(In reply to Jiri Kosina from comment #13)
> David, Ariel, Ruediger, could you please test with kernel (*) from
> 
>    http://beta.suse.com/private/jkosina/bsc1098009/
> 
> and report back?
> 
> (*) please understand that it's my private testing build, not an officially
> SUSE
>     released kernel yet
> 
> Thanks!

I tested http://beta.suse.com/private/jkosina/bsc1098009/kernel-default-4.4.138-ibrs.off.lfence.revert.x86_64.rpm
On both Skylake-Client-IBRS and Skylake-Client it works fine in virtual machines using qemu-kvm, with no error message like before.
Comment 16 Ruediger Meier 2018-06-18 13:28:07 UTC
(In reply to Jiri Kosina from comment #13)
> David, Ariel, Ruediger, could you please test with kernel (*) from

No more panic on two tested machines:
  - 4.4.138-ibrs.off.lfence.revert-default  
  - Fujitsu D34XX / i5-6500,i7-6700
Comment 17 Ruediger Meier 2018-06-18 13:30:02 UTC
In case somebody missed that ... the panic also happened on real hardware without KVM.
Comment 18 Jiri Kosina 2018-06-18 13:31:38 UTC
Thanks a lot for the prompt testing.

Pushed out as baa07f9df9.
Comment 19 David Kronlid 2018-06-18 13:49:19 UTC
Created attachment 774357 [details]
The kernel from http://beta.suse.com/private/jkosina/bsc1098009/ caused another error on physical machine

I tested the "beta kernel" on my physical machine with skylake cpu and it caused another error message, see picture.

Hopefully this is only because it's a kernel based on SLE 12 SP3 running on Leap 42.3
Comment 20 Takashi Iwai 2018-06-18 13:56:46 UTC
(In reply to David Kronlid from comment #19)
> I tested the "beta kernel" on my physical machine with skylake cpu and it
> caused another error message, see picture.

Are you using secure boot?

Anyway, the Leap 42.3 test kernel is being built in the OBS Kernel:openSUSE-42.3:Submit repo.  The build should finish in an hour or so.

If you're not using secure boot, this kernel should work as is.  If it doesn't, please report back ASAP.  Thanks.
Comment 21 David Kronlid 2018-06-18 14:09:28 UTC
Yes, I'm using secure boot!

Also, that machine has an NVMe drive with the UEFI files on it but boots from another SATA SSD, so there are several non-standard configurations. It's my gaming machine, which I also use as a test machine before I start applying patches, updates or distro upgrades to my servers. I also test on a virtual machine running on a machine with an AMD CPU before I start updating all machines.

Testing on unimportant machines before applying things to important machines that need to run 24/7 has helped me many times over the years. This procedure is especially important on Ubuntu servers, as they don't have "snapper rollback", but I also do it for openSUSE machines even though they have "snapper rollback".
Comment 22 Takashi Iwai 2018-06-18 14:14:50 UTC
(In reply to David Kronlid from comment #21)
> Yes, I'm using secure boot!

OK, then it's a bit tricky to make the test kernel work on it.

I'm going to check the currently built update kernel (in OBS Kernel:openSUSE-42.3:Submit) and submit it as the next update kernel today.

Please check the kernel in
  http://download.opensuse.org/update/leap/42.3-test/
once the new kernel appears.
Comment 24 Swamp Workflow Management 2018-06-18 16:50:34 UTC
This is an autogenerated message for OBS integration:
This bug (1098009) was mentioned in
https://build.opensuse.org/request/show/617550 42.3 / kernel-source
Comment 26 David Kronlid 2018-06-19 20:57:41 UTC
I tried the kernel from http://download.opensuse.org/update/leap/42.3-test/x86_64/kernel-default-4.4.138-59.1.x86_64.rpm

It works fine both on my physical Skylake CPU i7-6700 and on the emulated Skylake-Client-IBRS CPU in QEMU-KVM.

Thanks for fixing this quickly!
Comment 27 Ariel Machado 2018-06-20 09:01:57 UTC
(In reply to Jiri Kosina from comment #13)
> David, Ariel, Ruediger, could you please test with kernel (*) from

I confirm that it runs fine on my physical machine i7-6700T.
Tested kernel from http://download.opensuse.org/update/leap/42.3-test/x86_64/kernel-default-4.4.138-59.1.x86_64.rpm
(4.4.138-59-default #1 SMP Mon Jun 18 13:48:42 UTC 2018 (f0b8f6b) x86_64 x86_64 x86_64 GNU/Linux)
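A release string like the one above can be compared against the versions named in this report to tell whether a Leap 42.3 machine is still on the broken kernel. The sketch below is an illustration only: the function names are made up, the comparison is meaningful for Leap 42.3 kernel-default builds only, and custom builds such as 4.4.138-ibrs.off.lfence.revert-default are deliberately rejected because they carry no numeric release.

```python
import re

BROKEN = (4, 4, 136, 56)  # kernel-default-4.4.136-56.1, from the bug title
FIXED = (4, 4, 138, 59)   # kernel-default-4.4.138-59.1, confirmed in comments 26 and 27


def parse_release(release):
    """Turn a string such as '4.4.138-59-default' into (4, 4, 138, 59)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def status(release):
    """Classify a Leap 42.3 kernel release relative to this bug."""
    version = parse_release(release)
    if version == BROKEN:
        return "broken"
    if version >= FIXED:
        return "fixed"
    return "unaffected or unknown"


if __name__ == "__main__":
    import platform

    try:
        print(platform.release(), "->", status(platform.release()))
    except ValueError as exc:
        print(exc)
```

Anything older than the broken build, such as the 4.4.132 kernel mentioned in comment 3, lands in the last category because this report only establishes one broken and one fixed version.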
Comment 28 Takashi Iwai 2018-06-20 21:00:12 UTC
The fix went out and has been confirmed.  Let's close.
Comment 29 Swamp Workflow Management 2018-06-21 16:18:14 UTC
SUSE-SU-2018:1772-1: An update that solves 6 vulnerabilities and has 47 fixes is now available.

Category: security (important)
Bug References: 1012382,1024718,1031717,1035432,1041740,1045330,1056415,1066223,1068032,1068054,1068951,1070404,1073311,1075428,1076049,1078583,1079152,1080542,1080656,1081500,1081514,1082153,1082504,1082979,1085185,1085308,1086400,1086716,1087036,1087086,1088871,1090435,1090534,1090734,1090955,1091594,1094532,1095042,1095147,1096037,1096140,1096214,1096242,1096281,1096751,1096982,1097234,1097356,1098009,1098012,971975,973378,978907
CVE References: CVE-2017-17741,CVE-2017-18241,CVE-2017-18249,CVE-2018-12233,CVE-2018-3665,CVE-2018-5848
Sources used:
SUSE Linux Enterprise Workstation Extension 12-SP3 (src):    kernel-default-4.4.138-94.39.1
SUSE Linux Enterprise Software Development Kit 12-SP3 (src):    kernel-docs-4.4.138-94.39.1, kernel-obs-build-4.4.138-94.39.1
SUSE Linux Enterprise Server 12-SP3 (src):    kernel-default-4.4.138-94.39.1, kernel-source-4.4.138-94.39.1, kernel-syms-4.4.138-94.39.1
SUSE Linux Enterprise Live Patching 12-SP3 (src):    kgraft-patch-SLE12-SP3_Update_14-1-4.5.1
SUSE Linux Enterprise High Availability 12-SP3 (src):    kernel-default-4.4.138-94.39.1
SUSE Linux Enterprise Desktop 12-SP3 (src):    kernel-default-4.4.138-94.39.1, kernel-source-4.4.138-94.39.1, kernel-syms-4.4.138-94.39.1
SUSE CaaS Platform ALL (src):    kernel-default-4.4.138-94.39.1
Comment 30 Swamp Workflow Management 2018-06-21 16:30:44 UTC
openSUSE-SU-2018:1773-1: An update that solves 11 vulnerabilities and has 66 fixes is now available.

Category: security (important)
Bug References: 1012382,1019695,1019699,1022604,1022607,1022743,1024718,1031492,1031717,1035432,1036215,1041740,1045330,1056415,1066223,1068032,1068054,1068951,1070404,1073311,1075428,1076049,1078583,1079152,1080542,1080656,1081500,1081514,1082153,1082504,1082979,1085308,1086400,1086716,1087007,1087012,1087036,1087082,1087086,1087095,1088871,1090435,1090534,1090734,1090955,1091594,1091815,1092552,1092813,1092903,1093533,1093904,1094177,1094268,1094353,1094356,1094405,1094466,1094532,1094823,1094840,1095042,1095147,1096037,1096140,1096214,1096242,1096281,1096751,1096982,1097234,1097356,1098009,1098012,971975,973378,978907
CVE References: CVE-2017-13305,CVE-2017-17741,CVE-2017-18241,CVE-2017-18249,CVE-2018-1092,CVE-2018-1093,CVE-2018-1094,CVE-2018-12233,CVE-2018-3639,CVE-2018-3665,CVE-2018-5848
Sources used:
openSUSE Leap 42.3 (src):    kernel-debug-4.4.138-59.1, kernel-default-4.4.138-59.1, kernel-docs-4.4.138-59.1, kernel-obs-build-4.4.138-59.1, kernel-obs-qa-4.4.138-59.1, kernel-source-4.4.138-59.1, kernel-syms-4.4.138-59.1, kernel-vanilla-4.4.138-59.1
Comment 33 Swamp Workflow Management 2018-06-26 16:30:32 UTC
SUSE-SU-2018:1816-1: An update that solves 17 vulnerabilities and has 109 fixes is now available.

Category: security (important)
Bug References: 1009062,1012382,1019695,1019699,1022604,1022607,1022743,1024718,1031717,1035432,1036215,1041740,1043598,1044596,1045330,1056415,1056427,1060799,1066223,1068032,1068054,1068951,1070404,1073059,1073311,1075087,1075428,1076049,1076263,1076805,1078583,1079152,1080157,1080542,1080656,1081500,1081514,1081599,1082153,1082299,1082485,1082504,1082962,1082979,1083635,1083650,1083900,1084721,1085185,1085308,1086400,1086716,1087007,1087012,1087036,1087082,1087086,1087095,1088810,1088871,1089023,1089115,1089393,1089895,1090225,1090435,1090534,1090643,1090658,1090663,1090708,1090718,1090734,1090953,1090955,1091041,1091325,1091594,1091728,1091960,1092289,1092497,1092552,1092566,1092772,1092813,1092888,1092904,1092975,1093008,1093035,1093144,1093215,1093533,1093904,1093990,1094019,1094033,1094059,1094177,1094268,1094353,1094356,1094405,1094466,1094532,1094823,1094840,1095042,1095147,1096037,1096140,1096214,1096242,1096281,1096751,1096982,1097234,1097356,1098009,1098012,919144,971975,973378,978907,993388
CVE References: CVE-2017-13305,CVE-2017-17741,CVE-2017-18241,CVE-2017-18249,CVE-2018-1000199,CVE-2018-1065,CVE-2018-1092,CVE-2018-1093,CVE-2018-1094,CVE-2018-1130,CVE-2018-12233,CVE-2018-3639,CVE-2018-3665,CVE-2018-5803,CVE-2018-5848,CVE-2018-7492,CVE-2018-8781
Sources used:
SUSE Linux Enterprise Real Time Extension 12-SP3 (src):    kernel-rt-4.4.138-3.14.1, kernel-rt_debug-4.4.138-3.14.1, kernel-source-rt-4.4.138-3.14.1, kernel-syms-rt-4.4.138-3.14.1