Bugzilla – Bug 908163
Start job for kernel modules stays forever on packaged 3.17.2, 3.17.4 (git is fine)
Last modified: 2018-07-03 20:52:42 UTC
Today (2014-12-03) I upgraded all packages from openSUSE Factory. The update includes kernel-desktop:
i | kernel-desktop | package | 3.17.2-1.2 | x86_64 | repo-oss
When I boot, I get a message like "Start job Kernel Module Loading" that stays forever. I thought I'd try another kernel, so I used 3.17.4 from the stable OBS repo:
i | kernel-desktop | package | 3.17.4-2.1.g2d23787 | x86_64 | Kernel:stable
Same problem. I then compiled 3.17.4 from kernel.org's linux-stable sources, using the config from SUSE's /boot. Surprisingly, this kernel does NOT have the hanging problem. So I suspect that something special is happening in the packaged version of 3.17.2 (or .4) which doesn't happen when compiling the kernel from git. There are also suspicious messages in the log; I'll attach the log messages for both the packaged version and the other one.
Created attachment 615740 [details] Log from the packaged 3.17.4
Created attachment 615741 [details] Log from the kernel compiled from git (tag 3.17.4)
My hardware is the Dell XPS 13 9333 (Developer edition, which came preloaded with Ubuntu). Please note that the stock kernel 3.17.1 was working fine before on this hardware, so this is a regression. Also please disregard the "mei" related debug messages which I enabled to debug other issues. Disabling these debug messages didn't solve the issue.
Seeing iwlwifi makes me think of bug 891645, but then it would continue after 2 minutes. Do you load certain kernel modules on boot? Can you try to boot with more debug output, e.g. splash=verbose, omit the "quiet" flag, and follow http://freedesktop.org/wiki/Software/systemd/Debugging/ ? Maybe one of the modules is waiting for something (firmware, hardware) to appear.
Created attachment 615820 [details] verbose dmesg from systemd
I don't explicitly load any special modules on boot, but I could imagine that some modules would load for the stock kernel but not the other one (I had that with vboxdrv). When testing boot I also uninstalled all packages that have "kmp" in their names. Waiting 10 minutes at boot doesn't auto-kill the stalled systemd-modules-load.service. Thanks to your link I could get a shell and have attached the dmesg from that boot. Unfortunately I didn't find any clues in it as to which kernel module would be causing trouble.
I diffed lsmod between stock and git versions and found out that I had blacklisted i915 due to some tests related to suspend. So now I un-blacklisted i915 but the problem persists. Tomorrow I'll redo the systemd verbose log thing for 3.17.4 from kernel-stable and also for 3.17.4 from git, then maybe a diff will help reveal what part of their difference is causing the stalling service.
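The lsmod comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names and the two sample listings are made up for the example and are not taken from the attached logs.

```python
def loaded_modules(lsmod_output):
    """Return the set of module names found in text printed by `lsmod`."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def compare_lsmod(stock, git):
    """Return (modules only in stock, modules only in git), each sorted."""
    stock_names = loaded_modules(stock)
    git_names = loaded_modules(git)
    return sorted(stock_names - git_names), sorted(git_names - stock_names)


# Hypothetical sample listings, NOT the real output from this machine.
STOCK_LSMOD = """\
Module                  Size  Used by
encrypted_keys         20480  0
trusted                16384  1 encrypted_keys
btrfs                1000000  0
"""

GIT_LSMOD = """\
Module                  Size  Used by
btrfs                1000000  0
i915                  900000  0
"""

if __name__ == "__main__":
    only_stock, only_git = compare_lsmod(STOCK_LSMOD, GIT_LSMOD)
    print("only in stock kernel:", only_stock)
    print("only in git kernel:  ", only_git)
```

Saving `lsmod` output under each kernel and feeding both files to `compare_lsmod` gives the two lists directly, which is easier to read than a line diff because sizes and use counts are ignored.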
When you enable/disable blacklisting of i915, you have to recreate the initrd, too. Run mkinitrd once and retry. I see the HD-audio driver complains about the lack of i915 symbols for HDMI audio, but I'm not sure whether this can be the cause.
Seems that a crypt module triggers a kernel oops:
-->--
[ 20.118036] systemd[442]: Executing: /usr/lib/systemd/systemd-modules-load
[ 20.120079] BUG: unable to handle kernel paging request at ffffffffa001b000
[ 20.120084] IP: [<ffffffff812a3188>] register_key_type+0x48/0xb0
[ 20.120086] PGD 1e17067 PUD 1e18063 PMD bd6f3067 PTE 0
[ 20.120088] Oops: 0000 [#1] PREEMPT SMP
[ 20.120096] Modules linked in: encrypted_keys(+) btrfs xor raid6_pq crc32c_intel sdhci_acpi sdhci xhci_hcd mmc_core video button sg trusted tpm
[ 20.120099] CPU: 1 PID: 442 Comm: systemd-modules Not tainted 3.17.4-2.g2d23787-desktop #1
[ 20.120100] Hardware name: Dell Inc. XPS13 9333/0GFTRT, BIOS A02 12/11/2013
[ 20.120102] task: ffff8800377c6090 ti: ffff8802135ec000 task.ti: ffff8802135ec000
[ 20.120106] RIP: 0010:[<ffffffff812a3188>] [<ffffffff812a3188>] register_key_type+0x48/0xb0
[ 20.120107] RSP: 0018:ffff8802135efd10 EFLAGS: 00010202
[ 20.120108] RAX: ffff8800377c6090 RBX: ffffffffa001b000 RCX: 0000000000000000
[ 20.120109] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81e91a00
[ 20.120110] RBP: ffffffffa006115a R08: 000000000000ffff R09: ffff880037650ec0
[ 20.120111] R10: 0000000000000029 R11: 00000000000022cc R12: ffffffffa001b070
[ 20.120113] R13: ffffffffa0062000 R14: ffffffffa00620d0 R15: 0000000000000001
[ 20.120115] FS: 00007f7897799740(0000) GS:ffff88021f240000(0000) knlGS:0000000000000000
[ 20.120116] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 20.120117] CR2: ffffffffa001b000 CR3: 0000000213a8b000 CR4: 00000000001407e0
[ 20.120118] Stack:
[ 20.120121] ffffffff81e1b020 ffff8802120d4340 0000000000000000 ffffffffa000b000
[ 20.120123] ffffffffa000b06f ffff8802920d4340 ffffffff81e1b020 ffffffff8100030c
[ 20.120126] ffff88021f4ea000 ffffffffa00620d0 0000000000000001 0000000000000282
[ 20.120126] Call Trace:
[ 20.120135] [<ffffffffa000b06f>] init_encrypted+0x6f/0x1000 [encrypted_keys]
[ 20.120145] [<ffffffff8100030c>] do_one_initcall+0xcc/0x200
[ 20.120153] [<ffffffff810db4d2>] load_module+0x2372/0x2700
[ 20.120160] [<ffffffff810db9b5>] SYSC_finit_module+0x75/0xa0
[ 20.120165] [<ffffffff816354ad>] system_call_fastpath+0x1a/0x1f
[ 20.120169] [<00007f7896eb8829>] 0x7f7896eb8829
[ 20.120183] Code: cf e8 be 00 49 81 fc 30 1a e9 81 49 8d 5c 24 90 74 42 49 8b 6d 00 eb 13 0f 1f 00 48 8b 43 70 48 3d 30 1a e9 81 48 8d 58 90 74 29 <48> 8b 3b 48 89 ee e8 2d 5b 08 00 85 c0 75 e1 bb ef ff ff ff 48
[ 20.120185] RIP [<ffffffff812a3188>] register_key_type+0x48/0xb0
[ 20.120186] RSP <ffff8802135efd10>
[ 20.120186] CR2: ffffffffa001b000
[ 20.120188] ---[ end trace 35b66b44b468cfb4 ]---
--<--
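In the "Modules linked in:" line of the oops, encrypted_keys carries a "(+)" suffix, which the kernel prints for a module that is still being loaded ("(-)" marks one being unloaded). A minimal Python sketch for pulling such modules out of a saved dmesg follows; the function name is invented for this example, and the sample line is copied from the log above.

```python
import re


def modules_in_transition(dmesg_text):
    """Return {module: '+' or '-'} for modules flagged as loading/unloading.

    Only "Modules linked in:" lines are inspected. Optional taint letters
    before the sign, as in "foo(OE+)", are accepted and ignored.
    """
    flagged = {}
    for line in dmesg_text.splitlines():
        _, marker, rest = line.partition("Modules linked in:")
        if not marker:
            continue
        for name, sign in re.findall(r"(\w+)\([A-Z]*([+-])\)", rest):
            flagged[name] = sign
    return flagged


# One line taken from the oops quoted in this comment.
SAMPLE = (
    "[ 20.120096] Modules linked in: encrypted_keys(+) btrfs xor raid6_pq "
    "crc32c_intel sdhci_acpi sdhci xhci_hcd mmc_core video button sg trusted tpm"
)

if __name__ == "__main__":
    print(modules_in_transition(SAMPLE))
```

For this oops the only flagged module is encrypted_keys, which matches the init_encrypted frame in the call trace.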
Created attachment 615875 [details] diff kernel config between failing packaged kernel and working kernel
Okay, I've got some news. The 3.17.4 kernel I compiled from git didn't have exactly the same config as the package: I seem to have used the config from 3.16.4-1-g7a8842b-desktop. To get consistent results I recompiled the git version using the correct config file, and now I'm getting the same systemd error for BOTH kernels. When compiling I simply copied config-3.17.4-1.g7a8842b-desktop over to .config, ran "make menuconfig", then saved directly. I have attached the diff between the configs:
config-3.17.4-2.g2d23787-desktop is the one from the SUSE package.
config-3.17.4-1.g7a8842b-desktop is the one of the working kernel from git (the one I mistakenly took from 3.16.4).
It looks like some crypto modules are present in the SUSE package. As a next step I can try to blacklist them / exclude them from the compiled kernel to find out which one it is.
Created attachment 615877 [details] verbose dmesg from systemd, with i915 unblacklisted
Since I mistakenly blacklisted i915, I redid the verbose systemd run and have updated the attachment. It still seems to have the crypto fail:
[ 14.138948] Modules linked in: encrypted_keys(+) btrfs i915 i2c_algo_bit xor drm_kms_helper drm raid6_pq crc32c_intel xhci_hcd video sdhci_acpi sdhci mmc_core button sg trusted tpm
[ 14.138951] CPU: 1 PID: 438 Comm: systemd-modules Not tainted 3.17.4-2.g2d23787-desktop #1
[ 14.138952] Hardware name: Dell Inc. XPS13 9333/0GFTRT, BIOS A02 12/11/2013
[ 14.138954] task: ffff880212d74150 ti: ffff880213a74000 task.ti: ffff880213a74000
[ 14.138958] RIP: 0010:[<ffffffff812a3188>] [<ffffffff812a3188>] register_key_type+0x48/0xb0
[ 14.138959] RSP: 0018:ffff880213a77d10 EFLAGS: 00010202
[ 14.138960] RAX: ffff880212d74150 RBX: ffffffffa001b000 RCX: 0000000000000000
[ 14.138961] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81e91a00
[ 14.138963] RBP: ffffffffa00a015a R08: 000000000000ffff R09: ffff8800377581c0
[ 14.138964] R10: 0000000000000029 R11: 00000000000022cc R12: ffffffffa001b070
[ 14.138965] R13: ffffffffa00a1000 R14: ffffffffa00a10d0 R15: 0000000000000001
[ 14.138967] FS: 00007f28a173e740(0000) GS:ffff88021f240000(0000) knlGS:0000000000000000
[ 14.138968] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 14.138970] CR2: ffffffffa001b000 CR3: 0000000213558000 CR4: 00000000001407e0
[ 14.138971] Stack:
[ 14.138973] ffffffff81e1b020 ffff8800374284a0 0000000000000000 ffffffffa00ab000
[ 14.138976] ffffffffa00ab06f ffff8800b74284a0 ffffffff81e1b020 ffffffff8100030c
[ 14.138978] ffff88021f4ea000 ffffffffa00a10d0 0000000000000001 0000000000000286
[ 14.138979] Call Trace:
[ 14.138988] [<ffffffffa00ab06f>] init_encrypted+0x6f/0x1000 [encrypted_keys]
[ 14.138998] [<ffffffff8100030c>] do_one_initcall+0xcc/0x200
[ 14.139006] [<ffffffff810db4d2>] load_module+0x2372/0x2700
[ 14.139013] [<ffffffff810db9b5>] SYSC_finit_module+0x75/0xa0
[ 14.139020] [<ffffffff816354ad>] system_call_fastpath+0x1a/0x1f
[ 14.139025] [<00007f28a0e5d829>] 0x7f28a0e5d829
[ 14.139051] Code: cf e8 be 00 49 81 fc 30 1a e9 81 49 8d 5c 24 90 74 42 49 8b 6d 00 eb 13 0f 1f 00 48 8b 43 70 48 3d 30 1a e9 81 48 8d 58 90 74 29 <48> 8b 3b 48 89 ee e8 2d 5b 08 00 85 c0 75 e1 bb ef ff ff ff 48
[ 14.139055] RIP [<ffffffff812a3188>] register_key_type+0x48/0xb0
[ 14.139055] RSP <ffff880213a77d10>
[ 14.139056] CR2: ffffffffa001b000
[ 14.139058] ---[ end trace dc897a0c60632a0e ]---
I did another test run but changed SHA256 to be a module instead of compiled in. And now it works fine!
diff brokenconfig workingconfig
528,529d527
< CONFIG_KEXEC_FILE=y
< CONFIG_KEXEC_VERIFY_SIG=y
6550c6548
< CONFIG_CRYPTO_SHA256=y
---
> CONFIG_CRYPTO_SHA256=m
Not sure how these "KEXEC" parts sneaked in though. Has this been changed recently to have it compiled in? (At least it seems it wasn't in 3.16.4.)
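For comparing two kernel configs by option rather than by line number, a small Python sketch like the following may help. The function names are made up for this example, and the two sample configs contain only the three options from the diff in this comment, not the full config files.

```python
def parse_config(text):
    """Map CONFIG_* names to values; '# CONFIG_X is not set' lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            options[name] = value
    return options


def config_changes(broken, working):
    """Return {option: (value in broken, value in working)} for differing options.

    None means the option is not set in that config.
    """
    a, b = parse_config(broken), parse_config(working)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# Reduced samples built from the diff quoted in this comment.
BROKEN = """\
CONFIG_KEXEC_FILE=y
CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_CRYPTO_SHA256=y
"""

WORKING = """\
# CONFIG_KEXEC_FILE is not set
CONFIG_CRYPTO_SHA256=m
"""

if __name__ == "__main__":
    for name, (old, new) in config_changes(BROKEN, WORKING).items():
        print(f"{name}: {old} -> {new}")
```

Keying on the option name keeps the result stable even when unrelated options shift the line numbers between the two files.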
Wow. That is a kernel bug during the loading of encrypted-keys.ko, happening in the register_key_type function from linux/security/keys/key.c
The kernel oops itself is possibly a result of memory corruption. init_encrypted() may return an error from aes_get_sizes() at its end, but the error path does no resource management, so whatever was set up earlier in the function is not undone. That is, if loading this module fails in the initrd, it'll be loaded again via /usr/lib/modules-load.d/*, and it hits the broken item in the list. A fix patch is below. But this rests on the assumption that the aes_get_sizes() call failed, that is, that the corresponding blkcipher algorithm ("cbc(aes)") isn't available for some reason.
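The failure mode described above (an init function that registers something, then fails without undoing it, then gets run a second time) can be illustrated with a toy Python model. This is not kernel code and does not claim to mirror the attached patch: the class, the function names, and the "already registered" error are all invented for the sketch. In the real kernel the stale entry points into freed module memory, which is worse than the clean exception shown here.

```python
class CipherUnavailable(Exception):
    """Stands in for aes_get_sizes() failing because cbc(aes) is missing."""


class KeyTypeRegistry:
    """Toy stand-in for the kernel's list of registered key types."""

    def __init__(self):
        self.names = []

    def register(self, name):
        if name in self.names:
            raise RuntimeError(f"{name} is already registered (stale entry)")
        self.names.append(name)

    def unregister(self, name):
        self.names.remove(name)


def check_cipher(available):
    if not available:
        raise CipherUnavailable("cbc(aes) not available")


def init_leaky(registry, cipher_available):
    """Registers first, then fails without cleanup: the entry is left behind."""
    registry.register("encrypted")
    check_cipher(cipher_available)


def init_with_unwind(registry, cipher_available):
    """Same steps, but the registration is undone when the later step fails."""
    registry.register("encrypted")
    try:
        check_cipher(cipher_available)
    except CipherUnavailable:
        registry.unregister("encrypted")
        raise


def load_twice(init):
    """First load fails (as in the initrd), second load happens later in boot."""
    registry = KeyTypeRegistry()
    try:
        init(registry, cipher_available=False)
    except CipherUnavailable:
        pass
    init(registry, cipher_available=True)
    return registry.names


if __name__ == "__main__":
    print("with unwind:", load_twice(init_with_unwind))
    try:
        load_twice(init_leaky)
    except RuntimeError as err:
        print("leaky init:", err)
```

An alternative to unwinding is to do the step that can fail before registering anything, so there is nothing to undo; either way the second load starts from a clean list.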
Created attachment 615931 [details] test fix patch
Created attachment 615932 [details] Revised test patch
I could reproduce the issue on a VM here, too. I had to add the encrypted-keys module to the initrd and to /etc/modules-load.d/*. Checking now whether the patch can really fix it...
OK, the patch seems to work in a quick test. I'm going to submit the fix upstream.
I applied the revised patch on top of git 3.17.4 using the stock config (the one that had SHA256 compiled in) and I can confirm that it WORKS on my laptop as well :-)
Thanks for testing. I'll merge the patch once it's accepted upstream.
I merged the patch now to SUSE git branches: openSUSE-13.1, openSUSE-13.2, stable and SLE12.
It looks like the patch hasn't reached openSUSE Factory yet. I tried installing:
i | kernel-desktop | package | 3.17.4-1.2 | x86_64 | repo-oss
but am still getting the error. However, the kernel from the "Kernel:stable:standard" repo has it:
i | kernel-desktop | package | 3.17.4-4.1.g8622a2e | x86_64 | Kernel:stable
I suppose that this was not omitted and the Factory package will catch up eventually.
Factory doesn't always take the latest kernel package. Keep using the OBS Kernel:stable repo.
This is an autogenerated message for OBS integration: This bug (908163) was mentioned in https://build.opensuse.org/request/show/264975 13.2 / kernel-source
openSUSE-SU-2014:1677-1: An update that solves 31 vulnerabilities and has 12 fixes is now available. Category: security (important) Bug References: 818966,835839,853040,856659,864375,865882,873790,875051,881008,882639,882804,883518,883724,883948,883949,884324,887046,887082,889173,890114,891689,892490,893429,896382,896385,896390,896391,896392,896689,897736,899785,900392,902346,902349,902351,904013,904700,905100,905744,907818,908163,909077,910251 CVE References: CVE-2013-2891,CVE-2013-2898,CVE-2014-0181,CVE-2014-0206,CVE-2014-1739,CVE-2014-3181,CVE-2014-3182,CVE-2014-3184,CVE-2014-3185,CVE-2014-3186,CVE-2014-3673,CVE-2014-3687,CVE-2014-3688,CVE-2014-4171,CVE-2014-4508,CVE-2014-4608,CVE-2014-4611,CVE-2014-4943,CVE-2014-5077,CVE-2014-5206,CVE-2014-5207,CVE-2014-5471,CVE-2014-5472,CVE-2014-6410,CVE-2014-7826,CVE-2014-7841,CVE-2014-7975,CVE-2014-8133,CVE-2014-8709,CVE-2014-9090,CVE-2014-9322 Sources used: openSUSE 13.1 (src): cloop-2.639-11.16.1, crash-7.0.2-2.16.1, hdjmod-1.28-16.16.1, ipset-6.21.1-2.20.1, iscsitarget-1.4.20.3-13.16.1, kernel-docs-3.11.10-25.2, kernel-source-3.11.10-25.1, kernel-syms-3.11.10-25.1, ndiswrapper-1.58-16.1, pcfclock-0.44-258.16.1, vhba-kmp-20130607-2.17.1, virtualbox-4.2.18-2.21.1, xen-4.3.2_02-30.1, xtables-addons-2.3-2.16.1
openSUSE-SU-2014:1678-1: An update that solves 8 vulnerabilities and has 22 fixes is now available. Category: security (important) Bug References: 665315,856659,897112,897736,900786,902346,902349,902351,902632,902633,902728,903748,903986,904013,904097,904289,904417,904539,904717,904932,905068,905100,905329,905739,906914,907818,908163,908253,909077,910251 CVE References: CVE-2014-3673,CVE-2014-3687,CVE-2014-3688,CVE-2014-7826,CVE-2014-7841,CVE-2014-8133,CVE-2014-9090,CVE-2014-9322 Sources used: openSUSE 13.2 (src): kernel-docs-3.16.7-7.2, kernel-obs-build-3.16.7-7.3, kernel-obs-qa-3.16.7-7.2, kernel-obs-qa-xen-3.16.7-7.2, kernel-source-3.16.7-7.1, kernel-syms-3.16.7-7.1
SUSE-SU-2015:0178-1: An update that solves 5 vulnerabilities and has 59 fixes is now available. Category: security (important) Bug References: 800255,809493,829110,856659,862374,873252,875220,884407,887108,887597,889192,891086,891277,893428,895387,895814,902232,902346,902349,903279,903640,904053,904177,904659,904969,905087,905100,906027,906140,906545,907069,907325,907536,907593,907714,907818,907969,907970,907971,907973,908057,908163,908198,908803,908825,908904,909077,909092,909095,909829,910249,910697,911181,911325,912129,912278,912281,912290,912514,912705,912946,913233,913387,913466 CVE References: CVE-2014-3687,CVE-2014-3690,CVE-2014-8559,CVE-2014-9420,CVE-2014-9585 Sources used: SUSE Linux Enterprise Software Development Kit 12 (src): kernel-docs-3.12.36-38.3, kernel-obs-build-3.12.36-38.2 SUSE Linux Enterprise Server 12 (src): kernel-source-3.12.36-38.1, kernel-syms-3.12.36-38.1 SUSE Linux Enterprise Desktop 12 (src): kernel-source-3.12.36-38.1, kernel-syms-3.12.36-38.1
An update workflow for this issue was started. This issue was rated as important. Please submit fixed packages until 2015-03-05. When done, reassign the bug to security-team@suse.de. https://swamp.suse.de/webswamp/wf/60808
SUSE-SU-2015:0581-1: An update that solves 21 vulnerabilities and has 67 fixes is now available. Category: security (important) Bug References: 771619,816099,829110,833588,833820,846656,853040,856760,864401,864404,864409,864411,865419,875051,876086,876594,877593,882470,883948,884817,887597,891277,894213,895841,896484,900279,900644,902232,902349,902351,902675,903096,903640,904053,904242,904659,904671,905304,905312,905799,906586,907196,907338,907551,907611,907818,908069,908163,908393,908550,908551,908572,908825,909077,909078,909088,909092,909093,909095,909264,909565,909740,909846,910013,910150,910159,910321,910322,910517,911181,911325,911326,912171,912705,913059,914355,914423,914726,915209,915322,915335,915791,915826,916515,916982,917839,917884,920250 CVE References: CVE-2013-7263,CVE-2014-0181,CVE-2014-3687,CVE-2014-3688,CVE-2014-3690,CVE-2014-4608,CVE-2014-7822,CVE-2014-7842,CVE-2014-7970,CVE-2014-8133,CVE-2014-8134,CVE-2014-8160,CVE-2014-8369,CVE-2014-8559,CVE-2014-9090,CVE-2014-9322,CVE-2014-9419,CVE-2014-9420,CVE-2014-9584,CVE-2014-9585,CVE-2015-1593 Sources used: SUSE Linux Enterprise Server 11 SP3 for VMware (src): kernel-bigsmp-3.0.101-0.47.50.1, kernel-default-3.0.101-0.47.50.1, kernel-pae-3.0.101-0.47.50.1, kernel-source-3.0.101-0.47.50.1, kernel-syms-3.0.101-0.47.50.1, kernel-trace-3.0.101-0.47.50.1, kernel-xen-3.0.101-0.47.50.1 SUSE Linux Enterprise Server 11 SP3 (src): kernel-bigsmp-3.0.101-0.47.50.1, kernel-default-3.0.101-0.47.50.1, kernel-ec2-3.0.101-0.47.50.1, kernel-pae-3.0.101-0.47.50.1, kernel-ppc64-3.0.101-0.47.50.1, kernel-source-3.0.101-0.47.50.1, kernel-syms-3.0.101-0.47.50.1, kernel-trace-3.0.101-0.47.50.1, kernel-xen-3.0.101-0.47.50.1, xen-4.2.5_04-0.7.1 SUSE Linux Enterprise High Availability Extension 11 SP3 (src): cluster-network-1.4-2.28.1.7, gfs2-2-0.17.1.7, ocfs2-1.6-0.21.1.7 SUSE Linux Enterprise Desktop 11 SP3 (src): kernel-bigsmp-3.0.101-0.47.50.1, kernel-default-3.0.101-0.47.50.1, kernel-pae-3.0.101-0.47.50.1, 
kernel-source-3.0.101-0.47.50.1, kernel-syms-3.0.101-0.47.50.1, kernel-trace-3.0.101-0.47.50.1, kernel-xen-3.0.101-0.47.50.1, xen-4.2.5_04-0.7.1 SLE 11 SERVER Unsupported Extras (src): kernel-bigsmp-3.0.101-0.47.50.1, kernel-default-3.0.101-0.47.50.1, kernel-pae-3.0.101-0.47.50.1, kernel-ppc64-3.0.101-0.47.50.1, kernel-xen-3.0.101-0.47.50.1
SUSE-SU-2015:0736-1: An update that solves 21 vulnerabilities and has 69 fixes is now available. Category: security (important) Bug References: 771619,816099,829110,833588,833820,846656,853040,856760,864401,864404,864409,864411,865419,875051,876086,876594,877593,882470,883948,884817,887597,891277,894213,895841,896484,900279,900644,902232,902349,902351,902675,903096,903640,904053,904242,904659,904671,905304,905312,905799,906586,907196,907338,907551,907611,907818,908069,908163,908393,908550,908551,908572,908825,909077,909078,909088,909092,909093,909095,909264,909565,909740,909846,910013,910150,910159,910251,910321,910322,910517,911181,911325,911326,912171,912705,913059,914355,914423,914726,915209,915322,915335,915791,915826,916515,916982,917839,917884,920250,924282 CVE References: CVE-2013-7263,CVE-2014-0181,CVE-2014-3687,CVE-2014-3688,CVE-2014-3690,CVE-2014-4608,CVE-2014-7822,CVE-2014-7842,CVE-2014-7970,CVE-2014-8133,CVE-2014-8134,CVE-2014-8160,CVE-2014-8369,CVE-2014-8559,CVE-2014-9090,CVE-2014-9322,CVE-2014-9419,CVE-2014-9420,CVE-2014-9584,CVE-2014-9585,CVE-2015-1593 Sources used: SUSE Linux Enterprise Real Time Extension 11 SP3 (src): cluster-network-1.4-2.28.1.14, drbd-kmp-8.4.4-0.23.1.14, iscsitarget-1.4.20-0.39.1.14, kernel-rt-3.0.101.rt130-0.33.36.1, kernel-rt_trace-3.0.101.rt130-0.33.36.1, kernel-source-rt-3.0.101.rt130-0.33.36.1, kernel-syms-rt-3.0.101.rt130-0.33.36.1, lttng-modules-2.1.1-0.12.1.13, ocfs2-1.6-0.21.1.14, ofed-1.5.4.1-0.14.1.14