Bug 1163684 - Kernel oops in s390x tumbleweed
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: S/390-64 Other
Priority: P2 - High
Severity: Normal
Assigned To: Petr Tesařík
Reported: 2020-02-14 11:47 UTC by Berthold Gunreben
Modified: 2020-12-31 07:16 UTC
CC List: 17 users


Attachments:
- tar archive of buildlogs with hanging build jobs (2.32 MB, application/x-xz), 2020-03-04 16:30 UTC, Berthold Gunreben
- buildlog with hanging build job from gettext-runtime-mini (9.27 KB, application/gzip), 2020-07-31 12:08 UTC, Berthold Gunreben
- another hanging worker with gettext-runtime (24.39 KB, application/gzip), 2020-07-31 13:24 UTC, Berthold Gunreben

Description Berthold Gunreben 2020-02-14 11:47:14 UTC
For some time now, the build service has been encountering a certain type of kernel oops, which often goes away when doing a restartbuild of the respective package.

Since the php7:test package seems to encounter the issue more consistently, it looks like a good candidate for reproducing it:

osc rbl openSUSE:Factory:zSystems php7:test standard s390x

 PASS Leak 001: Incorrect 'if ();' optimization [ext/opcache/tests/leak_001.phpt] 
[ 1471.342879] Unable to handle kernel pointer dereference in virtual kernel address space
[ 1483s] [ 1471.343112] Failing address: 000003d280000000 TEID: 000003d280000803
[ 1483s] [ 1471.343246] Fault in home space mode while using kernel ASCE.
[ 1483s] [ 1471.343339] AS:0000000063acc007 R3:0000000000000024 
[ 1483s] [ 1471.343495] Oops: 003b ilc:2 [#1] SMP 
[ 1483s] [ 1471.343577] Modules linked in: sha256_s390 sha_common sd_mod nls_iso8859_1 nls_cp437 vfat fat virtio_rng rng_core virtio_blk xfs btrfs blake2b_generic xor raid6_pq libcrc32c reiserfs squashfs fuse dm_snapshot dm_bufio dm_crypt dm_mod binfmt_misc loop sg scsi_mod
[ 1483s] [ 1471.343946] CPU: 1 PID: 14557 Comm: php Not tainted 5.5.2-1-default #1 openSUSE Tumbleweed (unreleased)
[ 1483s] [ 1471.344053] Hardware name: IBM 2827 H43 400 (KVM/Linux)
[ 1483s] [ 1471.344098] Krnl PSW : 0704c00180000000 0000000062a9d2ec (page_table_free_rcu+0x7c/0x150)
[ 1483s] [ 1471.344188]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 1483s] [ 1471.344256] Krnl GPRS: 00d3df7e00000004 000003d280000034 00000001334dc8d0 0000000000000002
[ 1483s] [ 1471.344325]            0000040001500000 0000000111000000 0000000000000000 0000000062ca82b8
[ 1483s] [ 1471.344393]            000003e0061d7c18 00000001334dc600 0000008000000000 000003d280000000
[ 1483s] [ 1471.344471]            0000000134d13e00 0000000063352d30 0000000062a9d2c6 000003e0061d79e8
[ 1483s] [ 1471.344586] Krnl Code: 0000000062a9d2e0: a7580011           lhi     %r5,17
[ 1483s] [ 1471.344586]            0000000062a9d2e4: 4110b034           la      %r1,52(%r11)
[ 1483s] [ 1471.344586]           #0000000062a9d2e8: 89506018           sll     %r5,24(%r6)
[ 1483s] [ 1471.344586]           >0000000062a9d2ec: 58201000           l       %r2,0(%r1)
[ 1483s] [ 1471.344586]            0000000062a9d2f0: b9f72035           xrk     %r3,%r5,%r2
[ 1483s] [ 1471.344586]            0000000062a9d2f4: 1842               lr      %r4,%r2
[ 1483s] [ 1471.344586]            0000000062a9d2f6: ba431000           cs      %r4,%r3,0(%r1)
[ 1483s] [ 1471.344586]            0000000062a9d2fa: ec24fff96076       crj     %r2,%r4,6,0000000062a9d2ec
[ 1483s] [ 1471.345092] Call Trace:
[ 1483s] [ 1471.345127]  [<0000000062a9d2ec>] page_table_free_rcu+0x7c/0x150 
[ 1483s] [ 1471.345231]  [<0000000062ca82b8>] free_pgd_range+0x2d8/0x680 
[ 1483s] [ 1471.345304]  [<0000000062ca86de>] free_pgtables+0x7e/0x140 
[ 1483s] [ 1471.345359]  [<0000000062cb2c7e>] unmap_region+0xde/0x120 
[ 1483s] [ 1471.345402]  [<0000000062cb73c2>] mmap_region+0x662/0x700 
[ 1483s] [ 1471.345457]  [<0000000062cb776e>] do_mmap+0x30e/0x4d0 
[ 1483s] [ 1471.345503]  [<0000000062c8bcd0>] vm_mmap_pgoff+0xc0/0x120 
[ 1483s] [ 1471.345557]  [<0000000062cb46f4>] ksys_mmap_pgoff+0x124/0x270 
[ 1483s] [ 1471.345612]  [<0000000062cb49e2>] __s390x_sys_old_mmap+0x72/0xa0 
[ 1483s] [ 1471.345669]  [<00000000633365f4>] system_call+0xd8/0x2c8 
[ 1483s] [ 1471.345720] Last Breaking-Event-Address:
[ 1483s] [ 1471.345756]  [<0000000062abea52>] __local_bh_disable_ip+0x52/0x60
[ 1483s] [ 1471.345826] Kernel panic - not syncing: Fatal exception in interrupt
[ 1484s] ### VM INTERACTION END ###
[ 1484s] No buildstatus set, either the base system is broken (kernel/initrd/udev/glibc/bash/perl)
[ 1484s] or the build host has a kernel or hardware problem...

gave up after 9 failed build attempts...

Building the package on a SLES 12 kernel worked without problems.

The last time the package php7:test built successfully was 
2020-02-04 10:24:13   669482aecc8852f512a26aa6d813bd41    7.4.2-1.4        72       3362

Since php7:test is quite late in the build tree, I guess that this problem was introduced with Kernel 5.5.1.
Comment 1 Berthold Gunreben 2020-03-04 16:30:08 UTC
Created attachment 831930 [details]
tar archive of buildlogs with hanging build jobs

Let me give a small update. I have collected almost 50 buildlogs that contain some sort of kernel oops; I'll attach them to this bug.

This is just to demonstrate the bad shape the current Tumbleweed kernel is in.

Please tell me if I can help somehow to improve the situation.
Comment 2 LTC BugProxy 2020-03-05 18:30:48 UTC
------- Comment From geraldsc@de.ibm.com 2020-03-05 13:29 EDT-------
This crashes on the &page->_refcount access in page_table_free_rcu():
mask = atomic_xor_bits(&page->_refcount, 0x11U << (bit + 24));

The struct page pointer is calculated from the unsigned long *table function parameter, so if the page pointer is corrupt, it is because the table pointer is corrupt.
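
For orientation, here is a condensed sketch of how page_table_free_rcu() derives the struct page from the table pointer (a hedged paraphrase of arch/s390/mm/pgalloc.c from roughly this kernel generation, not taken from the crashing binary; the elisions and comments are mine):

void page_table_free_rcu(struct mmu_gather *tlb, unsigned long *table,
			 unsigned long vmaddr)
{
	struct page *page;
	unsigned int bit, mask;

	/* struct page of the 4K page holding the 2K table fragment */
	page = pfn_to_page(__pa(table) >> PAGE_SHIFT);
	/* which of the two 2K fragments inside the 4K page */
	bit = (__pa(table) & ~PAGE_MASK) / (PTRS_PER_PTE * sizeof(pte_t));
	/* ... */
	/* the faulting access: with a bogus table pointer, "page" can
	 * point into unmapped kernel address space */
	mask = atomic_xor_bits(&page->_refcount, 0x11U << (bit + 24));
	/* ... */
}

A wildly out-of-range table value would thus translate into a struct page address far outside the populated vmemmap, which would fit the failing address seen above.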

From the register dump here and a disassembly of a kernel 5.5.2 version of page_table_free_rcu(), it seems that we have the following values:

%r11 == 000003d280000000 == page
%r1  == 000003d280000034 == page->_refcount
%r9  == 00000001334dc600 == table (saved to %r9 from the original %r3)

The table pointer 0x1334dc600 is at about 4.8 GB; does that sound reasonable, i.e. does the machine have at least 4.8 GB of memory?

Are there any 2 GB hugepages involved on the machine? We do have a known issue with that, which could result in corrupt pagetables. The fix for that went in to 5.5.3, so if the answer here is yes, please try again with at least kernel 5.5.3.

If this is easily reproducible, and did not occur on 5.5.1, could you perhaps try a bisect to identify which patch broke it?
Comment 3 LTC BugProxy 2020-03-05 19:30:44 UTC
------- Comment From geraldsc@de.ibm.com 2020-03-05 14:25 EDT-------
(In reply to comment #8)
[...]
>
> From the register dump here and a disassembly of a kernel 5.5.2 version of
> page_table_free_rcu(), it seems that we have the following values:
>
> %r11 == 000003d280000000 == page
> %r1  == 000003d280000034 == page->_refcount
> %r9  == 00000001334dc600 == table (saved to %r9 from the original %r3)
>
> The table pointer 0x1334dc600 is at about 4,8 GB, does that sound
> reasonable, i.e. does the machine have at least 4,8 GB of memory?

Sorry, I was making wrong assumptions without having the real disassembly from your specific kernel. The table pointer cannot be in %r9; it is (most likely :-) in %r10, which is 0000008000000000 == 512 GB. So, does the machine have at least 512 GB, or is this already a hint at corrupted page tables?

For further analysis we would really need a dump, so that we do not need to make assumptions, or some way to reproduce it here locally. Or even better, if you could do a bisect.
Comment 4 Berthold Gunreben 2020-03-05 21:34:21 UTC
Just to explain the environment a little more:

These are workers within the Open Build Service, i.e. all of the workers are KVM virtual machines whose single purpose is to build one RPM package. The workers have different sizes so that OBS can obey the packages' build constraints, but there is probably no worker smaller than 4 GB, and most are much bigger. Rudi should be able to tell more about them.

My assumption about php7:test is unfortunately incorrect. The issues occur with all kinds of packages, at the start, during the build, or at the end of a long-running build (e.g. building for several hours without problems and then running into issues).

I just restarted 14 workers that had similar issues; they will probably not show any new types of issues, though (see attached buildlogs).

If it helps, it should be possible to get a copy of a suspended machine to debug.

Rudi, could you try to get such an image, along with the command that was used to run it?
Comment 5 LTC BugProxy 2020-03-06 18:00:50 UTC
------- Comment From geraldsc@de.ibm.com 2020-03-06 12:56 EDT-------
(In reply to comment #13)
> Just to explain the environment a little more:
>
> This is workers within open build service. This means, all of the workers
> are KVM virtual machines with just the single purpose of building a single
> RPM package. Those workers have different sizes to allow the OBS to obey
> package constraints for building, however there is probably no worker in
> there smaller than 4G, likely much bigger. Rudi should be able to tell more
> about them.
>
> My assumption about php7:test is unfortunately incorrect. The issues are
> with all kinds of packages at start, within the build, or at the end of a
> long running build... (like building several hours without problem and then
> running into issues).
>
> I just restarted 14 workers that had similar issues, however there will
> probably be no new types of issues (see attached buildlogs).
>
> If it helps, it should be possible to get a copy of a suspended machine to
> debug.
>
> Rudi, could you try to get such an image along with the command how this has
> been run?

OK, that sounds like it is not easily bisectable. Still, it would be very interesting to know if this only happens on some specific stable version of 5.5.

The problem is that this kernel crash is just one possible symptom of the apparently corrupted pagetables. If the corrupted entry did not happen to point to inaccessible memory, it would result in arbitrary memory corruption, which could go more or less unnoticed. At least on s390, because we write to a struct page in page_table_free_rcu(). I am not sure how it would affect other architectures if this is a common code issue.

Although the crash happens in s390 code, pte_free_tlb()/page_table_free_rcu(), the root cause is a broken "token" input value, which is passed in from the common code free_pte_range(). This value is read from the pagetable, so the pagetable apparently is corrupted. The reason could be s390-specific, but also common code related.
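
As a reference point, the common-code caller reads that token straight out of the pmd entry before handing it to the architecture, so a corrupted pmd turns directly into a corrupted table pointer (a hedged paraphrase of free_pte_range() from mm/memory.c of that era, comments mine):

static void free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
			   unsigned long addr)
{
	/* the "token" is whatever the pmd entry currently points to */
	pgtable_t token = pmd_pgtable(*pmd);

	pmd_clear(pmd);
	/* on s390 this ends up in page_table_free_rcu() */
	pte_free_tlb(tlb, token, addr);
	mm_dec_nr_ptes(tlb->mm);
}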

If it was a common code issue, it probably would affect other architectures, but that would depend on how they behave in pte_free_tlb(). Does the SUSE build service show any issues on other architectures with this specific (KVM guest and host) kernel version?

It might be a good idea to have some common code memory management expert from Red Hat looking at this. At least from the call trace here it looks like unmap_region()/free_pgtables() is called on one specific error path in mmap_region() for file mappings:

	/* ->mmap() can change vma->vm_file, but must guarantee that
	 * vma_link() below can deny write-access if VM_DENYWRITE is set
	 * and map writably if VM_SHARED is set. This usually means the
	 * new file must not have been exposed to user-space, yet.
	 */
	vma->vm_file = get_file(file);
	error = call_mmap(file, vma);
	if (error)
		goto unmap_and_free_vma;

It would be interesting to know if all crashes show a similar call trace, or if we have a more generic issue.

[ 1483s] [ 1471.345231]  [<0000000062ca82b8>] free_pgd_range+0x2d8/0x680
[ 1483s] [ 1471.345304]  [<0000000062ca86de>] free_pgtables+0x7e/0x140
[ 1483s] [ 1471.345359]  [<0000000062cb2c7e>] unmap_region+0xde/0x120
[ 1483s] [ 1471.345402]  [<0000000062cb73c2>] mmap_region+0x662/0x700
[ 1483s] [ 1471.345457]  [<0000000062cb776e>] do_mmap+0x30e/0x4d0
Comment 6 LTC BugProxy 2020-03-06 18:10:42 UTC
------- Comment From geraldsc@de.ibm.com 2020-03-06 13:00 EDT-------
(In reply to comment #14)
> (In reply to comment #13)
[...]
>
> It might be a good idea to have some common code memory management expert
> from XXX looing at this. At least from the call trace here it looks like

Sorry, too many memory management bugs from too many distributors :-)
Please replace with SUSE.
Comment 7 Ruediger Oertel 2020-03-08 10:05:26 UTC
I've scheduled a tarball with a dump of such a VM, including buildlog, root and swap device, to be synced out to https://users.suse.com/~ro/IBM/vm_save.tar.xz
- I do not know exactly when this is synced out, usually once a day.
- the xz-compressed tarball is 297M, uncompressed about 4G, and the files inside are huge; the tar was created with --sparse.
Comment 8 Michal Hocko 2020-03-09 14:38:54 UTC
(In reply to LTC BugProxy from comment #5)
[...]
> It might be a good idea to have some common code memory management expert
> from Red Hat looing at this. At least from the call trace here it looks like
> unmap_region()/free_pgtables() is called on one specific error path in
> mmap_region() for file mappings:
> 
> /* ->mmap() can change vma->vm_file, but must guarantee that
> * vma_link() below can deny write-access if VM_DENYWRITE is set
> * and map writably if VM_SHARED is set. This usually means the
> * new file must not have been exposed to user-space, yet.
> */
> vma->vm_file = get_file(file);
> error = call_mmap(file, vma);
> if (error)
> goto unmap_and_free_vma;

I have checked the generic code. The only head-scratcher is that vma->vm_file is reset to NULL, along with an fput on the file, before unmapping the region, which means that unlink_file_vma is a no-op. The page table tear-down itself shouldn't really depend on vm_file at all, though. The code has been like that for years.

I have a hard time believing that the final fput would somehow deallocate page tables, as that would require rmap for the mapping associated with the file. If this is really a general pattern, then I would suspect that the file's mapping-specific mmap implementation does something wrong and the unmap path just trips over it. Do we know what is behind the `file'?
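
For context, the error path being discussed looks roughly like this in mm/mmap.c of that kernel generation (a hedged paraphrase, not verbatim; label names and ordering as I recall them):

unmap_and_free_vma:
	vma->vm_file = NULL;
	fput(file);

	/* Undo any partial mapping done by a device driver. */
	unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);

i.e. the file reference is dropped before unmap_region() walks and frees the partially built page tables, which is the ordering questioned above.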
Comment 9 LTC BugProxy 2020-03-11 19:21:12 UTC
------- Comment From geraldsc@de.ibm.com 2020-03-11 15:12 EDT-------
(In reply to comment #16)
> I've scheduled a tarball with dump of such a VM including buildlog, root and
> swap device to  https://users.suse.com/~ro/IBM/vm_save.tar.xz
> - I do not know exactly when this is synced out, usually once a day.
> - the xz compressed tarball is 297M, uncompressed about 4G and the files
> inside are huge, tar was created with --sparse.

Thanks, but I think I cannot open this dump w/o a vmlinux plus debuginfo.
I also don't need any root or swap images, just the dump and matching vmlinux with and without debuginfo.

However, I now had a look into the initial attachment here, with the failing buildlogs, and this answers at least one of my questions: There is a huge variety of different call traces, so this is not related to the specific error path in mmap_region() for file mappings, but rather a more generic issue with many different symptoms.

Some of the failing buildlogs show a crash in pgtable_trans_huge_withdraw(), which is called on THP splitting. From the backtrace it looks like the (supposedly pre-allocated) pagetable that we read in pgtable_trans_huge_withdraw() via pmd_huge_pte(mm, pmdp) is not the one that was pre-allocated and initialized in pgtable_trans_huge_deposit(). This could be a hint towards a common code issue, since I currently do not see how we could mess this up in arch code.
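
For illustration, a hedged sketch of the deposit side on s390 (paraphrased from arch/s390/mm/pgtable.c of that era, not verbatim): the pre-allocated pagetable is queued in a FIFO list anchored at pmd_huge_pte(mm, pmdp), with the list_head living inside the pagetable itself, so a single stray write into that table can later hand pgtable_trans_huge_withdraw() a bogus entry:

void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
				pgtable_t pgtable)
{
	struct list_head *lh = (struct list_head *) pgtable;

	assert_spin_locked(pmd_lockptr(mm, pmdp));

	/* FIFO: remember the pre-allocated pagetable for a later split */
	if (!pmd_huge_pte(mm, pmdp))
		INIT_LIST_HEAD(lh);
	else
		list_add(lh, (struct list_head *) pmd_huge_pte(mm, pmdp));
	pmd_huge_pte(mm, pmdp) = pgtable;
}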

So, could you try to narrow down the THP impact by disabling THP for the KVM guests, e.g. by adding the "transparent_hugepage=never" kernel parameter for the guests? The panics always happen in the guests, not the host, right?

I'd also like to sum up some of my previous questions here, for the sake of clarity:

- Any information on which kernel version introduced this behaviour?
- Do you see strange behaviour on other architectures?
Comment 10 Berthold Gunreben 2020-03-26 11:13:30 UTC
(In reply to LTC BugProxy from comment #9)
> So, could you try to narrow this down on THP impact by disabling THP for the
> KVM guests, e.g. by adding the "transparent_hugepage=never" kernel parameter
> for the guests? The panics do always happen in the guests, not the host,
> right?

Sorry that I did not notice the comment earlier. Rudi updated the obs-worker yesterday and added transparent_hugepage=never to the guest kernel command line. 

A tentative result is that there have been no hangs anymore with this parameter enabled. I will update here if the situation changes.
Comment 11 Berthold Gunreben 2020-03-30 10:36:06 UTC
Just let me confirm again: since transparent_hugepage=never has been active, no oops has happened. It looks like there is some issue there.

If you have some more ideas on how to debug this further, please tell me. BTW: I am not that much into other architectures, but from what I have seen so far, those issues did not happen on other architectures (all of which seem to be little endian, though).
Comment 12 LTC BugProxy 2020-04-02 08:25:39 UTC
------- Comment From geraldsc@de.ibm.com 2020-04-02 04:17 EDT-------
(In reply to comment #20)
> Just let me confirm again: since transparent_hugepage=never is active, no
> oops has happend. Looks like there is some issue there.
>
> If you have some more ideas how to debug that further, please tell me. BTW:
> I am not that much into other architectures. But from what I saw so far,
> those issues did not happen on other architectures (all of which seem to be
> little endian tough).

Thanks, that narrows it down. Could you please attach the exact kernel config that was used, or can it be found somewhere (I only know how to find it for SLES)?

Also, any information that would help narrow down the affected kernel versions could be useful. Unfortunately, it seems that we cannot reproduce this easily here to verify whether it is also an issue in the upstream kernel and do a bisect.

So far, we know from the backtrace that it happened on a 5.5.2-1 kernel, and you also wrote "Since php7:test is quite late in the build tree, I guess that this problem was introduced with Kernel 5.5.1". Did it ever work with a 5.5.x kernel? If not, which was the last working version?

Last but not least, a dump could still be useful, but I also need the corresponding vmlinux files with and without debuginfo for analysis.
Comment 13 Miroslav Beneš 2020-07-30 10:54:12 UTC
Berthold, is the issue still happening? Gerald mentioned a known issue with 2G hugepages and the 5.5.2 kernel. TW is now on 5.7.x, so it would be nice to know if there has been any development. If the issue is still present, could you provide the information Gerald asked for, please?

Also CCing Vlastimil for the sake of completeness.
Comment 14 Berthold Gunreben 2020-07-30 11:31:09 UTC
(In reply to Miroslav Beneš from comment #13)
> Berthold, is the issue still happening? Gerald mentioned a known issue with
> 2G hugepages and 5.5.2 kernel. TW is now on 5.7.x, so it would be nice to
> know if there is some development. If the issue is still present, could you
> provide information Gerald asked for, please?
> 
> Also CCing Vlastimil for the sake of completeness.

Unfortunately, I have no idea. Since transparent_hugepage=never has been enabled by default, no such error shows up anymore. Maybe Rudi could remove the option to test if the issues appear again.
Comment 15 Berthold Gunreben 2020-07-31 12:04:15 UTC
(In reply to Miroslav Beneš from comment #13)
> Berthold, is the issue still happening? Gerald mentioned a known issue with
> 2G hugepages and 5.5.2 kernel. TW is now on 5.7.x, so it would be nice to
> know if there is some development. If the issue is still present, could you
> provide information Gerald asked for, please?
> 
> Also CCing Vlastimil for the sake of completeness.

So, we removed transparent_hugepage=never and it didn't take long until we had hanging jobs again. Attaching a log from gettext-runtime-mini. It also seems to affect build speed: a number of jobs seem to still run normally although they have already used more than 600% of the usual build time.

We are reverting back to transparent_hugepage=never in order to have the buildsystem functional.
Comment 16 Berthold Gunreben 2020-07-31 12:08:14 UTC
Created attachment 840244 [details]
buildlog with hanging build job from gettext-runtime-mini
Comment 17 Berthold Gunreben 2020-07-31 13:24:01 UTC
Created attachment 840253 [details]
another hanging worker with gettext-runtime
Comment 18 LTC BugProxy 2020-11-11 14:32:49 UTC
------- Comment From geraldsc@de.ibm.com 2020-11-11 09:24 EDT-------
We found an issue with THP generic code, which would only affect s390, see https://lore.kernel.org/lkml/20201110190329.11920-1-gerald.schaefer@linux.ibm.com/.

The fix is already in linux-next, and it is very likely that this is the root cause for the issue here, as it would explain corruption of our pagetable list.

It has a Cc: stable and a Fixes: tag, so it should end up in SUSE kernels sooner or later. If possible, could you try to reproduce with that patch applied and THP enabled?
Comment 19 LTC BugProxy 2020-11-25 14:42:55 UTC
------- Comment From geraldsc@de.ibm.com 2020-11-25 09:35 EDT-------
(In reply to comment #28)
> We found an issue with THP generic code, which would only affect s390, see
> https://lore.kernel.org/lkml/20201110190329.11920-1-gerald.schaefer@linux.
> ibm.com/.
>
> The fix is already in linux-next, and it is very likely that this is the
> root cause for the issue here, as it would explain corruption of our
> pagetable list.
>
> It has cc stable and fixes tag, so it should end up in SUSE kernels sooner
> or later. If possible, could you try to reproduce with that patch applied,
> and THP enabled?

The fix is now upstream, commit bfe8cc1db02a ("mm/userfaultfd: do not access vma->vm_mm after calling handle_userfault()")

For inclusion of that fix to SLES 12/15, corresponding bugzillas were opened:
- SLES 12 SP5: LTC bug#189979, SUSE bug#1179204
- SLES 15 SP2/3: LTC bug#189976, SUSE bug#1179206

For openSUSE, I would expect that the fix will be included via its stable/fixes tags.

Since we did not really get a good grasp of this bug here, and it is only my assumption that the patch will fix this issue, please try to reproduce as soon as an openSUSE update including the patch is available.
Comment 20 Jiri Slaby 2020-11-26 06:48:40 UTC
5.9.11 contains the commit. Kernel:stable is currently building it. Submitted to factory:
https://build.opensuse.org/request/show/850892
Comment 21 Berthold Gunreben 2020-11-26 08:02:07 UTC
(In reply to Jiri Slaby from comment #20)
> 5.9.11 contains the commit. Kernel:stable is currently building it.
> Submitted to factory:
> https://build.opensuse.org/request/show/850892

Sorry, I have no way to change the kernel myself. Rudi, could you please give the latest kernel a try on OBS? Please remove transparent_hugepage=never from the command line for the test.
Comment 22 Ruediger Oertel 2020-11-26 09:19:05 UTC
Sure, as soon as 5.9.11 has built; at the moment I'm seeing 5.9.10 binaries in openSUSE:Factory:zSystems kernel-default.
Comment 23 Sarah Julia Kriesch 2020-12-14 08:59:40 UTC
@Ruediger Can you verify this bug fix?
Comment 24 Berthold Gunreben 2020-12-15 13:07:26 UTC
(In reply to Sarah Julia Kriesch from comment #23)
> @Ruediger Can you verify this bug fix?

Rudi removed the transparent_hugepage=never parameter yesterday. So far, I have not seen any builds with that issue anymore. From my point of view, this issue seems to be fixed.
Comment 25 Sarah Julia Kriesch 2020-12-15 13:30:19 UTC
Thank you for this nice pre-Christmas present for everyone involved! :)
Comment 26 Ruediger Oertel 2020-12-15 13:35:04 UTC
https://github.com/openSUSE/obs-build/pull/642 merged to finalize dropping the parameter
Comment 27 Sarah Kriesch 2020-12-26 15:22:49 UTC
We got a kernel error message in openQA after dropping the parameter "transparent_hugepage=never", and the system can no longer boot:

[    0.210957] ima: No TPM chip found, activating TPM-bypass!
[    0.210960] ima: Allocated hash algorithm: sha256
[    0.210966] ima: No architecture policies found
[    0.210972] evm: Initialising EVM extended attributes:
[    0.210973] evm: security.selinux
[    0.210974] evm: security.apparmor
[    0.210975] evm: security.ima
[    0.210976] evm: security.capability
[    0.210977] evm: HMAC attrs: 0x1
[    0.211243] VFS: Cannot open root device "(null)" or unknown-block(1,0): error -6
[    0.211245] Please append a correct "root=" boot option; here are the available partitions:
[    0.211246] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)
[    0.211249] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.1-1-default #1 openSUSE Tumbleweed
[    0.211250] Hardware name: IBM 2964 N63 400 (z/VM 6.4.0)
[    0.211251] Call Trace:
[    0.211254]  [<000000003252e15c>] show_stack+0x8c/0xd8
[    0.211256]  [<0000000032533550>] dump_stack+0x90/0xc0
[    0.211258]  [<000000003252eae2>] panic+0x112/0x308
[    0.211261]  [<0000000032a39b98>] mount_block_root+0x2e0/0x368
[    0.211263]  [<0000000032a39e0a>] prepare_namespace+0x162/0x198
[    0.211265]  [<0000000032a3965a>] kernel_init_freeable+0x2c2/0x2d0
[    0.211267]  [<0000000032536692>] kernel_init+0x22/0x150
[    0.211269]  [<0000000032546b20>] ret_from_fork+0x28/0x2c
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 31977BCE
Comment 28 Sarah Kriesch 2020-12-26 15:24:19 UTC
openQA reference: https://openqa.opensuse.org/tests/1529101#

@Gerald: Should we continue in this bug or should we create a new one?
Comment 29 Petr Tesařík 2020-12-26 16:47:36 UTC
Hi Sarah,

(In reply to Sarah Kriesch from comment #27)
> [    0.211243] VFS: Cannot open root device "(null)" or unknown-block(1,0): error -6
> [    0.211245] Please append a correct "root=" boot option; here are the available partitions:
> [    0.211246] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)

This looks quite unrelated to the change in THP. It looks like the kernel did not get any initrd, and since all block drivers are built as modules, there is no way to mount the root filesystem.

Indeed, looking a bit further up, the initrd CMS file was not punched to the virtual punch:

          '    91 *-* \'punch ftpboot initrd t (noh\'',
          '       >>>   "punch ftpboot initrd t (noh"',
          'DMSPUN044E Record exceeds allowable maximum',
          '       +++ RC(32) +++',

I'm not quite sure how to fix this, but I'm confident that adding back "transparent_hugepage=never" to the kernel command line will not help.
Comment 30 Sarah Julia Kriesch 2020-12-27 11:40:34 UTC
Hi Petr,

Thank you for your feedback!
I see that this bug is s390x-specific, because no other architecture shows this kernel panic at the moment.

The advantage of this bug report is that it is open and mirrored to IBM, so we can communicate directly this way; a new bug report would take some time to get set up. Gerald, who fixed this bug, wanted to wait before closing it, and he would accept reports of after-effects in this bug. I know that this is not best practice at openSUSE, but it is an efficient way to communicate directly.

I have created a new bug report for the unrelated issue: bsc#1180381
Comment 31 LTC BugProxy 2020-12-28 20:01:01 UTC
------- Comment From geraldsc@de.ibm.com 2020-12-28 14:57 EDT-------
(In reply to comment #41)
> Hi Petr,
>
> Thank you for your feedback!
> I saw that this bug is s390x specific, because no other architecture has got
> this Kernel Panic at the moment.
>
> The advantage of this bugreport is, that it is open and mirrored to IBM. So
> we can communicate directly on this way. In comparison to a new bugreport,
> this one needs some time. Gerald, who has fixed this bug, wanted to wait
> before closing. He would accept reports of after-effects in this bug. I
> know, that it is no best-practice at openSUSE, but an efficient way to
> communicate directly.
>
> I have created a new bugreport, because that has got no relationship:
> bsc1180381

Thanks. I thought it was worth waiting a bit because the bug / symptom was very hard (for us) to reproduce. OTOH, from the bug report here, it seems that it was hit more regularly with the SUSE workload, so we could already consider it fixed if the original issue has not shown up again until now.

I'll leave the decision about closing to the reporter, as this was a reverse mirror from SUSE.
Comment 32 Sarah Julia Kriesch 2020-12-31 07:16:22 UTC
Hi Gerald,

Thanks for your answer between the years! Yes, this bug seems to be fixed.
We received a new Linux kernel as a Christmas present, and I thought there could be dependencies.

I am closing this bug now.