Bug 1217074

Summary: crash: invalid structure member offset: kmem_cache_s_num
Product: [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP6 Reporter: Petr Cervinka <pcervinka>
Component: Kernel Assignee: Kernel Bugs <kernel-bugs>
Status: RESOLVED FIXED QA Contact:
Severity: Major    
Priority: P2 - High CC: dmair, jan.stehlik, mhocko, rtsvetkov, santiago.zarate
Version: unspecified Flags: rtsvetkov: SHIP_STOPPER?
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://openqa.suse.de/tests/12774808/modules/kdump_and_crash/steps/75
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---

Description Petr Cervinka 2023-11-13 09:41:19 UTC
We ran the standard test scenario for kdump and crash analysis, but the analysis on 15-SP6 fails with "crash: invalid structure member offset: kmem_cache_s_num".



# echo exit | crash `ls -1t /var/crash/*/vmcore | head -n1` /boot/vmlinux-6.4.0-150600.1-default.gz

crash 7.3.1
Copyright (C) 2002-2021  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2021  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
NOTE: stdin: not a tty                                                 

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel relocated [526MB]: patching 217087 gdb minimal_symbol values
WARNING: kernel version inconsistency between vmlinux and dumpfile

please wait... (gathering kmem slab cache data)
crash: invalid structure member offset: kmem_cache_s_num
       FILE: memory.c  LINE: 9619  FUNCTION: kmem_cache_init()

[/usr/bin/crash] error trace: 558b352c3106 => 558b35299777 => 558b353673e4 => 558b3536735d


content of /var/crash can be downloaded from http://10.100.12.105:8000/15-sp6-crash/crash.tgz
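
For triaging runs like the one above, a minimal sketch of how a test harness might scan captured crash output for the failure signatures seen in this report. The message patterns are copied from the logs quoted here; the category names and the function name are hypothetical labels for illustration, not anything crash itself emits.

```python
import re

# Patterns copied from the log lines quoted in this report.
# The dictionary keys are hypothetical labels chosen for this sketch.
FATAL_PATTERNS = {
    "bad_member_offset": re.compile(
        r"^crash: invalid structure member offset: (\S+)", re.M),
    "namelist_mismatch": re.compile(
        r"^crash: .* and .* do not match!", re.M),
    "seek_error": re.compile(
        r"^crash: seek error: kernel virtual address: ([0-9a-f]+)", re.M),
}
VERSION_WARNING = (
    "WARNING: kernel version inconsistency between vmlinux and dumpfile"
)


def classify_crash_output(text):
    """Return (findings, version_warning) for a captured crash session.

    findings maps each matched category to the list of matches;
    version_warning is True when the vmlinux/dumpfile warning appears.
    """
    findings = {}
    for name, pattern in FATAL_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings[name] = matches
    return findings, VERSION_WARNING in text
```

Fed the x86_64 log from this description, the sketch would report the "bad_member_offset" category for kmem_cache_s_num together with the version inconsistency warning, which is the combination questioned later in this thread.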
Comment 1 Petr Cervinka 2023-11-14 08:35:50 UTC
Another issue was observed, related to the same crash scenario on aarch64:


NOTE: stdin: not a tty

WARNING: VA_BITS: calculated: 46  vmcoreinfo: 48
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...

WARNING: kernel relocated [73967149MB]: patching 196331 gdb minimal_symbol values
crash: seek error: kernel virtual address: ffffc68ae3c55600  type: "kernel_config_data"
WARNING: cannot read kernel_config_data
crash: seek error: kernel virtual address: ffffc68ae539eb10  type: "possible"
WARNING: cannot read cpu_possible_map
crash: seek error: kernel virtual address: ffffc68ae539ebd0  type: "present"
WARNING: cannot read cpu_present_map
crash: seek error: kernel virtual address: ffffc68ae539eab0  type: "online"
WARNING: cannot read cpu_online_map
crash: seek error: kernel virtual address: ffffc68ae539ec38  type: "active"
WARNING: cannot read cpu_active_map
crash: seek error: kernel virtual address: ffffc68ae576d9a8  type: "shadow_timekeeper xtime_sec"
crash: seek error: kernel virtual address: ffffc68ae56ae4e8  type: "init_uts_ns"
Comment 2 Santiago Zarate 2023-11-20 10:00:40 UTC
(In reply to Petr Cervinka from comment #1)
> Another issue was observed, related to the same crash scenario on aarch64:
> 
> 
> NOTE: stdin: not a tty
> 
> WARNING: VA_BITS: calculated: 46  vmcoreinfo: 48
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "aarch64-unknown-linux-gnu"...
> 
> WARNING: kernel relocated [73967149MB]: patching 196331 gdb minimal_symbol
> values
> crash: seek error: kernel virtual address: ffffc68ae3c55600  type:
> "kernel_config_data"
> WARNING: cannot read kernel_config_data
> crash: seek error: kernel virtual address: ffffc68ae539eb10  type: "possible"
> WARNING: cannot read cpu_possible_map
> crash: seek error: kernel virtual address: ffffc68ae539ebd0  type: "present"
> WARNING: cannot read cpu_present_map
> crash: seek error: kernel virtual address: ffffc68ae539eab0  type: "online"
> WARNING: cannot read cpu_online_map
> crash: seek error: kernel virtual address: ffffc68ae539ec38  type: "active"
> WARNING: cannot read cpu_active_map
> crash: seek error: kernel virtual address: ffffc68ae576d9a8  type:
> "shadow_timekeeper xtime_sec"
> crash: seek error: kernel virtual address: ffffc68ae56ae4e8  type:
> "init_uts_ns"

It is also broken in other places:
---
WARNING: VA_BITS: calculated: 47  vmcoreinfo: 48
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...

WARNING: kernel relocated [51248248MB]: patching 196641 gdb minimal_symbol values
crash: seek error: kernel virtual address: ffffb0e048795910  type: "kernel_config_data"
WARNING: cannot read kernel_config_data
crash: seek error: kernel virtual address: ffffb0e049eeeb10  type: "possible"
WARNING: cannot read cpu_possible_map
crash: seek error: kernel virtual address: ffffb0e049eeebd0  type: "present"
WARNING: cannot read cpu_present_map
crash: seek error: kernel virtual address: ffffb0e049eeeab0  type: "online"
WARNING: cannot read cpu_online_map
crash: seek error: kernel virtual address: ffffb0e049eeec38  type: "active"
WARNING: cannot read cpu_active_map
crash: seek error: kernel virtual address: ffffb0e04a2bf168  type: "shadow_timekeeper xtime_sec"
crash: seek error: kernel virtual address: ffffb0e04a1ff0a8  type: "init_uts_ns"
crash: /var/tmp/vmlinux-6.4.0-150600.2-default.gz_aeVCl7 and /var/crash/2023-11-17-05-16/vmcore do not match!

Usage:

  crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS]	(dumpfile form)
  crash [OPTION]... [NAMELIST]             		(live system form)

Enter "crash -h" for details.
---
Comment 3 Jiri Kosina 2023-11-23 13:31:20 UTC
(In reply to Petr Cervinka from comment #0)
> We did standard test scenario for kdump and crash analysis, but analysis on
> 15-SP6 fails with "crash: invalid structure member offset: 
> kmem_cache_s_num"
[ ... ]
> WARNING: kernel relocated [526MB]: patching 217087 gdb minimal_symbol values
> WARNING: kernel version inconsistency between vmlinux and dumpfile

Can you please double-check that the dumpfile is really consistent with the vmlinux?
Comment 4 Michal Hocko 2023-11-24 09:48:58 UTC
Let's add David Mair. AFAIK, he has upgraded crash to the latest version in 15sp6.
Comment 5 Petr Cervinka 2023-11-24 10:11:35 UTC
(In reply to Jiri Kosina from comment #3)

> Can you please double-check that the dumpfile is really consistent with the
> vmlinux?


The vmlinux file was not changed during the test. It is the same for both the dump and the follow-up analysis.
Comment 6 Petr Cervinka 2023-11-24 10:17:16 UTC
Also, I would recommend updating crash to 8.0.4, the same version as in Tumbleweed. Version 8.0.4 fixed many crash issues there (such as https://bugzilla.suse.com/show_bug.cgi?id=1190434, which is not just about the usr merge but also fixed a few follow-up issues).
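
A test could guard against an outdated tool by reading the version from the banner that crash prints at startup ("crash 7.3.1" in the description). This is a small sketch under that assumption. The function names are hypothetical, and the 8.0.4 threshold is only the version recommended in this comment, not a verified minimum for the 6.4 kernel.

```python
import re


def crash_version(banner):
    """Parse 'crash X.Y.Z' from the startup banner into a tuple of ints.

    Returns None when no such line is present.
    """
    match = re.search(r"^crash (\d+(?:\.\d+)*)", banner, re.M)
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least(banner, minimum=(8, 0, 4)):
    """True when the banner reports a version at or above `minimum`.

    The default minimum is the version suggested in this report
    (an assumption for illustration, not a documented requirement).
    """
    version = crash_version(banner)
    return version is not None and version >= minimum
```

With the banner from the description, crash_version yields (7, 3, 1), so the default check fails and the run could be flagged before the dump is analyzed.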
Comment 7 David Mair 2023-11-24 17:31:30 UTC
I haven't been able to submit for 15-SP6 yet, for several reasons. It's in Factory head built for Tumbleweed. The crash-gcore package probably won't install due to different kernel versions in Tumbleweed and 15-SP6. Otherwise, the Tumbleweed 8.0.4 version is adequate for testing the reported problem.
Comment 8 Radoslav Tzvetkov 2023-11-29 14:10:50 UTC
David, we submitted for you: https://build.suse.de/request/show/313995
Comment 10 Radoslav Tzvetkov 2023-12-01 14:37:34 UTC
Setting it as resolved. One can always reopen if suspects something is not going