Bug 1172670 - kdump: fails to create dump as dump-capture kernel boots into systemd emergency mode
Status: RESOLVED FIXED
Duplicates: 1171055
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Assigned To: Petr Tesařík
Reported: 2020-06-08 14:13 UTC by Shung-Hsi Yu
Modified: 2021-10-13 13:59 UTC (History)

Attachments
journalctl output (colored) (91.97 KB, text/x-log)
2020-06-08 14:13 UTC, Shung-Hsi Yu
rpm -qa (92.48 KB, text/plain)
2020-06-08 14:18 UTC, Shung-Hsi Yu

Description Shung-Hsi Yu 2020-06-08 14:13:29 UTC
Created attachment 838585 [details]
journalctl output (colored)

The system is configured to capture core dump on crash with `yast2 kdump`. This is confirmed by looking at /proc/cmdline (contains ' crashkernel=256M,high crashkernel=72M,low'), output of `dmesg`, and the status of kdump.service.
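
The checks described above can be scripted roughly as follows. This is only a sketch: the function names `crashkernel_args` and `kdump_ready` are made up for illustration, and it assumes the kernel exposes `/sys/kernel/kexec_crash_loaded`.

```shell
# Print every crashkernel= parameter found in a kernel command line.
# Reads /proc/cmdline by default; another file can be passed for testing.
crashkernel_args() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep '^crashkernel='
}

# Succeed if memory is reserved on the command line and a dump-capture
# kernel is currently loaded (kexec_crash_loaded reads 1 in that case).
kdump_ready() {
    crashkernel_args >/dev/null &&
    [ "$(cat /sys/kernel/kexec_crash_loaded 2>/dev/null)" = 1 ]
}
```

Running `kdump_ready && echo "kdump armed"` together with `systemctl is-active kdump.service` covers the same ground as the manual inspection.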

Triggering a crash dump manually with `echo c >/proc/sysrq-trigger` shows the stacktrace, but fails to correctly boot up the dump-capturing kernel.
Below is an excerpt of the output of `journalctl` before the emergency shell was started.

Jun 08 14:13:35 localhost systemd[1]: Starting Switch Root...
Jun 08 14:13:35 localhost systemctl[377]: Failed to switch root: Specified switch root path '/sysroot' does not seem to be an OS tree. os-release file is missing.
Jun 08 14:13:35 localhost systemd[1]: initrd-switch-root.service: Main process exited, code=exited, status=1/FAILURE
Jun 08 14:13:35 localhost systemd[1]: initrd-switch-root.service: Failed with result 'exit-code'.
Jun 08 14:13:35 localhost systemd[1]: Failed to start Switch Root.
Jun 08 14:13:35 localhost systemd[1]: initrd-switch-root.service: Triggering OnFailure= dependencies.
Jun 08 14:13:35 localhost systemd[1]: Starting Setup Virtual Console...
Jun 08 14:13:35 localhost systemd[1]: Finished Setup Virtual Console.
Jun 08 14:13:35 localhost systemd[1]: Started Emergency Shell.
Jun 08 14:13:35 localhost systemd[1]: Reached target Emergency Mode.

This issue looks a bit similar to https://bugzilla.redhat.com/show_bug.cgi?id=1812393 but I'm not too sure.
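
For reference, the "os-release file is missing" message means that the switch-root target has no os-release file. A minimal sketch of an equivalent check from the emergency shell, assuming that the two standard locations (`etc/os-release` and `usr/lib/os-release`) are what gets looked at; `looks_like_os_tree` is a hypothetical helper name:

```shell
# Succeed if the given directory contains an os-release file in one of
# the two standard locations, fail otherwise.
looks_like_os_tree() {
    [ -e "$1/etc/os-release" ] || [ -e "$1/usr/lib/os-release" ]
}
```

For example, `looks_like_os_tree /sysroot || echo "/sysroot is not an OS tree"` would reproduce the condition reported in the log.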
Comment 1 Shung-Hsi Yu 2020-06-08 14:18:04 UTC
Created attachment 838586 [details]
rpm -qa

After updating related packages to the latest version, the issue still occurs. (Attaching the output of `rpm -qa`)

The latest versions of the possibly related packages are:
* dracut-050+suse.63.g796e020e-1.2.x86_64
* kernel-default-5.6.14-1.5.x86_64
* kexec-tools-2.0.20-5.2.x86_64
Comment 2 Jeff Mahoney 2020-06-08 14:19:56 UTC
Once you drop to emergency mode, what does the 'mount' command show?
Comment 3 Shung-Hsi Yu 2020-06-08 14:29:13 UTC
(In reply to Jeff Mahoney from comment #2)
> Once you drop to emergency mode, what does the 'mount' command show?

The output of the `mount` command is:

none on / type rootfs (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,size=96736k,nr_inodes=24184,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
none on /sysroot type rootfs (ro)

I am also able to run `mount -a`, which mounts the root file system just fine.
After running `mount -a` I get these entries in addition to those above:

/dev/nvme0n1p2 on /kdump/mnt0 type btrfs (rw,relatime,ssd,space_cache,skip_balance,subvolid=267,subvol=/@/.snapshots/1/snapshot)
/dev/nvme0n1p2 on /sysroot/mnt0 type btrfs (rw,relatime,ssd,space_cache,skip_balance,subvolid=267,subvol=/@/.snapshots/1/snapshot)
/dev/nvme0n1p2 on /kdump/mnt1/var type btrfs (rw,relatime,ssd,space_cache,skip_balance,subvolid=257,subvol=/@/var)
/dev/nvme0n1p2 on /sysroot/mnt1/var type btrfs (rw,relatime,ssd,space_cache,skip_balance,subvolid=257,subvol=/@/var)
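
The relevant detail in the listing is that /sysroot is still plain rootfs before `mount -a`, so there is no OS tree to switch to. A small sketch for pulling the file system type of one mount point out of `mount` output; `fstype_at` is a hypothetical helper, and it assumes the `SOURCE on TARGET type FSTYPE (OPTIONS)` line format shown above:

```shell
# Print the filesystem type(s) reported for a mount point.
# Reads `mount`-style output from stdin.
fstype_at() {
    awk -v mp="$1" '$2 == "on" && $3 == mp && $4 == "type" { print $5 }'
}
```

Usage would be `mount | fstype_at /sysroot`. Where util-linux is available, `findmnt -n -o FSTYPE /sysroot` should give the same answer, though it may not be present in the initrd.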
Comment 4 Shung-Hsi Yu 2020-07-28 14:54:42 UTC
The issue is still present after upgrading to Tumbleweed 20200720, with the following (possibly) issue-related packages:
* dracut-050+suse.67.g28be2f36-1.1.x86_64
* kernel-default-5.7.9-1.1.x86_64
* kexec-tools-2.0.20-5.2.x86_64
* kdump-0.9.0-13.2.x86_64
* systemd-245.6-2.1.x86_64

After some poking around I found that a manual dump succeeds with the following commands:
    systemctl start kdump-mnt1-var.mount
    bash -c 'source /lib/kdump/save_dump.sh' # From kdump-save.service

However, manually running `systemctl start kdump-save.service` does not work; it just brings me back to the rescue mode prompt.

This leads me to think that the issue is perhaps not initrd-switch-root.service but rather that kdump-save (or one of its dependencies) somehow fails to start.
Comment 5 Shung-Hsi Yu 2020-07-28 14:57:40 UTC
kdump-save.service is not listed as a dependency of initrd.target, not sure if this is the intended behavior.

    $ systemctl list-dependencies initrd.target
    initrd.target
    ● ├─dev-disk-by\x2duuid-BEE5\x2dF873.device
    ● ├─dracut-cmdline-ask.service
    ● ├─dracut-cmdline.service
    ● ├─dracut-initqueue.service
    ● ├─dracut-mount.service
    ● ├─dracut-pre-mount.service
    ● ├─dracut-pre-pivot.service
    ● ├─dracut-pre-trigger.service
    ● ├─dracut-pre-udev.service
    ● ├─initrd-parse-etc.service
    ● ├─basic.target
    ● │ ├─tmp.mount
    ● │ ├─paths.target
    ● │ ├─slices.target
    ● │ │ ├─-.slice
    ● │ │ └─system.slice
    ● │ ├─sockets.target
    ● │ │ ├─systemd-journald-dev-log.socket
    ● │ │ ├─systemd-journald.socket
    ● │ │ ├─systemd-udevd-control.socket
    ● │ │ └─systemd-udevd-kernel.socket
    ● │ ├─sysinit.target
    ● │ │ ├─kmod-static-nodes.service
    ● │ │ ├─systemd-ask-password-console.path
    ● │ │ ├─systemd-journald.service
    ● │ │ ├─systemd-modules-load.service
    ● │ │ ├─systemd-sysctl.service
    ● │ │ ├─systemd-tmpfiles-setup-dev.service
    ● │ │ ├─systemd-tmpfiles-setup.service
    ● │ │ ├─systemd-udev-trigger.service
    ● │ │ ├─systemd-udevd.service
    ● │ │ ├─local-fs.target
    ● │ │ │ ├─kdump-mnt0.mount
    ● │ │ │ └─kdump-mnt1-var.mount
    ● │ │ └─swap.target
    ● │ └─timers.target
    ● ├─initrd-fs.target
    ● ├─initrd-root-device.target
    ● └─initrd-root-fs.target
    ●   ├─ostree-prepare-root.service
    ●   └─sysroot.mount
Comment 6 Shung-Hsi Yu 2020-07-28 15:10:05 UTC
(In reply to Shung-Hsi Yu from comment #5)
> kdump-save.service is not listed as a dependency of initrd.target, not sure
> if this is the intended behavior.
> 
>     $ systemctl list-dependencies initrd.target
>     initrd.target
>     ● ├─dev-disk-by\x2duuid-BEE5\x2dF873.device
>     ● ├─dracut-cmdline-ask.service
>     [...snip...]

According to /usr/lib/dracut/modules.d/99kdump/module-setup.sh:277 there should be a `$systemdsystemunitdir/initrd.target.wants/kdump-save.service` symbolic link to kdump-save.service; however, I don't see such a symbolic link in my kdump initrd.
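
One way to check for the link without booting the dump-capture kernel is to list the initrd contents with `lsinitrd` (shipped with dracut) and search for the entry. A sketch, where `has_kdump_save_wants` is a hypothetical helper that reads a file listing from stdin:

```shell
# Succeed if the listing on stdin contains the
# initrd.target.wants/kdump-save.service entry.
has_kdump_save_wants() {
    grep -q 'initrd\.target\.wants/kdump-save\.service'
}
```

Something like `lsinitrd "/boot/initrd-$(uname -r)-kdump" | has_kdump_save_wants || echo "link missing"` should then report the problem; the initrd path here is an assumption and may differ on a given system.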
Comment 7 Shung-Hsi Yu 2020-07-28 15:37:14 UTC
(In reply to Shung-Hsi Yu from comment #6)
> (In reply to Shung-Hsi Yu from comment #5)
> > kdump-save.service is not listed as a dependency of initrd.target, not sure
> > if this is the intended behavior.
> > 
> >     $ systemctl list-dependencies initrd.target
> >     initrd.target
> >     ● ├─dev-disk-by\x2duuid-BEE5\x2dF873.device
> >     ● ├─dracut-cmdline-ask.service
> >     [...snip...]
> 
> According to /usr/lib/dracut/modules.d/99kdump/module-setup.sh:277 there
> should be a `$systemdsystemunitdir/initrd.target.wants/kdump-save.service`
> symbolic link to kdump-save.service; however, I don't see such a symbolic
> link in my kdump initrd.

Adding a `mkdir -p "$initdir/$systemdsystemunitdir/initrd.target.wants"` line to /usr/lib/dracut/modules.d/99kdump/module-setup.sh does the magic: kdump now works.
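
The reason the `mkdir -p` matters is that creating a symlink fails when its parent directory does not exist. The following is only an illustration of that ordering, not the actual module-setup.sh code; the function name and the relative link target are assumptions:

```shell
# Create the initrd.target.wants directory first, then the symlink.
# $1 is the initrd staging directory, $2 the systemd unit directory.
# Without the mkdir, ln fails if initrd.target.wants does not exist yet.
link_into_initrd_target() {
    wants_dir="$1/$2/initrd.target.wants"
    mkdir -p "$wants_dir" &&
    ln -sf ../kdump-save.service "$wants_dir/kdump-save.service"
}
```

Called as `link_into_initrd_target "$initdir" "$systemdsystemunitdir"`, this leaves a `kdump-save.service` link under `initrd.target.wants`, which is what pulls the unit into initrd.target.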
Comment 8 Jiri Slaby 2020-08-28 06:18:48 UTC
Petr, could you review/merge
https://github.com/openSUSE/kdump/pull/14
and fix the package?
Comment 9 Jiri Slaby 2020-08-28 07:55:59 UTC
The pull request fixes it for me too.
Comment 10 Fabian Vogt 2020-09-01 06:32:37 UTC
*** Bug 1171055 has been marked as a duplicate of this bug. ***
Comment 11 Fabian Vogt 2020-09-11 07:18:13 UTC
AFAICT the fix got finally merged, but there is no submission yet?
Comment 12 Fabian Vogt 2020-09-21 11:53:49 UTC
(In reply to Fabian Vogt from comment #11)
> AFAICT the fix got finally merged, but there is no submission yet?

Ping.

Removing IN_PROGRESS as there's no progress...
Comment 13 Jiri Slaby 2020-09-22 06:33:50 UTC
https://build.opensuse.org/request/show/835986
Comment 14 Jiri Slaby 2020-09-29 05:54:21 UTC
(In reply to Jiri Slaby from comment #13)
> https://build.opensuse.org/request/show/835986

Self-accepted after a week of inactivity. FWDed to factory too.
Comment 22 Swamp Workflow Management 2021-10-06 19:39:56 UTC
SUSE-RU-2021:3304-1: An update that has four recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1172670,1183070,1184616,1186037
CVE References: 
JIRA References: 
Sources used:
SUSE MicroOS 5.1 (src):    kdump-0.9.0-18.3.1
SUSE Linux Enterprise Module for Basesystem 15-SP3 (src):    kdump-0.9.0-18.3.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 23 Swamp Workflow Management 2021-10-06 19:50:06 UTC
openSUSE-RU-2021:3304-1: An update that has four recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1172670,1183070,1184616,1186037
CVE References: 
JIRA References: 
Sources used:
openSUSE Leap 15.3 (src):    kdump-0.9.0-18.3.1
Comment 24 Swamp Workflow Management 2021-10-06 20:00:14 UTC
SUSE-RU-2021:3303-1: An update that has 6 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1172670,1182309,1183070,1184616,1186037,1188090
CVE References: 
JIRA References: 
Sources used:
SUSE MicroOS 5.0 (src):    kdump-0.9.0-11.6.1
SUSE Linux Enterprise Module for Basesystem 15-SP2 (src):    kdump-0.9.0-11.6.1

Comment 25 Swamp Workflow Management 2021-10-11 19:28:29 UTC
openSUSE-RU-2021:1349-1: An update that has 6 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1172670,1182309,1183070,1184616,1186037,1188090
CVE References: 
JIRA References: 
Sources used:
openSUSE Leap 15.2 (src):    kdump-0.9.0-lp152.7.6.1
Comment 26 Swamp Workflow Management 2021-10-12 13:29:00 UTC
SUSE-RU-2021:3340-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1154837,1164713,1172670,1182309,1183070,1184616,1186037,1188090
CVE References: 
JIRA References: 
Sources used:
SUSE Linux Enterprise Server 12-SP5 (src):    kdump-0.8.16-11.13.1

Comment 27 Swamp Workflow Management 2021-10-13 13:20:29 UTC
SUSE-RU-2021:3404-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1154837,1164713,1172670,1182309,1183070,1184616,1186037,1188090
CVE References: 
JIRA References: 
Sources used:
SUSE Linux Enterprise Server for SAP 15-SP1 (src):    kdump-0.9.0-4.9.1
SUSE Linux Enterprise Server 15-SP1-LTSS (src):    kdump-0.9.0-4.9.1
SUSE Linux Enterprise Server 15-SP1-BCL (src):    kdump-0.9.0-4.9.1
SUSE Linux Enterprise High Performance Computing 15-SP1-LTSS (src):    kdump-0.9.0-4.9.1
SUSE Linux Enterprise High Performance Computing 15-SP1-ESPOS (src):    kdump-0.9.0-4.9.1
SUSE Enterprise Storage 6 (src):    kdump-0.9.0-4.9.1
SUSE CaaS Platform 4.0 (src):    kdump-0.9.0-4.9.1

Comment 28 Swamp Workflow Management 2021-10-13 13:59:57 UTC
SUSE-RU-2021:3405-1: An update that has 11 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1101149,1102252,1125011,1133407,1154837,1164713,1172670,1182309,1183070,1184616,1186037
CVE References: 
JIRA References: 
Sources used:
SUSE Linux Enterprise Server for SAP 15 (src):    kdump-0.8.16-14.6.1
SUSE Linux Enterprise Server 15-LTSS (src):    kdump-0.8.16-14.6.1
SUSE Linux Enterprise High Performance Computing 15-LTSS (src):    kdump-0.8.16-14.6.1
SUSE Linux Enterprise High Performance Computing 15-ESPOS (src):    kdump-0.8.16-14.6.1
