Bugzilla – Bug 1197309
[Build 20220318] systemd 250: failed to parse fstab: permission denied
Last modified: 2023-10-03 14:10:27 UTC
## Observation openQA test in scenario microos-Tumbleweed-MicroOS-Image-x86_64-microos@64bit fails in [trup_smoke](https://openqa.opensuse.org/tests/2252516/modules/trup_smoke/steps/28) ## Test suite description ## Reproducible Fails since (at least) Build [20220318](https://openqa.opensuse.org/tests/2252516) (current job) ## Expected result Last good: [20220317](https://openqa.opensuse.org/tests/2249910) (or more recent) ## Further details Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=microos&flavor=MicroOS-Image&machine=64bit&test=microos&version=Tumbleweed)
It works locally for me and fbui as well as in openQA: https://openqa.opensuse.org/tests/2257035 Even in the linked failed test it successfully boots into a new snapshot during the cockpit test. In the journal from the test I see these messages: Mar 19 07:10:14.821930 localhost.localdomain kernel: audit: type=1400 audit(1647673814.816:12): avc: denied { read } for pid=1091 comm="systemd-sysctl" name="protected_hardlinks" dev="proc" ino=14005 scontext=system_u:system_r:systemd_sysctl_t:s0 tcontext=system_u:object_r:proc_security_t:s0 tclass=file permissive=0 Mar 19 07:10:14.827159 localhost.localdomain systemd[1]: Finished Apply Kernel Variables. Mar 19 07:10:14.829928 localhost.localdomain kernel: audit: type=1400 audit(1647673814.824:13): avc: denied { read } for pid=1091 comm="systemd-sysctl" name="protected_hardlinks" dev="proc" ino=14005 scontext=system_u:system_r:systemd_sysctl_t:s0 tcontext=system_u:object_r:proc_security_t:s0 tclass=file permissive=0 Mar 19 07:10:14.829972 localhost.localdomain kernel: audit: type=1400 audit(1647673814.824:14): avc: denied { read } for pid=1091 comm="systemd-sysctl" name="protected_symlinks" dev="proc" ino=14006 scontext=system_u:system_r:systemd_sysctl_t:s0 tcontext=system_u:object_r:proc_security_t:s0 tclass=file permissive=0 Those don't appear locally here, which indicates that there are some behaviour differences, could be random and induced by some race conditions...
The sequence in https://openqa.opensuse.org/tests/2252516/file/journal_check-full_journal.log looks quite strange. Apparently SELINUX gets enable at switch root, just before systemd is restarted in the real root: --> Mar 19 07:10:13.382735 localhost.localdomain systemd[1]: Switching root. Mar 19 07:10:13.432872 localhost.localdomain systemd-journald[172]: Journal stopped Mar 19 07:10:14.566701 localhost.localdomain systemd-journald[172]: Received SIGTERM from PID 1 (systemd). Mar 19 07:10:14.566754 localhost.localdomain kernel: audit: type=1400 audit(1647673813.468:5): avc: denied { prog_run } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcont ext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=0 Mar 19 07:10:14.566781 localhost.localdomain kernel: audit: type=1400 audit(1647673813.468:6): avc: denied { prog_run } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcont ext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=0 Mar 19 07:10:14.566796 localhost.localdomain kernel: audit: type=1400 audit(1647673813.468:7): avc: denied { prog_run } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=0 Mar 19 07:10:14.566810 localhost.localdomain kernel: audit: type=1400 audit(1647673813.468:8): avc: denied { prog_run } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=0 Mar 19 07:10:14.566823 localhost.localdomain kernel: audit: type=1400 audit(1647673813.468:9): avc: denied { prog_run } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=bpf permissive=0 Mar 19 07:10:14.566837 localhost.localdomain kernel: SELinux: Class mctp_socket not defined in policy. Mar 19 07:10:14.566854 localhost.localdomain kernel: SELinux: Class io_uring not defined in policy. Mar 19 07:10:14.566867 localhost.localdomain kernel: SELinux: the above unknown classes and permissions will be allowed Mar 19 07:10:14.566881 localhost.localdomain kernel: SELinux: Converting 204 SID table entries... Mar 19 07:10:14.566893 localhost.localdomain kernel: SELinux: policy capability network_peer_controls=1 Mar 19 07:10:14.566907 localhost.localdomain kernel: SELinux: policy capability open_perms=1 Mar 19 07:10:14.566920 localhost.localdomain kernel: SELinux: policy capability extended_socket_class=1 Mar 19 07:10:14.566933 localhost.localdomain kernel: SELinux: policy capability always_check_network=0 Mar 19 07:10:14.566950 localhost.localdomain kernel: SELinux: policy capability cgroup_seclabel=1 Mar 19 07:10:14.566964 localhost.localdomain kernel: SELinux: policy capability nnp_nosuid_transition=1 Mar 19 07:10:14.566977 localhost.localdomain kernel: SELinux: policy capability genfs_seclabel_symlinks=0 Mar 19 07:10:14.566991 localhost.localdomain kernel: audit: type=1403 audit(1647673813.580:10): auid=4294967295 ses=4294967295 lsm=selinux res=1 Mar 19 07:10:14.567004 localhost.localdomain kernel: audit: type=1400 audit(1647673813.584:11): avc: denied { setcurrent } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:system_r:init_t:s0 tclass=process permissive=0 Mar 19 07:10:14.567019 localhost.localdomain systemd[1]: Failed to transition into init label 'system_u:system_r:init_t:s0', ignoring. Mar 19 07:10:14.567039 localhost.localdomain systemd[1]: Successfully loaded SELinux policy in 94.915ms. Mar 19 07:10:14.567054 localhost.localdomain systemd[1]: Relabelled /dev, /dev/shm, /run, /sys/fs/cgroup in 65.251ms. Mar 19 07:10:14.567069 localhost.localdomain systemd[1]: systemd 250.4+suse.35.g8ef8dfd540 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR -IMA -SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified) --< What is starting SELinux? Is that a userspace daemon or a kernel driver? Is SELinux supposed to be initialized in the initrd or afterwards?
https://github.com/kubic-project/microos-tools/tree/master/selinux/98selinux-microos relabels /sysroot from within the initrd in the dracut-pre-pivot hook. Note that on MicroOS, /etc is an overlay with /var/lib/overlay/X/etc as upperdir, so /sysroot/var and /sysroot/etc are mounted in the initrd (x-initrd.mount). The issue is that those aren't mounted yet when the pre-pivot hook runs, but they are mounted while the relabel is in progress! Mar 19 07:10:05.230048 localhost.localdomain systemd[1]: Finished Reload Configuration from the Real Root. Mar 19 07:10:05.233033 localhost.localdomain systemd[1]: dracut mount hook was skipped because all trigger condition checks failed. Mar 19 07:10:05.234886 localhost.localdomain systemd[1]: Starting dracut pre-pivot and cleanup hook... Mar 19 07:10:05.253660 localhost.localdomain dracut-pre-pivot[965]: SELinux: relabeling root filesystem ... Mar 19 07:10:05.383675 localhost.localdomain dracut-pre-pivot[965]: SELinux: mount root read-write and relabel Mar 19 07:10:05.403681 localhost.localdomain kernel: BTRFS info (device vda3): using free space tree Mar 19 07:10:05.407806 localhost.localdomain dracut-pre-pivot[983]: Can't stat exclude path "/var/lib/overlay", No such file or directory - ignoring. Mar 19 07:10:05.790973 localhost.localdomain systemd[1]: Mounting /sysroot/root... Mar 19 07:10:05.793760 localhost.localdomain systemd[1]: sysroot-var.mount: Directory /sysroot/var to mount over is not empty, mounting anyway. Mar 19 07:10:05.798267 localhost.localdomain systemd[1]: Mounting /sysroot/var... Mar 19 07:10:05.807194 localhost.localdomain systemd[1]: Mounted /sysroot/root. Mar 19 07:10:05.818591 localhost.localdomain systemd[1]: Mounted /sysroot/var. Mar 19 07:10:05.820518 localhost.localdomain systemd[1]: sysroot-etc.mount: Directory /sysroot/etc to mount over is not empty, mounting anyway. Mar 19 07:10:05.822020 localhost.localdomain systemd[1]: Mounting /sysroot/etc... Mar 19 07:10:05.828763 localhost.localdomain systemd[1]: Starting Grow File System on /sysroot/var... Mar 19 07:10:05.857164 localhost.localdomain systemd[1]: Reached target Initrd File Systems. ... Mar 19 07:10:13.180623 localhost.localdomain systemd[1]: sysroot-proc.mount: Deactivated successfully. Mar 19 07:10:13.194281 localhost.localdomain systemd[1]: Finished dracut pre-pivot and cleanup hook. This apparently means that "/var/lib/overlay" from the exclusion list is ignored and it relabels /sysroot/etc/ recursively and then also relabels /sysroot/var/lib/overlay, the upperdir of the /etc overlay. Due to caching effects of overlayfs, /sysroot/etc/ still shows the correct labels, even though the upperdir has broken ones. They become visible after remounting the overlay or simply on the next boot. I could reproduce that locally by running this in the dracut shell (rd.break) of a healthy system: bash-5.1# ls -Z /sysroot/var/lib/overlay/*/etc/fstab system_u:object_r:etc_t:s0 /sysroot/var/lib/overlay/2/etc/fstab bash-5.1# ls -Z /sysroot/etc/fstab system_u:object_r:etc_t:s0 /sysroot/etc/fstab bash-5.1# systemctl stop /sysroot/var; (. /usr/lib/dracut-lib.sh; . /usr/lib/dracut/hooks/pre-pivot/50-selinux-microos-relabel.sh) & sleep 2; systemctl start /sysroot/etc [1] 1310 SELinux: relabeling root filesystem SELinux: loading policy SELinux: mount root read-write and relabel Can't stat exclude path "/var/lib/overlay", No such file or directory - ignoring. bash-5.1# ls -Z /sysroot/etc/fstab system_u:object_r:etc_t:s0 /sysroot/etc/fstab [1]+ Done ( . /usr/lib/dracut-lib.sh; . /usr/lib/dracut/hooks/pre-pivot/50-selinux-microos-relabel.sh ) bash-5.1# ls -Z /sysroot/var/lib/overlay/*/etc/fstab system_u:object_r:var_lib_t:s0 /sysroot/var/lib/overlay/2/etc/fstab bash-5.1# mount -o remount /sysroot/etc bash-5.1# ls -Z /sysroot/etc/fstab system_u:object_r:var_lib_t:s0 /sysroot/etc/fstab As it changes labels in /var/lib/overlay, it'll also break booting from old snapshots (!) The question is why dracut-pre-pivot.service starts before the /sysroot/{var,etc} mounts are up and initrd-fs.target reached. bash-5.1# systemctl show dracut-pre-pivot.service -P After --no-pager dracut-initqueue.service system.slice initrd.target dracut-mount.service sysroot.mount dracut-pre-mount.service systemd-journald.socket initrd-parse-etc.service remote-fs.target bash-5.1# systemctl show initrd.target -P After --no-pager initrd-root-fs.target initrd-root-device.target initrd-usr-fs.target "dev-disk-by\\x2duuid-68348daf\\x2dc38c\\x2d4ffc\\x2dbd9b\\x2da0bdb9ac225d.device" ignition-complete.target basic.target rescue.target ignition-remove-reconfig_system.service rescue.service initrd-fs.target bash-5.1# systemctl show initrd-fs.target -P After --no-pager initrd-parse-etc.service sysroot-etc.mount sysroot-root.mount systemd-growfs@sysroot-var.service sysroot-var.mount
Created attachment 857254 [details] systemd debug log I wrote a combustion script which adds a debug script to the pre-pivot hook on boot. With that, it's visible that initrd-fs.target isn't active yet when dracut-pre-pivot.service runs. The full systemd debug log is attached. Which of the units are active is random, in this particular run sysroot-etc.mount was fast enough and is shown as active.
(In reply to Fabian Vogt from comment #4) > Created attachment 857254 [details] > systemd debug log > > I wrote a combustion script which adds a debug script to the pre-pivot hook > on boot. With that, it's visible that initrd-fs.target isn't active yet when > dracut-pre-pivot.service runs. The full systemd debug log is attached. Which > of the units are active is random, in this particular run sysroot-etc.mount > was fast enough and is shown as active. There is probably a dependency missing. On my tumbleweed machine I can see that dracut-pre-pivot.service has no dependency to sysroot-etc.mount, only to sysroot.mount: --> # cat /usr/lib/systemd/system/dracut-pre-pivot.service # This file is part of dracut. # # See dracut.bootup(7) for details [Unit] Description=dracut pre-pivot and cleanup hook Documentation=man:dracut-pre-pivot.service(8) DefaultDependencies=no After=initrd.target initrd-parse-etc.service sysroot.mount After=dracut-initqueue.service dracut-pre-mount.service dracut-mount.service Before=initrd-cleanup.service Wants=remote-fs.target After=remote-fs.target --< Fabian, can you add such a dep and retry?
I found these lines in the log: ignition-mount.service: Trying to enqueue job ignition-mount.service/stop/replace ignition-complete.target: Installed new job ignition-complete.target/stop as 88 initrd.target: Job 1 initrd.target/start finished, result=canceled It looks like the fix/workaround for boo#1176746 is the cause. The added "systemctl stop ignition-mount.serivce" indirectly stops initrd.target, which causes systemd to ignore ordering relative to that as it's no longer part of the transaction. So we need to find a different solution to the bug, e.g.: a) Splitting ignition-mount.service's ExecStop into a separate ignition-umount.service b) Disabling ignition-mount's ExecStop in a dropin and run that from combustion c) Some other way... (In reply to Thomas Blume from comment #5) > (In reply to Fabian Vogt from comment #4) > > Created attachment 857254 [details] > > systemd debug log > > > > I wrote a combustion script which adds a debug script to the pre-pivot hook > > on boot. With that, it's visible that initrd-fs.target isn't active yet when > > dracut-pre-pivot.service runs. The full systemd debug log is attached. Which > > of the units are active is random, in this particular run sysroot-etc.mount > > was fast enough and is shown as active. > > There is probably a dependency missing. > On my tumbleweed machine I can see that dracut-pre-pivot.service has no > dependency to sysroot-etc.mount, only to sysroot.mount: That's expected, you'll only see that if you have an /etc mountpoint with x-initrd.mount in /etc/fstab. It'll then get pulled in through initrd-fs.target -> initrd.target. > --> > # cat /usr/lib/systemd/system/dracut-pre-pivot.service > # This file is part of dracut. > # > # See dracut.bootup(7) for details > > [Unit] > Description=dracut pre-pivot and cleanup hook > Documentation=man:dracut-pre-pivot.service(8) > DefaultDependencies=no > After=initrd.target initrd-parse-etc.service sysroot.mount > After=dracut-initqueue.service dracut-pre-mount.service dracut-mount.service > Before=initrd-cleanup.service > Wants=remote-fs.target > After=remote-fs.target > --< > > Fabian, can you add such a dep and retry?
(In reply to Fabian Vogt from comment #6) > I found these lines in the log: > > ignition-mount.service: Trying to enqueue job > ignition-mount.service/stop/replace > ignition-complete.target: Installed new job ignition-complete.target/stop as > 88 > initrd.target: Job 1 initrd.target/start finished, result=canceled > > It looks like the fix/workaround for boo#1176746 is the cause. The added > "systemctl stop ignition-mount.serivce" indirectly stops initrd.target, which > causes systemd to ignore ordering relative to that as it's no longer part of > the > transaction. > > So we need to find a different solution to the bug, e.g.: > > a) Splitting ignition-mount.service's ExecStop into a separate > ignition-umount.service > b) Disabling ignition-mount's ExecStop in a dropin and run that from > combustion For now we decided to go with b): https://build.opensuse.org/request/show/964654 Ideally we find some better solution for compatibility with ignition upstream. I'll leave the bug open for that. > c) Some other way... > > (In reply to Thomas Blume from comment #5) > > (In reply to Fabian Vogt from comment #4) > > > Created attachment 857254 [details] > > > systemd debug log > > > > > > I wrote a combustion script which adds a debug script to the pre-pivot hook > > > on boot. With that, it's visible that initrd-fs.target isn't active yet when > > > dracut-pre-pivot.service runs. The full systemd debug log is attached. Which > > > of the units are active is random, in this particular run sysroot-etc.mount > > > was fast enough and is shown as active. > > > > There is probably a dependency missing. > > On my tumbleweed machine I can see that dracut-pre-pivot.service has no > > dependency to sysroot-etc.mount, only to sysroot.mount: > > That's expected, you'll only see that if you have an /etc mountpoint with > x-initrd.mount in /etc/fstab. It'll then get pulled in through > initrd-fs.target -> initrd.target. > > > --> > > # cat /usr/lib/systemd/system/dracut-pre-pivot.service > > # This file is part of dracut. > > # > > # See dracut.bootup(7) for details > > > > [Unit] > > Description=dracut pre-pivot and cleanup hook > > Documentation=man:dracut-pre-pivot.service(8) > > DefaultDependencies=no > > After=initrd.target initrd-parse-etc.service sysroot.mount > > After=dracut-initqueue.service dracut-pre-mount.service dracut-mount.service > > Before=initrd-cleanup.service > > Wants=remote-fs.target > > After=remote-fs.target > > --< > > > > Fabian, can you add such a dep and retry?