Bug 1215351

Summary: [Build 4.2.5] systemd-udevd reports unknown groups on firstboot
Product: [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP5 Reporter: Martin Loviska <mloviska>
Component: systemd Assignee: systemd maintainers <systemd-maintainers>
Status: RESOLVED FIXED QA Contact:
Severity: Normal    
Priority: P1 - Urgent CC: antonio.feijoo, dcassany, fbui, felix.niederwanger, jalausuch, jeos-internal, jsrain, kukuk, lubos.kocman, marcus.schaefer, mloviska, qe-virt, wchen, xlai
Version: unspecified Flags: fbui: SHIP_STOPPER?
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://openqa.suse.de/tests/12096040/modules/journal_check/steps/3
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---
Attachments: journal

Description Martin Loviska 2023-09-14 14:07:14 UTC
## Observation

openQA test in scenario sle-15-SP5-JeOS-for-kvm-and-xen-QR-x86_64-jeos-main@uefi-virtio-vga fails in
[journal_check](https://openqa.suse.de/tests/12096040/modules/journal_check/steps/3)

Affected images: 
* sle-micro 5.5
* minimal-vm 15-SP5
* minimal-vm 15-SP6

Although systemd-sysusers.service runs fine and the system-group-hardware package is present on the system, the error messages below appear in the system logs.

```
Sep 13 12:13:23 localhost systemd-tmpfiles[199]: /usr/lib/tmpfiles.d/systemd.conf:11: Failed to resolve group 'utmp'.
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:18 Unknown group 'tty', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:27 Unknown group 'kmem', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:29 Unknown group 'input', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:32 Unknown group 'video', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:40 Unknown group 'render', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:42 Unknown group 'sgx', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:48 Unknown group 'audio', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:67 Unknown group 'lp', ignoring
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:69 Unknown group 'disk', ignoring
```

```
localhost:~ # journalctl -u systemd-udevd.service
Sep 13 12:13:24 localhost systemd[1]: Starting Rule-based Manager for Device Events and Files...
Sep 13 12:13:24 localhost systemd-udevd[297]: Network interface NamePolicy= disabled by default.
Sep 13 12:13:24 localhost systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:18 Unknown group 'tty', ignoring
```

```
localhost:~ # journalctl -u systemd-sysusers.service
Sep 13 12:13:30 localhost systemd[1]: Starting Create System Users...
Sep 13 12:13:30 localhost systemd-sysusers[492]: Creating group kmem with gid 488.
Sep 13 12:13:30 localhost systemd-sysusers[492]: Creating group lock with gid 487.
Sep 13 12:13:30 localhost systemd-sysusers[492]: Creating group tty with gid 5.
```

```
localhost:~ # cat /usr/lib/sysusers.d/system-group-hardware.conf
# Type Name ID GECOS [HOME]
# Access to certain kernel and userspace facilities
g kmem    -     -
g lock    -     -
g tty     5     -
g utmp    -     -
# Hardware access groups
g audio   -     -
g cdrom   -     -
g dialout -     -
g disk    -     -
g input   -     -
g lp      -     -
g render  -     -
g sgx     -     -
g tape    -     -
g video   -     -
```
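
To cross-check the listings above, a minimal Python sketch (not part of the original report) can extract the group names from the journal warnings and compare them with the `g` entries of system-group-hardware.conf. The helper names `reported_groups` and `declared_groups` are made up for illustration, and the sample strings are abbreviated copies of the output quoted above.

```python
import re

# Abbreviated copies of the journal lines quoted in this report.
JOURNAL = """\
systemd-tmpfiles[199]: /usr/lib/tmpfiles.d/systemd.conf:11: Failed to resolve group 'utmp'.
systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:18 Unknown group 'tty', ignoring
systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:27 Unknown group 'kmem', ignoring
systemd-udevd[297]: /usr/lib/udev/rules.d/50-udev-default.rules:69 Unknown group 'disk', ignoring
"""

# Abbreviated copy of system-group-hardware.conf as shown above.
SYSUSERS_CONF = """\
# Type Name ID GECOS [HOME]
g kmem    -     -
g lock    -     -
g tty     5     -
g utmp    -     -
g disk    -     -
"""


def reported_groups(journal: str) -> set[str]:
    """Group names from 'Unknown group' / 'Failed to resolve group' messages."""
    pattern = re.compile(r"(?:Unknown group|Failed to resolve group) '([^']+)'")
    return set(pattern.findall(journal))


def declared_groups(conf: str) -> set[str]:
    """Names declared by 'g' lines in a sysusers.d fragment."""
    names = set()
    for line in conf.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "g":
            names.add(fields[1])
    return names


reported = reported_groups(JOURNAL)
declared = declared_groups(SYSUSERS_CONF)
# An empty list means every group the services complained about is declared.
print(sorted(reported - declared))
```

With the sample data the difference is empty, which is consistent with the observation that the groups are declared but were not yet known when udevd and tmpfiles ran.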

Packages:
system-group-hardware-20170617-150400.22.33.noarch
udev-249.16-150400.8.33.1.x86_64
systemd-249.16-150400.8.33.1.x86_64



## Test suite description



## Reproducible

Fails since (at least) Build [4.2.1](https://openqa.suse.de/tests/11936037)


## Expected result

Last good: (unknown) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=JeOS-for-kvm-and-xen-QR&machine=uefi-virtio-vga&test=jeos-main&version=15-SP5)
Comment 1 Martin Loviska 2023-09-14 14:07:47 UTC
Created attachment 869513 [details]
journal
Comment 2 Jiri Srain 2023-09-14 14:49:27 UTC
Based on my tests (Micro 5.5 snapshot), the reason seems to be that

ConditionNeedsUpdate=/etc

in systemd-sysusers.service is never met. When I manually called 

systemd-sysusers

the missing groups were created on my system.

Systemd team: This seems to be blocking the RC release of SLE Micro 5.5; many openQA tests are failing because of this. Could you please look into it with the highest possible priority?

Thank you very much!
Comment 3 Franck Bui 2023-09-15 06:38:17 UTC
Apparently the logs show that the problem is occurring inside the initrd.

I tried to reproduce the issue by downloading https://openqa.suse.de/tests/11936037/asset/hdd/SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.1.qcow2 but neither the first boot nor any later ones show the problem.
Comment 4 Franck Bui 2023-09-15 06:43:29 UTC
The problem is likely related to the content of /etc/group inside the initramfs, which is probably empty or not complete.

Martin, do you have a chance to check that ? You simply need to break inside initrd by appending "rd.break" to boot options list and once the system stops at the end of the initrd process, check the content of /etc/group.
Comment 5 Jiri Srain 2023-09-15 06:45:56 UTC
Could you, pls, try the SLE Micro images? Yesterday I reproduced that with the SelfInstall ISO of SLE Micro snapshot:

https://dist.suse.de/ibs/SUSE:/SLE-15-SP5:/Update:/Products:/Micro55:/TEST/images/iso/SLE-Micro.x86_64-5.5.0-Default-SelfInstall-Build3.12.install.iso

Attempts to restart the service (in running system) report that ConditionNeedsUpdate=/etc was not met no matter which changes in /etc I tried (but I may not understand that correctly). I'm not sure it is supposed to work either.

I can confirm that /etc/group is missing the groups in the running system too.

If there is anything I can grab from my system to help debugging this, please, tell me.
Comment 6 Franck Bui 2023-09-15 06:46:49 UTC
Antonio, I'm not sure how /etc/group is created inside initrd.

There's a dracut module "01systemd-sysusers" that seems to take care of embedding systemd-sysusers but apparently this module is never used on any SLE systems I have.

Could you shed some light here on the way /etc/group is initialized in initrd ?

Thanks.
Comment 7 Franck Bui 2023-09-15 06:55:06 UTC
(In reply to Jiri Srain from comment #5)
> Could you, pls, try the SLE Micro images? Yesterday I reproduced that with
> the SelfInstall ISO of SLE Micro snapshot:
> 
> https://dist.suse.de/ibs/SUSE:/SLE-15-SP5:/Update:/Products:/Micro55:/TEST/
> images/iso/SLE-Micro.x86_64-5.5.0-Default-SelfInstall-Build3.12.install.iso

Download in progress...

> Attempts to restart the service (in running system) report that
> ConditionNeedsUpdate=/etc was not met no matter which changes in /etc I
> tried (but I may not understand that correctly). I'm not sure it is supposed
> to work either.

If you tried to restart systemd-sysusers.service in the host, that's expected. This service is running only if something changed in /usr, ie when the content of /usr/lib/sysusers.d has a chance to have been updated.
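
The behaviour described above can be sketched roughly in Python. This is a simplified illustration, assuming the documented rule that the condition compares the modification time of /usr with a stamp file named .updated in the checked directory; the real implementation handles further cases that are left out here. The function name `needs_update` is hypothetical.

```python
import os
import tempfile


def needs_update(usr_dir: str, target_dir: str) -> bool:
    """Simplified check: True if usr_dir is newer than target_dir/.updated."""
    stamp = os.path.join(target_dir, ".updated")
    try:
        stamp_mtime = os.stat(stamp).st_mtime_ns
    except FileNotFoundError:
        # No stamp file yet: treat the directory as needing an update.
        return True
    return os.stat(usr_dir).st_mtime_ns > stamp_mtime


# Demo with throwaway directories standing in for /usr and /etc.
with tempfile.TemporaryDirectory() as root:
    usr = os.path.join(root, "usr")
    etc = os.path.join(root, "etc")
    os.mkdir(usr)
    os.mkdir(etc)
    first_boot = needs_update(usr, etc)      # no stamp file yet

    stamp = os.path.join(etc, ".updated")
    open(stamp, "w").close()
    os.utime(usr, ns=(1_000, 1_000))         # /usr older than the stamp
    os.utime(stamp, ns=(2_000, 2_000))
    after_stamp = needs_update(usr, etc)

    os.utime(usr, ns=(3_000, 3_000))         # /usr modified after the stamp
    after_usr_change = needs_update(usr, etc)

print(first_boot, after_stamp, after_usr_change)
```

In this model, editing other files under /etc never changes the result; only a newer /usr (or a missing stamp) does.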
Comment 8 Antonio Feijoo 2023-09-15 06:57:15 UTC
(In reply to Franck Bui from comment #6)
> Antonio, I'm not sure how /etc/group is created inside initrd.
> 
> There's a dracut module "01systemd-sysusers" that seems to take care of
> embedding systemd-sysusers but apparently this module is never used on any
> SLE systems I have.
> 
> Could you shed some light here on the way /etc/group is initialized in
> initrd ?

On the systemd [1] and udev-rules [2] modules. The systemd-sysusers module is only pulled in as a dependency by a few modules which are not usually included by default. There was an attempt to create sysusers at build time [3], but it was not merged.

[1] https://github.com/dracutdevs/dracut/blob/6acfecae572fb457115b276b5b64d9424ad5187b/modules.d/00systemd/module-setup.sh#L203-L211

[2] https://github.com/dracutdevs/dracut/blob/6acfecae572fb457115b276b5b64d9424ad5187b/modules.d/95udev-rules/module-setup.sh#L51-L64

[3] https://github.com/dracutdevs/dracut/pull/2067
Comment 9 Jiri Srain 2023-09-15 06:57:48 UTC
(In reply to Franck Bui from comment #7)
> (In reply to Jiri Srain from comment #5)

> > Attempts to restart the service (in running system) report that
> > ConditionNeedsUpdate=/etc was not met no matter which changes in /etc I
> > tried (but I may not understand that correctly). I'm not sure it is supposed
> > to work either.
> 
> If you tried to restart systemd-sysusers.service in the host, that's
> expected. This service is running only if something changed in /usr, ie when
> the content of /usr/lib/sysusers.d has a chance to have been updated.

Thanks for clarification. The condition in the .service file leading to checking for changes in /etc is then a bit confusing.
Comment 10 Franck Bui 2023-09-15 07:12:50 UTC
(In reply to Jiri Srain from comment #9)
> Thanks for clarification. The condition in the .service file leading to
> checking for changes in /etc is then a bit confusing.

"ConditionNeedsUpdate=/etc", basically means: check that the content of /usr is newer than the content of /etc. If it's the case ConditionNeedsUpdate is true.
Comment 11 Franck Bui 2023-09-15 07:18:09 UTC
(In reply to Antonio Feijoo from comment #8)
> (In reply to Franck Bui from comment #6)
> > Could you shed some light here on the way /etc/group is initialized in
> > initrd ?
> 
> On the systemd [1] and udev-rules [2] modules.

Hmm some of the "hardware" groups are missing there such as "tty", "kmem", etc...
Comment 12 Jiri Srain 2023-09-15 07:23:10 UTC
(In reply to Franck Bui from comment #11)
> (In reply to Antonio Feijoo from comment #8)
> > (In reply to Franck Bui from comment #6)
> > > Could you shed some light here on the way /etc/group is initialized in
> > > initrd ?
> > 
> > On the systemd [1] and udev-rules [2] modules.
> 
> Hmm some of the "hardware" groups are missing there such as "tty", "kmem",
> etc...

Yes, this is exactly what we see in Micro snapshot.
Comment 13 Antonio Feijoo 2023-09-15 07:33:54 UTC
(In reply to Antonio Feijoo from comment #8)
> (In reply to Franck Bui from comment #6)
> > Antonio, I'm not sure how /etc/group is created inside initrd.
> > 
> > There's a dracut module "01systemd-sysusers" that seems to take care of
> > embedding systemd-sysusers but apparently this module is never used on any
> > SLE systems I have.
> > 
> > Could you shed some light here on the way /etc/group is initialized in
> > initrd ?
> 
> On the systemd [1] and udev-rules [2] modules. The systemd-sysusers module
> is only pulled in as a dependency by a few modules which are not usually
> included by default. There was an attempt to create sysusers at build time
> [3], but it was not merged.
> 
> [1]
> https://github.com/dracutdevs/dracut/blob/
> 6acfecae572fb457115b276b5b64d9424ad5187b/modules.d/00systemd/module-setup.
> sh#L203-L211
> 
> [2]
> https://github.com/dracutdevs/dracut/blob/
> 6acfecae572fb457115b276b5b64d9424ad5187b/modules.d/95udev-rules/module-setup.
> sh#L51-L64
> 
> [3] https://github.com/dracutdevs/dracut/pull/2067

I forgot the inst_rule_group_owner() function [1], it adds all the user/groups referenced by the installed udev rules.

[1] https://github.com/dracutdevs/dracut/blob/6acfecae572fb457115b276b5b64d9424ad5187b/dracut-init.sh#L482-L498
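
For readers unfamiliar with that helper, the following Python sketch mimics the idea under stated assumptions: it collects names from GROUP= assignments in a rules file and copies the matching lines from the build host's /etc/group. It is not dracut's code (the real helper is a shell function and also handles users), and `groups_for_initrd` as well as the sample rules and group data are hypothetical.

```python
import re

# Illustrative rules only; not the real 50-udev-default.rules content.
RULES = '''\
SUBSYSTEM=="tty", KERNEL=="tty[0-9]*", GROUP="tty", MODE="0620"
SUBSYSTEM=="input", GROUP="input"
SUBSYSTEM=="block", GROUP="disk"
'''

# Hypothetical host /etc/group that lacks 'tty' and 'input'.
HOST_GROUP = '''\
root:x:0:
disk:x:487:
'''


def groups_for_initrd(rules: str, host_group: str) -> list[str]:
    """Return the host /etc/group lines for groups assigned by the rules."""
    # GROUP="name" is an assignment; GROUP=="name" (a match) is not picked up.
    wanted = re.findall(r'\bGROUP="([^" ]+)"', rules)
    by_name = {
        line.split(":", 1)[0]: line
        for line in host_group.splitlines()
        if line
    }
    copied: list[str] = []
    for name in wanted:
        line = by_name.get(name)
        # Groups unknown on the host are silently skipped.
        if line is not None and line not in copied:
            copied.append(line)
    return copied


print(groups_for_initrd(RULES, HOST_GROUP))
```

The point of the sketch is that a group missing from the host's /etc/group cannot end up in the initrd either, whatever the rules reference.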
Comment 14 Antonio Feijoo 2023-09-15 08:20:25 UTC
Was there any change in the way the image is built?

$ sudo guestmount -a SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5.qcow2 -m /dev/vda3 SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5/
$ sudo lsinitrd -f etc/group SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5/boot/initrd-5.14.21-150500.55.19-default
systemd-journal:x:498:
wheel:x:493:
root:x:0:
systemd-network:x:497:
kvm:x:36:
cdrom:x:11:
tape:x:33:
dialout:x:18:
floppy:x:19:
Comment 15 Franck Bui 2023-09-15 08:22:46 UTC
(In reply to Antonio Feijoo from comment #14)
> $ sudo lsinitrd -f etc/group
> SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5/boot/initrd-5.14.21-
> 150500.55.19-default
> systemd-journal:x:498:
> wheel:x:493:
> root:x:0:
> systemd-network:x:497:
> kvm:x:36:
> cdrom:x:11:
> tape:x:33:
> dialout:x:18:
> floppy:x:19:

Jiri, can you please bring in someone who is familiar with the (specific) installation process of MicroOS?
Comment 16 Franck Bui 2023-09-15 08:24:28 UTC
Basically /etc/group is not populated as it should be during the installation process.
Comment 17 Jiri Srain 2023-09-15 09:17:36 UTC
There is no really specific installation process - this is just a dump of kiwi image to the disk, nothing else (which is similar to MinimalOS). Could be Marcus Schaefer or David Cassany who can explain specifics of the Dracut initrd, both are now in CC.
Comment 18 Martin Loviska 2023-09-15 09:18:01 UTC
(In reply to Franck Bui from comment #4)
> The problem is likely related to the content of /etc/group inside the
> initramfs, which is probably empty or not complete.
> 
> Martin, do you have a chance to check that ? You simply need to break inside
> initrd by appending "rd.break" to boot options list and once the system
> stops at the end of the initrd process, check the content of /etc/group.

Sorry, maybe I am coming late for the party. Nevertheless, I have checked the minimal-vm of 15-SP6.

Here is the /etc/group file from initrd.

###########
systemd-journal:x:498:
wheel:x:493:
root:x:0:
systemd-network:x:497:
kvm:x:36:
cdrom:x:11:
tape:x:33:
dialout:x:18:
floppy:x:19:
###########

I have not found /usr/lib/sysusers.d/system-group-hardware.conf file that eventually contains the users in initrd
Comment 19 Martin Loviska 2023-09-15 09:19:08 UTC
> I have not found /usr/lib/sysusers.d/system-group-hardware.conf file that
> eventually contains the users in initrd

My bad, s/users/groups
Comment 20 Antonio Feijoo 2023-09-15 09:27:31 UTC
(In reply to Antonio Feijoo from comment #14)
> Was there any change in the way the image is built?
> 
> $ sudo guestmount -a
> SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5.qcow2 -m /dev/vda3
> SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5/
> $ sudo lsinitrd -f etc/group
> SLES15-SP5-Minimal-VM.x86_64-kvm-and-xen-Build4.2.5/boot/initrd-5.14.21-
> 150500.55.19-default
> systemd-journal:x:498:
> wheel:x:493:
> root:x:0:
> systemd-network:x:497:
> kvm:x:36:
> cdrom:x:11:
> tape:x:33:
> dialout:x:18:
> floppy:x:19:

FTR, the difference with the initrd provided by SLE-Micro.x86_64-5.3.0-Default-SelfInstall-GM.raw

$ sudo guestmount -a SLE-Micro.x86_64-5.3.0-Default-SelfInstall-GM.raw -m /dev/vda3 SLE-Micro.x86_64-5.3.0-Default-SelfInstall-GM
$ sudo lsinitrd -f etc/group SLE-Micro.x86_64-5.3.0-Default-SelfInstall-GM/boot/initrd-5.14.21-150400.24.18-default
systemd-journal:x:496:
wheel:x:476:
utmp:x:491:
root:x:0:
systemd-network:x:495:
messagebus:x:499:
nogroup:x:65533:
nobody:x:65534:
tty:x:5:
dialout:x:488:
kmem:x:493:
input:x:486:
video:x:481:
render:x:484:
sgx:x:483:
audio:x:490:
lp:x:485:
disk:x:487:
cdrom:x:489:
tape:x:482:
kvm:x:36:qemu
floppy:x:19:


dracut fills the /etc/group of the initrd based on the /etc/group of the running system, in this case the system where the bootable image is generated.
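
The difference between the two listings can be computed mechanically. The sketch below is an illustration only: the helper `group_names` is a made-up name, and the inputs are the two `lsinitrd` outputs quoted in this comment.

```python
# etc/group from the Build4.2.5 initrd (the failing image).
BROKEN = """\
systemd-journal:x:498:
wheel:x:493:
root:x:0:
systemd-network:x:497:
kvm:x:36:
cdrom:x:11:
tape:x:33:
dialout:x:18:
floppy:x:19:
"""

# etc/group from the SLE Micro 5.3 GM initrd (a working image).
GOOD = """\
systemd-journal:x:496:
wheel:x:476:
utmp:x:491:
root:x:0:
systemd-network:x:495:
messagebus:x:499:
nogroup:x:65533:
nobody:x:65534:
tty:x:5:
dialout:x:488:
kmem:x:493:
input:x:486:
video:x:481:
render:x:484:
sgx:x:483:
audio:x:490:
lp:x:485:
disk:x:487:
cdrom:x:489:
tape:x:482:
kvm:x:36:qemu
floppy:x:19:
"""


def group_names(group_file: str) -> set[str]:
    """First colon-separated field of every non-empty line."""
    return {
        line.split(":", 1)[0]
        for line in group_file.splitlines()
        if line.strip()
    }


missing = sorted(group_names(GOOD) - group_names(BROKEN))
print(missing)
```

The result includes every group named in the udevd and tmpfiles warnings from the description (tty, kmem, input, video, render, sgx, audio, lp, disk, utmp).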
Comment 21 Franck Bui 2023-09-15 09:40:03 UTC
(In reply to Antonio Feijoo from comment #20)
> FTR, the difference with the initrd provided by
> SLE-Micro.x86_64-5.3.0-Default-SelfInstall-GM.raw

I think that the fact that /etc/group from initrd is incomplete is just a consequence that the /etc/group of the system being installed is not correctly initialized.
Comment 22 Jiri Srain 2023-09-15 10:50:42 UTC
I tried, as a possible workaround, adding a call to systemd-sysusers at the end of image building (before creating the initrd). It seems to have created all the missing users. The build log is available at:

https://build.suse.de/build/home:jsrain:branches:SUSE:SLE-15-SP5:Update:Products:Micro55/images/x86_64/SLE-Micro:Default-RT-SelfInstall/_log

I only wonder whether or not this is a viable workaround.
Comment 23 Jiri Srain 2023-09-15 11:14:36 UTC
The image built with this change in config works correctly.

Franck, Antonio, do you see any issue if I use this workaround to unblock SLE Micro 5.5. RC1?
Comment 24 Antonio Feijoo 2023-09-15 12:04:53 UTC
(In reply to Jiri Srain from comment #23)
> The image built with this change in config works correctly.
> 
> Franck, Antonio, do you see any issue if I use this workaround to unblock
> SLE Micro 5.5. RC1?

If it solves the problem, I don't see any issue with this workaround, but I'm not a kiwi expert. I think it is still necessary to find the inner change that caused this.
Comment 25 Franck Bui 2023-09-15 12:18:29 UTC
(In reply to Antonio Feijoo from comment #24)
> (In reply to Jiri Srain from comment #23)
> > The image built with this change in config works correctly.
> > 
> > Franck, Antonio, do you see any issue if I use this workaround to unblock
> > SLE Micro 5.5. RC1?
> 
> If it solves the problem, I don't see any issue with this workaround, but
> I'm not a kiwi expert. I think it is still necessary to find the inner
> change that caused this.

I agree. This might be acceptable as a temporary workaround (if you need to release RC urgently) but IMHO the root cause of this issue should be investigated in any case.
Comment 26 Jiri Srain 2023-09-15 12:23:50 UTC
I by no means expected that to be the final solution - but it seems to me that it could even be a maintenance update (if released soon) for Micro.
Comment 30 Jose Lausuch 2023-09-15 15:19:07 UTC
*** Bug 1215365 has been marked as a duplicate of this bug. ***
Comment 31 Jose Lausuch 2023-09-17 08:51:19 UTC
*** Bug 1215363 has been marked as a duplicate of this bug. ***
Comment 32 xiaoli ai 2023-09-18 07:37:41 UTC
Based on virtualization test result in https://openqa.suse.de/tests/overview?build=22.2_4.2&groupid=515&version=5.5&distri=sle-micro, the issue does not happen in the new build.
Comment 35 Franck Bui 2023-09-19 13:09:26 UTC
FTR the installation logs showed:

> [   89s] [ DEBUG   ]: 15:22:32 | system: (313/544) Installing: system-group-hardware-20170617-150400.22.33.noarch [.
> [   89s] [ DEBUG   ]: 15:22:32 | system: warning: /var/cache/kiwi/packages/d5fb02be58068e14613f65622a261cfa/system-group-hardware.rpm: Header V3 RSA/SHA256 Signature, key ID 39db7c82: NOKEY
> [   89s] [ DEBUG   ]: 15:22:32 | system: .
> [   89s] [ DEBUG   ]: 15:22:32 | system: /var/tmp/rpm-tmp.fbfsM8: line 1: /usr/sbin/sysusers2shadow: No such file or directory

which suggested that the sysuser-shadow package was installed too late.

The latest version of sysuser-shadow has the following changes:

> * Thu Aug 10 2023 kukuk@suse.com
> - Remove all systemd requires, not supported on SLE15 [bsc#1214140]

and apparently using this version helped to fix the current problem.

However I'm not completely sure why it did, since the fact that sysuser-shadow doesn't depend on systemd anymore is irrelevant here: system-group-hardware only needs sysuser-shadow to be installed before it gets installed.

Furthermore in the installation logs I couldn't find any dep cycles that would have explained why sysuser-shadow couldn't have been installed before system-group-hardware.

Thorsten, maybe you have an explanation ?
Comment 36 Thorsten Kukuk 2023-09-19 14:07:55 UTC
(In reply to Franck Bui from comment #35)

> However I'm not completely sure why it did since the fact that
> sysuser-shadow doesn't depend on systemd anymore is irrelevant here:
> system-group-hardware only needs sysuser-shadow to be installed before it
> get installed.
> 
> Furthermore in the installation logs I couldn't find any dep cycles that
> would have explained why sysuser-shadow couldn't have been installed before
> system-group-hardware.
> 
> Thorsten, maybe you have an explanation ?

The last systemd maintenance update changed the Requires for the hardware groups, which introduced a new dependency cycle. libsolv breaks this dependency cycle at a random location; sometimes the result works, sometimes not.
In theory libsolv should be able to solve this dependency loop with the hints in the RPMs we added for this reason without breaking it at the wrong location. Nobody knows why this did not work, most likely the libsolv code in SLE15 is too old. 

I don't know why somebody decided to break installations and released systemd without the workaround in sysuser-tools.
Meanwhile sysuser-tools 3.2 finally got released and the problems should be "solved" (until the next dependency loop; all these backports from Factory to the SLE15 code base are really dangerous).
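
To make the effect of breaking a cycle at an arbitrary point concrete, here is a toy Python sketch. It is not libsolv's algorithm, and the three dependency edges are an assumption chosen for illustration (the actual cycle is not spelled out in this report); `install_order` is a hypothetical helper. A depth-first ordering ignores whichever edge closes the cycle, so the resulting install order depends on where the traversal starts.

```python
# Assumed edges, for illustration only: "X requires Y" means Y should go first.
REQUIRES = {
    "system-group-hardware": ["sysuser-shadow"],
    "sysuser-shadow": ["systemd"],
    "systemd": ["system-group-hardware"],
}


def install_order(start: str) -> list[str]:
    """Dependencies-first order; the edge that closes the cycle is ignored."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(pkg: str) -> None:
        if pkg in seen:
            # Already visited; if it is still in progress, this is the
            # cycle-closing edge that gets dropped.
            return
        seen.add(pkg)
        for dep in REQUIRES[pkg]:
            visit(dep)
        order.append(pkg)

    visit(start)
    return order


for start in ("systemd", "sysuser-shadow"):
    order = install_order(start)
    ok = order.index("sysuser-shadow") < order.index("system-group-hardware")
    print(start, order, "helper available" if ok else "helper missing")
```

Starting at `systemd` yields an order in which sysuser-shadow precedes system-group-hardware, so its scriptlet helper would be present. Starting at `sysuser-shadow` puts system-group-hardware first, which corresponds to the "No such file or directory" scriptlet failure seen in the installation log.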
Comment 37 Franck Bui 2023-09-26 16:54:35 UTC
Closing as this is solved with the latest version of sysuser-shadow.