Bug 1221494

Summary: Boot fails with systemd-executor unable to execute
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Robrert Horn <rjhorniii>
Component: Bootloader
Assignee: systemd maintainers <systemd-maintainers>
Status: RESOLVED WORKSFORME
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P1 - Urgent
CC: antonio.feijoo, kukuk, rjhorniii, thorsteinn-opensuse
Version: Current
Target Milestone: ---
Hardware: x86
OS: openSUSE Tumbleweed
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Robrert Horn 2024-03-16 00:10:14 UTC
This failure occurred shortly after a `zypper dup` on Tumbleweed on 15 March 2024 at 06:46:23 PM EDT.  The boot failed almost immediately with a message that systemd-executor was unable to execute.  There was also a mention of a firmware update (for UEFI?).

I don't have much more info since I immediately did a rollback to the pre-upgrade snapshot.  This is not a convenient time for me to resolve boot issues by experimenting.
Comment 1 Robrert Horn 2024-03-17 12:25:03 UTC
My speculation after a quick look at the pre and post snapshots is that this is the result of an incomplete switch from Grub to UEFI systemd boot.  One clue of sorts is that the EFI partition is completely empty.

The machine is old (the latest BIOS is from 2015) and was configured for GRUB booting at that time, after some puzzling issues between the 2015 version of openSUSE and that version of the UEFI BIOS.  It is configured in the BIOS for legacy boot, and secure boot is disabled.

My guess is that something in package dependency checking got confused.  It should have left the boot alone.  

I do have the pre and post snapshots to examine if there is something that would help.
Comment 2 Þorsteinn Jón Gautason 2024-03-17 22:06:28 UTC
I might have hit the same bug. After an update today, 17 March, booting fails immediately, and the only log output on screen is:

systemd[1]: Failed to open executor binary '/usr/lib/systemd/systemd-executor': No such file or directory
systemd[1]: Failed to allocate manager object: No such file or directory
[!!!!!!] Failed to allocate manager object.
systemd[1]: Freezing execution.

Booting into recovery mode, the previously installed kernel, or the recovery mode of the previously installed kernel also fails.

This is on a recent ThinkPad X1 Carbon; I'm not sure whether there were any UEFI updates.
Comment 3 Antonio Feijoo 2024-03-18 08:15:11 UTC
(In reply to Þorsteinn Jón Gautason from comment #2)
> I might have hit the same bug, after an update today, 17th of March, booting
> immediately and the whole of the logs on screen are:
> 
> systemd[1]: Failed to open executor binary
> '/usr/lib/systemd/systemd-executor': No such file or directory

This seems to be a different bug. The initrd must contain the new `systemd-executor` binary, so it should be regenerated after the update. Could you confirm that with `lsinitrd /boot/<latest-kver> | grep systemd-executor`?
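
A minimal sketch of that check, wrapped so the result is explicit. The function name `has_systemd_executor` is hypothetical, and the sketch assumes the `lsinitrd` listing shows file paths without a leading slash:

```shell
# Hypothetical helper (not part of dracut or systemd): reads an initrd
# file listing on stdin and reports whether the executor binary is listed.
has_systemd_executor() {
    if grep -q 'usr/lib/systemd/systemd-executor'; then
        echo "present"
    else
        echo "missing"
    fi
}

# Usage on a real system; <latest-kver> stays a placeholder for the initrd file:
#   lsinitrd /boot/<latest-kver> | has_systemd_executor
```

If this prints "missing", the initrd predates the new systemd and needs to be regenerated.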
Comment 4 Thorsten Kukuk 2024-03-18 08:37:51 UTC
(In reply to Robrert Horn from comment #1)
> My speculation after a quick look at the pre and post snapshots is that this
> is the result of an incomplete switch from Grub to UEFI systemd boot.  One
> clue of sorts is that the EFI partition is completely empty.

We don't do such a switch; this is something you need to do manually.

Could it be that you didn't have the package "suse-module-tools-scriptlets" installed and, to fulfill dependencies, "sdbootutil-rpm-scriptlets" got chosen by zypper?
Comment 5 Þorsteinn Jón Gautason 2024-03-18 09:10:40 UTC
(In reply to Antonio Feijoo from comment #3)
> (In reply to Þorsteinn Jón Gautason from comment #2)
> > I might have hit the same bug, after an update today, 17th of March, booting
> > immediately and the whole of the logs on screen are:
> > 
> > systemd[1]: Failed to open executor binary
> > '/usr/lib/systemd/systemd-executor': No such file or directory
> 
> This seems a different bug. The initrd must contain the new
> `systemd-executor` binary, so it should be regenerated after the update.
> Could you confirm that with `lsinitrd /boot/<latest-kver> | grep
> systemd-executor`?

Should I submit another bug report for this?

  # lsinitrd /boot/initrd-6.7.9-1-default | grep systemd-executor

returns nothing.
Comment 6 Antonio Feijoo 2024-03-18 09:18:53 UTC
(In reply to Þorsteinn Jón Gautason from comment #5)
> Should I submit another bug report for this?
> 
>   # lsinitrd /boot/initrd-6.7.9-1-default | grep systemd-executor
> 
> returns nothing.

Two questions:
- Could you run `dracut -f --regenerate-all` and check again?
- Did you update your systemd using `zypper dup` or `zypper update`?
Comment 7 Þorsteinn Jón Gautason 2024-03-18 09:23:35 UTC
(In reply to Antonio Feijoo from comment #6)
> (In reply to Þorsteinn Jón Gautason from comment #5)
> > Should I submit another bug report for this?
> > 
> >   # lsinitrd /boot/initrd-6.7.9-1-default | grep systemd-executor
> > 
> > returns nothing.
> 
> Two questions:
> - Could you run `dracut -f --regenerate-all` and check again? 
> - Did you update your systemd using `zypper dup` or `zypper update`?

1. Still the same after running `dracut -f --regenerate-all`. I saw that it did generate a new /boot/initrd-6.7.9-1-default file, but the problem persists.
2. I updated by calling `zypper update`.
Comment 8 Þorsteinn Jón Gautason 2024-03-18 09:24:30 UTC
(In reply to Þorsteinn Jón Gautason from comment #7)
> (In reply to Antonio Feijoo from comment #6)
> > (In reply to Þorsteinn Jón Gautason from comment #5)
> > > Should I submit another bug report for this?
> > > 
> > >   # lsinitrd /boot/initrd-6.7.9-1-default | grep systemd-executor
> > > 
> > > returns nothing.
> > 
> > Two questions:
> > - Could you run `dracut -f --regenerate-all` and check again? 
> > - Did you update your systemd using `zypper dup` or `zypper update`?
> 
> 1. Still the same after running `dracut -f --regenerate-all`, I saw that it
> did generate a new /boot/initrd-6.7.9-1-default file but the problem persists
> 2. I updated by calling `zypper update`

The command above was run on a fully up to date system.
Comment 9 Antonio Feijoo 2024-03-18 09:29:14 UTC
(In reply to Þorsteinn Jón Gautason from comment #7)
> (In reply to Antonio Feijoo from comment #6)
> > Two questions:
> > - Could you run `dracut -f --regenerate-all` and check again? 
> > - Did you update your systemd using `zypper dup` or `zypper update`?
> 
> 1. Still the same after running `dracut -f --regenerate-all`, I saw that it
> did generate a new /boot/initrd-6.7.9-1-default file but the problem persists

I think you are running this command on a previous snapshot with the previous systemd version (v254), so the required binary (/usr/lib/systemd/systemd-executor) does not exist yet. I guess you can `zypper dup` from there to fix everything and boot into the new snapshot.

> 2. I updated by calling `zypper update`

That's why the initrd was not regenerated; in Tumbleweed you must use only `zypper dup`.
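
A rough sketch of how one might tell whether the booted snapshot already has the new binary before regenerating the initrd. The helper name `executor_installed` is hypothetical; the path is the one from the error message:

```shell
# Hypothetical helper: succeeds if the executor binary exists under the
# given root directory (defaults to the running system).
executor_installed() {
    root="${1:-/}"
    [ -e "${root%/}/usr/lib/systemd/systemd-executor" ]
}

# Usage, as root, from the snapshot you booted into:
#   executor_installed / || echo "old systemd in this snapshot; run zypper dup first"
```

If the binary is absent, `dracut -f --regenerate-all` has nothing to copy into the initrd, which would match the behaviour seen in comment 7.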
Comment 10 Þorsteinn Jón Gautason 2024-03-18 09:51:55 UTC
(In reply to Antonio Feijoo from comment #9)
> (In reply to Þorsteinn Jón Gautason from comment #7)
> > (In reply to Antonio Feijoo from comment #6)
> > > Two questions:
> > > - Could you run `dracut -f --regenerate-all` and check again? 
> > > - Did you update your systemd using `zypper dup` or `zypper update`?
> > 
> > 1. Still the same after running `dracut -f --regenerate-all`, I saw that it
> > did generate a new /boot/initrd-6.7.9-1-default file but the problem persists
> 
> I think you are running this command on a previous snapshot with the
> previous systemd version (v254), so the required binary
> (/usr/lib/systemd/systemd-executor) does not exist yet. I guess you can
> `zypper dup` from there to fix everything and boot into the new snapshot.
> 
> > 2. I updated by calling `zypper update`
> 
> That's why the initrd was not regenerated, in Tumbleweed you must use only
> `zypper dup`.

Well, I probably should have known that. Sorry for the confusion, and thanks a lot for your help; the system boots fine now.
Comment 11 Robrert Horn 2024-03-18 12:48:17 UTC
I checked and a version of "suse-module-tools-scriptlets" was installed 03 Feb 2024.  The "sdbootutil-rpm-scriptlets" is not installed.

I found that the zypper.log and /var/log/zypp/history were still there after the rollback.  The "sdbootutil-rpm-scriptlets" is not mentioned in them.  The dracut output is in the history file.  I didn't notice anything obviously wrong, but I don't know what to look for.  The history is rather large (1 MB gzipped) but I can attach it if that helps.  (There were some dependency issues for vlc and digikam, but that is probably just a distraction.)

I'm at a conference this week, so I can look at files but it's awkward to rebuild or reboot.

This may be somehow related to the other report.  I got the same error messages.  But in my case I was definitely using `zypper dup`.
Comment 12 Robrert Horn 2024-03-25 16:42:18 UTC
My conference is over, and I installed the latest Tumbleweed onto an unused partition on another disk, and then onto a new blank SSD.  The problem is not repeatable on those.  Whatever is causing the problem is something very specific to the one disk and its contents.

I cleared the partition table on another SSD and let tumbleweed installer do everything default during installation of a basic XFCE system.  It all installed and the system boots.  I restored user files from backup immediately before switching drives, and have restored most of the software packages.  No problems so far.

So I'd mark this failure as unable to reproduce.

I was surprised by one thing.  The default installation set up an MBR-bootable disk with no EFI partition on the SSD with an empty partition table.  This works, but I had to manually add the drive to the boot list in the BIOS.  I'm not sure how the BIOS decides what is bootable, but I'm guessing it looks for the EFI partition.  Even my old system has a UEFI BIOS, and it's 9 years old.

I can live with that.
Comment 13 Antonio Feijoo 2024-04-15 14:20:14 UTC
(In reply to Robrert Horn from comment #12)
> My conference is over and I tried the latest tumbleweed onto an unused
> partition on another disk, and then onto a new blank SSD. The problem is not
> repeatable on those.  Whatever is causing the problem is something very
> specific to the one disk and its contents.
> 
> I cleared the partition table on another SSD and let tumbleweed installer do
> everything default during installation of a basic XFCE system.  It all
> installed and the system boots.  I restored user files from backup
> immediately before switching drives, and have restored most of the software
> packages.  No problems so far.
> 
> So I'd mark this failure as unable to reproduce.

Closing this bug report, then.