|
Bugzilla – Full Text Bug Listing |
| Summary: | Boot-Failure after applying Patch openSUSE-2017-847 / 950 / 1005 on LUKS encrypted NVMe devices | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Forgotten User XlNtqid6F5 <forgotten_XlNtqid6F5> |
| Component: | Maintenance | Assignee: | Daniel Molkentin <daniel> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | fbui, forgotten_XlNtqid6F5, novell, olivpass |
| Version: | Leap 42.3 | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 42.3 | ||
| See Also: | http://bugzilla.opensuse.org/show_bug.cgi?id=1054616 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | Yes | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
output of journalctl
output of systemctl
output of 'systemctl status'
zypper history
udev file 60-persistent-storage.rules used by the YaST installer
Screenshot from YaST installer showing Device IDs of NVMe partition
/etc/crypttab created by YaST installer
/dev/disk/by-id/ listing while running YaST installer |
||
Created attachment 741781 [details]
output of systemctl
Created attachment 741782 [details]
output of 'systemctl status'
Created attachment 741783 [details]
zypper history
more debugging:
1) downgrade dracut to 044.1-23.2
2017-09-27 11:32:05|install|dracut|044.1-23.2|x86_64|root@linux-210y|repo-update|e3d230b5e79de0a603d1f5e4916760c965c908cf0d890fc1ebeb944ef1fb6c33|
1b) mkinitrd + reboot
>> boot FAILS / no improvement
2) downgrade dracut to 044-21.7 (with dependencies)
2017-09-27 11:39:52|install|systemd|228-27.2|x86_64|root@linux-210y|repo-oss|4bb2106ddadc9a02cbf5bc41db0cb23f936d4db0|
2017-09-27 11:39:52|install|udev|228-27.2|x86_64|root@linux-210y|repo-oss|6387f8b3aeb6926614d1d678823e25e1266b5076|
2017-09-27 11:39:52|install|systemd-sysvinit|228-27.2|x86_64|root@linux-210y|repo-oss|189fc4dccc70dd02414a4f6df6b3deac13af1587|
2017-09-27 11:39:53|install|dracut|044-21.7|x86_64|root@linux-210y|repo-oss|36de01742836205d426e946744525d96ab399b2b|
2b) mkinitrd + reboot
>> boot OK / Password query appears
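For anyone comparing package states across several machines, the zypper history lines quoted above can be read mechanically. The following is only a minimal sketch: the field names (`requested_by`, `repo`, and so on) are my own labels for the pipe-separated columns, not names taken from zypper, and only `install` records with eight fields are handled.

```python
from typing import NamedTuple, Optional


class InstallRecord(NamedTuple):
    # Field names are illustrative labels (an assumption), not zypper's own.
    timestamp: str
    action: str
    package: str
    version: str
    arch: str
    requested_by: str
    repo: str
    checksum: str


def parse_history_line(line: str) -> Optional[InstallRecord]:
    """Parse one pipe-separated 'install' line; return None for anything else."""
    fields = line.strip().rstrip("|").split("|")
    if len(fields) != 8 or fields[1] != "install":
        return None
    return InstallRecord(*fields)


# Lines copied from the downgrade steps in this comment.
HISTORY = [
    "2017-09-27 11:32:05|install|dracut|044.1-23.2|x86_64|root@linux-210y|repo-update|e3d230b5e79de0a603d1f5e4916760c965c908cf0d890fc1ebeb944ef1fb6c33|",
    "2017-09-27 11:39:52|install|udev|228-27.2|x86_64|root@linux-210y|repo-oss|6387f8b3aeb6926614d1d678823e25e1266b5076|",
    "2017-09-27 11:39:53|install|dracut|044-21.7|x86_64|root@linux-210y|repo-oss|36de01742836205d426e946744525d96ab399b2b|",
]

for raw in HISTORY:
    record = parse_history_line(raw)
    if record is not None:
        print(record.timestamp, record.package, record.version, record.repo)
```

Listing the records in order makes it easy to see that the boot only recovered once udev and systemd were downgraded together with dracut.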
I'm also experiencing this issue. I've traced it specifically to an update from udev-228-29.1 to udev-228-32.2 (with corresponding systemd update). Updating to udev-228-35.1 also causes the issue, but the breakage first appears with udev-228-32.2.
This appears to be caused by a udev rules change. I notice that rules affecting NVMe were modified by the update. Here are steps showing that the udev rules are the culprit:
* Install Leap 42.3
* Install all updates EXCEPT udev/systemd
* Reboot, confirming normal system operation
* Save old version of /usr/lib/udev/rules.d/60-persistent-storage.rules
* Update udev/systemd to 228-32.2 (or 228-35.1) with zypper
* Reboot, confirming boot failure (with no crypto password prompt after Grub)
* Use rescue system to manually tweak /boot/initrd-4.4.87-25-default:
  * Remove /usr/lib/udev/rules.d/61-persistent-storage-compat.rules file added by update
  * Replace /usr/lib/udev/rules.d/60-persistent-storage.rules with saved copy
* Reboot, confirming normal system operation (crypto password prompt restored, normal boot occurs, etc.)
In other words, reverting only the two udev rules files in the initrd image is sufficient to "fix" the problem.
I'll add that this system is using UEFI w/Secure Boot and GPT partitioning, if that makes a difference.
Please let me know if I can provide any more information; I'm happy to help with udev logs or whatever is needed.
More information:
LVM isn't actually involved; a system with a simple partition for the root filesystem will have the same issue if you check the box in the YaST installer to encrypt the partition. I've edited the title of the bug accordingly.
Here's the full cause of the issue:
The udev file 60-persistent-storage.rules that ships with the YaST installer for 42.3 (attached to this bug as yast-installer-60-persistent-storage-rules.txt) has a rule (under the SCSI devices section, bizarrely) that creates an improperly named symlink to each NVMe partition:
KERNEL=="sd*|cciss*|nvme*", ENV{DEVTYPE}=="partition", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n"
The flaw in the rule is the $env{ID_BUS} token; the variable ID_BUS has never been set for NVMe partitions (for disks, yes; for *partitions*, no). This results in a symlink with a leading hyphen, like -TRNSN34098GGX_NVMe_TOSHIBA_1024GB_10AZR11Z5QADR-part2, being created under /dev/disk/by-id.
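To make the leading hyphen concrete, here is a minimal sketch that mimics how the SYMLINK template expands. It is a simplified model and not udev's actual implementation: it assumes an unset `$env{...}` variable expands to an empty string (which matches the symlink observed above) and that `%n` is the partition number. The helper name `expand_symlink` is hypothetical.

```python
import re


def expand_symlink(template: str, env: dict, partition_number: int) -> str:
    """Simplified model of udev's $env{KEY} and %n substitution (an assumption)."""
    def env_value(match: "re.Match") -> str:
        # An unset variable is modeled as an empty string.
        return env.get(match.group(1), "")

    expanded = re.sub(r"\$env\{(\w+)\}", env_value, template)
    return expanded.replace("%n", str(partition_number))


TEMPLATE = "disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n"

# ID_BUS is deliberately missing, as described for NVMe partitions.
nvme_partition_env = {
    "ID_SERIAL": "TRNSN34098GGX_NVMe_TOSHIBA_1024GB_10AZR11Z5QADR",
}

print(expand_symlink(TEMPLATE, nvme_partition_env, 2))
# disk/by-id/-TRNSN34098GGX_NVMe_TOSHIBA_1024GB_10AZR11Z5QADR-part2
```

With ID_BUS present (as it would be for a SATA partition, for example), the same template yields a normally prefixed name, which is why only the NVMe partitions end up with the odd symlink.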
There are two other symlinks created for the NVMe partition by other udev rules; unfortunately, the YaST installer picks up the misnamed one as Device ID 1, and this is what it puts in /etc/crypttab. (See yast-installer-device-id.png and etc-crypttab.txt, attached to this bug. A full listing of /dev/disk/by-id/ as it exists during the YaST installer's execution is attached as dev-disk-by-id.txt.)
Commit 63da94f (https://github.com/openSUSE/systemd/commit/63da94fcce059ac153be9b20657ae9fcc3b61e06) on July 3 modified the udev rule such that it no longer applied to NVMe devices:
KERNEL=="sd*|cciss*", ENV{DEVTYPE}=="partition", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n"
This means installing builds of udev containing this commit will cause the symlink placed in /etc/crypttab by the YaST installer to no longer be created, and the system will fail to boot.
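The effect of the KERNEL pattern change can be sketched with shell-style glob matching. This is only an approximation of udev's matching, under the assumption that `|` separates alternatives and each alternative is a glob; the kernel names `nvme0n1p2` and `sda2` are example values, not taken from the attached logs.

```python
from fnmatch import fnmatchcase


def kernel_matches(kernel_name: str, pattern: str) -> bool:
    """Return True if any '|'-separated glob alternative matches the name."""
    return any(fnmatchcase(kernel_name, alt) for alt in pattern.split("|"))


OLD_PATTERN = "sd*|cciss*|nvme*"   # rule shipped with the 42.3 installer
NEW_PATTERN = "sd*|cciss*"         # rule after commit 63da94f

for name in ("nvme0n1p2", "sda2"):
    print(name, kernel_matches(name, OLD_PATTERN), kernel_matches(name, NEW_PATTERN))
```

The NVMe partition matches only the old pattern, so after the update nothing creates the hyphen-prefixed symlink that /etc/crypttab still refers to, while SATA partitions match both and are unaffected.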
Workaround:
After installing openSUSE Leap 42.3, but before installing updates, edit /etc/crypttab and fix /dev/disk/by-id/-<whatever> to be /dev/disk/by-id/nvme-<whatever>. This change needs to be added to the initrd image, as well; either run dracut -f, or just install updates (when the new udev/systemd package is installed, zypper will automatically run dracut for you).
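The crypttab edit can also be scripted. The following is a rough sketch, not a tested tool: it assumes the device path is the second whitespace-separated field of each entry, and the mapping name and serial in the sample are made up for illustration. Back up /etc/crypttab before trying anything like this, and regenerate the initrd afterwards as described above.

```python
BAD_PREFIX = "/dev/disk/by-id/-"
GOOD_PREFIX = "/dev/disk/by-id/nvme-"


def fix_crypttab(text: str) -> str:
    """Rewrite hyphen-prefixed by-id device paths to the nvme- form."""
    fixed_lines = []
    for line in text.splitlines(keepends=True):
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # Only touch entries whose device field uses the misnamed symlink.
        if not is_comment and len(fields) >= 2 and fields[1].startswith(BAD_PREFIX):
            line = line.replace(BAD_PREFIX, GOOD_PREFIX, 1)
        fixed_lines.append(line)
    return "".join(fixed_lines)


# Hypothetical entry; the real one is in the attached etc-crypttab.txt.
SAMPLE = "cr_example /dev/disk/by-id/-EXAMPLE_SERIAL-part2 none none\n"

print(fix_crypttab(SAMPLE), end="")
```

Running the function on the contents of /etc/crypttab and writing the result back is equivalent to the manual edit; entries that already use a correct path are left unchanged.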
Thomas Altrock, can you please confirm that this workaround works for you?
Created attachment 742641 [details]
udev file 60-persistent-storage.rules used by the YaST installer
Created attachment 742642 [details]
Screenshot from YaST installer showing Device IDs of NVMe partition
Created attachment 742643 [details]
/etc/crypttab created by YaST installer
Created attachment 742644 [details]
/dev/disk/by-id/ listing while running YaST installer
I tested the workaround a few minutes ago and I can CONFIRM that it works!
@Jonathan: thanks for debugging and support!
@Jonathan thanks for sorting this out. It's actually a duplicate of bug 1063249.
*** This bug has been marked as a duplicate of bug 1063249 *** |
Created attachment 741780 [details]
output of journalctl
We did a fresh install of Leap 42.3 to a Samsung NVMe SSD (2TB 960 Pro M.2). The first boot after installation works fine. Applying all patches with 'zypper patch' results in a boot failure at the next startup: the prompt for the LUKS/crypt password is missing. After a timeout of approx. 180 seconds the system comes up in emergency mode with the following output:
dracut-initqueue[307]: Warning: dracut-initqueue timeout - starting timeout scripts
Sep 22 10:41:18 linux-6bio dracut-initqueue[307]: Warning: Could not boot.
Sep 22 10:41:18 linux-6bio dracut-initqueue[307]: Warning: /dev/mapper/system-root does not exist
Sep 22 10:41:18 linux-6bio dracut-initqueue[307]: Warning: /dev/system/root does not exist
Sep 22 10:41:18 linux-6bio dracut-initqueue[307]: Warning: /dev/system/swap does not exist
Sep 22 10:41:18 linux-6bio systemd[1]: Starting Setup Virtual Console...
Sep 22 10:41:18 linux-6bio systemd[1]: Started Setup Virtual Console.
Sep 22 10:41:18 linux-6bio systemd[1]: Starting Dracut Emergency Shell...
some debugging:
1) New installation (Server / Text-Mode), no changes to default configuration / package selection
2a) Partitioning: LVM-based proposal, ext4, no separate home
>> applying patch 'openSUSE-2017-847'
>> boot OK
2b) Partitioning: LVM-based proposal WITH CRYPT/LUKS, ext4, no separate home
>> applying patch 'openSUSE-2017-847'
>> boot FAILS!
3) Applying patch 'openSUSE-2017-847' also installs 'openSUSE-2017-950' and 'openSUSE-2017-1005'.
4) The boot fails only on encrypted NVMe drives. SATA drives are not affected.
5) Two different hardware systems (notebook / desktop PC) have been tested.
I have attached some files from the emergency shell.