Bug 1225882

Summary: Upgrade from Luks Opensuse MicroOS Desktop Fails
Product: [openSUSE] openSUSE Aeon
Reporter: Erik Peterssen <openexplore1455>
Component: Installation
Assignee: Eugenio Paolantonio <eugenio.paolantonio>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: eugenio.paolantonio, openexplore1455, rbrown
Version: Current
Target Milestone: ---
Hardware: 64bit
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
  Error Message After Inputing Password
  Tik Log
  Updated w Command
  New TIk Log
  Tik Log (Newest)
  Tik Log(Newest)
  Error Message After Inputing Password(Updated)
  20-mig

Description Erik Peterssen 2024-06-03 20:28:16 UTC
Created attachment 875284 [details]
Error Message After Inputing Password

Hey All,

When I try to upgrade to the latest release of openSUSE Aeon RC2, I get an error as soon as it asks me for the password after clicking through the first step.
It basically stops and shuts down after I type the password for the drive.

I can confirm it's the correct one, and I have attached the error message.
Let me know if you also need the tik.log.
Comment 1 Erik Peterssen 2024-06-03 20:31:16 UTC
Created attachment 875285 [details]
Tik Log

Attaching the tik log which hopefully can indicate something.
Comment 2 Richard Brown 2024-06-03 20:42:27 UTC
Eugenio, as the author of our LUKS handling code would you like a first look?
Comment 3 Eugenio Paolantonio 2024-06-03 23:03:17 UTC
Sure, thanks for the bug report and log.

The LUKS devices are closed after the migration, so if you don't see any other dialog between the password entry and the error, I think there might be something else wrong as well.

Looking at the log, it looks like the Legacy Aeon installation has been detected three times (while the installation process has been started ten times).
Is that correct?

I wonder if there is some race condition where the mapped unlocked device shows up too late, so it's not detected, and then for some reason something (btrfs?) keeps it busy.

These two issues might be separate.

Could you try executing in the terminal of the installer environment once the Welcome screen appears:

sudo -i
mount -o remount,rw /
btrfs property set / ro false
sed -i 's|probe_partitions $TIK_INSTALL_DEVICE|sleep 5;probe_partitions $TIK_INSTALL_DEVICE|' /usr/lib/tik/modules/pre/20-mig


Then, please report whether the migration module can reliably detect the encrypted partition (and, if it does, whether it still fails when calling luksClose).


Thanks!
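
A fixed `sleep 5` is enough to test the race-condition hypothesis, but a polling wait would show how long the device actually takes to appear. The following is a minimal sketch, assuming the thing to wait for is a path that shows up on disk (for example a `/dev/mapper` node); `wait_for_path` is a hypothetical helper and not part of tik.

```shell
# Hypothetical helper: poll until a path exists, or give up after a timeout.
# Usage: wait_for_path PATH [TIMEOUT_SECONDS]   (default timeout: 10)
wait_for_path() {
    local path="$1" timeout="${2:-10}" waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still missing after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Demo: a temporary file stands in for the device node and appears late.
demo_path="$(mktemp -u)"
( sleep 1; touch "$demo_path" ) &
if wait_for_path "$demo_path" 5; then
    echo "path appeared"
else
    echo "timed out"
fi
wait
rm -f "$demo_path"
```

Compared with a fixed delay, this returns as soon as the path exists and reports a failure explicitly if it never does.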
Comment 4 Erik Peterssen 2024-06-04 22:07:36 UTC
(In reply to Eugenio Paolantonio from comment #3)
> Sure, thanks for the bug report and log.
> 
> The luks devices closure happen after the migration, so if you don't see any
> other dialog in between the password entry and the error, I think there
> might be something else wrong as well.
> 
> Looking at the log, it looks like that the Legacy Aeon installation has been
> detected three times (while the installation process has been started ten
> times).
> Is that correct?
> 
> I wonder if there is some race condition where the mapped unlocked device
> shows up too late, so it's not detected, and then for some reason something
> (btrfs?) keeps it busy.
> 
> These two issues might be separate.
> 
> Could you try executing in the terminal of the installer environment once
> the Welcome screen appears:
> 
> sudo -i
> mount -o remount,rw /
> btrfs property set / ro false
> sed -i 's|probe_partitions $TIK_INSTALL_DEVICE|sleep 5;probe_partitions
> $TIK_INSTALL_DEVICE|' /usr/lib/tik/modules/pre/20-mig
> 
> 
> then, please report if the migration module can reliably detect the
> encrypted partition (and eventually, if it doesn't fail anymore when calling
> luksClose).
> 
> 
> Thanks!


Hey Eugenio,

I appreciate you taking a look at this, and I would say that is correct. I've been trying to see if I could get the setup to continue fully, but it still won't let me and produces that error. I'm quite new to this overall, but I really appreciate how approachable MicroOS has been.

I went ahead and executed the commands, but at the end it says that it can't find the module for some reason. See the attachment; it still produces the error.
One thing I have noticed, though, is that if I choose the option to "skip" typing in the password, the installation seems to continue. If I go through with that, would it potentially render the device unbootable, since it's encrypted?
Comment 5 Erik Peterssen 2024-06-04 22:10:15 UTC
Created attachment 875308 [details]
Updated w Command
Comment 6 Erik Peterssen 2024-06-04 22:14:09 UTC
Created attachment 875309 [details]
New TIk Log
Comment 7 Eugenio Paolantonio 2024-06-05 09:17:21 UTC
Hi Erik,

Sorry, I think that Bugzilla wrapped my command; the sed is actually one single line.


> sudo -i
> mount -o remount,rw /
> btrfs property set / ro false
> sed -i 's|probe_partitions $TIK_INSTALL_DEVICE|sleep 5;probe_partitions $TIK_INSTALL_DEVICE|' /usr/lib/tik/modules/pre/20-mig


If you skip, you can still install Aeon, but it will not migrate your existing installation, so you would lose the existing data (please take a backup anyway, just to be extra sure :D).
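
Since `sed -i` exits successfully even when the pattern matches nothing, it can help to confirm that the substitution actually landed before retrying the migration. A minimal sketch follows, run against a stand-in file; the file content is hypothetical and is not the real 20-mig, and `apply_delay_patch` is an illustrative name.

```shell
# Apply the delay patch from the comment above, then confirm it landed.
# Returns 0 only if the patched text is present afterwards.
apply_delay_patch() {
    file="$1"
    sed -i 's|probe_partitions $TIK_INSTALL_DEVICE|sleep 5;probe_partitions $TIK_INSTALL_DEVICE|' "$file" || return 1
    # -F: fixed-string match, so "$" is literal; -q: report via exit status only
    grep -qF 'sleep 5;probe_partitions $TIK_INSTALL_DEVICE' "$file"
}

# Demo on a stand-in file (hypothetical content, not the real module)
demo_file="$(mktemp)"
printf '%s\n' 'probe_partitions $TIK_INSTALL_DEVICE' > "$demo_file"
if apply_delay_patch "$demo_file"; then
    echo "patched"
else
    echo "pattern not found"
fi
rm -f "$demo_file"
```

On the installer, the same function would be called with `/usr/lib/tik/modules/pre/20-mig` after the root filesystem has been made writable.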
Comment 8 Erik Peterssen 2024-06-06 06:27:48 UTC
(In reply to Eugenio Paolantonio from comment #7)
> Hi Erik,
> 
> sorry I think that bugzilla wrapped my command; the sed is actually one
> single line.
> 
> 
> > sudo -i
> > mount -o remount,rw /
> > btrfs property set / ro false
> > sed -i 's|probe_partitions $TIK_INSTALL_DEVICE|sleep 5;probe_partitions $TIK_INSTALL_DEVICE|' /usr/lib/tik/modules/pre/20-mig
> 
> 
> If you skip, you can still install Aeon, but it will not migrate your
> existing installation so you would lose the existing data (please take a
> backup anyway just to be extra sure nevertheless :D).

Hey Eugenio,

Ah, thanks for clarifying. The command now works, and I noticed that it gets a step further in identifying the disk (it shows the drive loading screen), but ultimately it still crashes with the same error :/ Attaching the updated tik log, which hopefully can show what happens.
Comment 9 Erik Peterssen 2024-06-06 06:29:04 UTC
Created attachment 875348 [details]
Tik Log (Newest)
Comment 10 Erik Peterssen 2024-06-06 06:42:01 UTC
(In reply to Eugenio Paolantonio from comment #7)

Also, thanks for explaining that the installation would go through and that it won't brick the device. I will definitely take a backup of the existing data just to be extra sure :) I can wait a few days, though, if you want to test something else.
Comment 11 Eugenio Paolantonio 2024-06-06 20:22:58 UTC
Hi Erik, thanks for trying out what I suggested!

I think that a similar race condition might happen after unmounting the old partition. Apparently this has been observed already with btrfs [0].

Could you try running the same commands as before, but replacing the last command (sed) with this instead?

> sed -i 's|prun /usr/sbin/cryptsetup luksClose|sleep 5;prun /usr/sbin/cryptsetup luksClose|' /usr/lib/tik/modules/pre/20-mig


This is not going to be the actual fix, but if it works it might confirm my suspicions above.


Thanks!

[0] https://old.reddit.com/r/btrfs/comments/z8x15k/umount_returning_even_though_btrfscleaner_is/
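
If the device is only busy for a short time after unmounting, retrying the close a few times would be an alternative to a single fixed delay. The sketch below shows a generic retry loop; `retry` is a hypothetical helper, and this is not necessarily what the eventual fix in tik does.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, pausing
# DELAY_SECONDS between failed attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1    # every attempt failed
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

As an example, `retry 10 1 /usr/sbin/cryptsetup luksClose "$mapper_name"` would try for roughly ten seconds before giving up, where `mapper_name` is a placeholder for the name of the unlocked mapping.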
Comment 12 Erik Peterssen 2024-06-09 18:03:13 UTC
(In reply to Eugenio Paolantonio from comment #11)
> Hi Erik, thanks for trying out what I suggested!
> 
> I think that a similar race condition might happen after unmounting the old
> partition. Apparently this has been observed already with btrfs [0].
> 
> Could you try running the same commands as before, but replacing the last
> command (sed) with this instead?
> 
> > sed -i 's|prun /usr/sbin/cryptsetup luksClose|sleep 5;prun /usr/sbin/cryptsetup luksClose|' /usr/lib/tik/modules/pre/20-mig
> 
> 
> This is not going to be the actual fix, but if it works it might confirm my
> suspicions above.
> 
> 
> Thanks!
> 
> [0]
> https://old.reddit.com/r/btrfs/comments/z8x15k/
> umount_returning_even_though_btrfscleaner_is/

Hey Eugenio,

You are very welcome, and that's interesting.
I tried the new command, and it seems to spend longer identifying the disk before it unfortunately shows the same error message. Attaching the error message and the new tik log. Let me know if you find anything in it and if I can try something else before going through with a reinstall.
Comment 13 Erik Peterssen 2024-06-09 18:04:07 UTC
Created attachment 875389 [details]
Tik Log(Newest)
Comment 14 Erik Peterssen 2024-06-09 18:05:52 UTC
Created attachment 875390 [details]
Error Message After Inputing Password(Updated)
Comment 15 Eugenio Paolantonio 2024-06-09 21:19:42 UTC
Thanks again!

Please try the attached 20-mig file, which has some fixes. The related pull request is on GitHub [0].

To try it out, download the attachment, put it into the IGNITION partition on the Aeon install media, then reboot into the installation environment and, while the Aeon installer is still on the main Welcome screen, run:


> sudo -i
> btrfs property set / ro false
> cp /ignition/20-mig /usr/lib/tik/modules/pre/20-mig


[0] https://github.com/sysrich/tik/pull/29
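
When swapping in a test module like this, it may be worth keeping the original file and checking that the copy is complete. The following is a minimal sketch; `replace_with_backup` is a hypothetical helper, and the backup is stored outside the module directory on the assumption that every file in that directory might be loaded.

```shell
# Hypothetical helper: copy SRC over DEST, first saving any existing DEST
# into BACKUP_DIR, then verify the copy byte for byte.
# Usage: replace_with_backup SRC DEST BACKUP_DIR
replace_with_backup() {
    src="$1"; dest="$2"; backup_dir="$3"
    [ -f "$src" ] || return 1
    mkdir -p "$backup_dir" || return 1
    if [ -f "$dest" ]; then
        # Keep the previous version under BACKUP_DIR with an .orig suffix
        cp -p "$dest" "$backup_dir/$(basename "$dest").orig" || return 1
    fi
    cp "$src" "$dest" || return 1
    # cmp -s is silent and exits 0 only if both files are identical
    cmp -s "$src" "$dest"
}
```

For the steps above, this would be called as `replace_with_backup /ignition/20-mig /usr/lib/tik/modules/pre/20-mig /tmp/tik-backup` after the root filesystem has been made writable; `/tmp/tik-backup` is an arbitrary example location.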
Comment 16 Eugenio Paolantonio 2024-06-09 21:20:03 UTC
Created attachment 875391 [details]
20-mig
Comment 17 Richard Brown 2024-06-10 18:50:42 UTC
Fixed in tik 1.0.7, on its way to Factory.
Comment 18 OBSbugzilla Bot 2024-06-10 19:25:03 UTC
This is an autogenerated message for OBS integration:
This bug (1225882) was mentioned in
https://build.opensuse.org/request/show/1179827 Factory / tik
Comment 19 Richard Brown 2024-06-13 08:35:59 UTC
released in 0611