Bug 1127766 - offline upgrade aborted if snapper fails
Status: CONFIRMED
Classification: openSUSE
Product: openSUSE Distribution
Component: YaST2
Version: Leap 15.1
Hardware: Other
OS: Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assigned To: YaST Team
QA Contact: Jiri Srain
URL: https://trello.com/c/pMzijcRv
Reported: 2019-03-04 16:33 UTC by Olaf Hering
Modified: 2020-02-19 11:48 UTC
CC: 3 users

Description Olaf Hering 2019-03-04 16:33:13 UTC
Per bug#1048338, subvolume handling is (or was) suboptimal.
Today I tried to upgrade an existing subvolume to 15.1. I set it as the default subvolume and booted with 'upgrade=1'. Since I do not use snapshots, there is no /.snapshots. Something does not like that fact:


[Ruby] yast2/fs_snapshot.rb:317 Executing: "/usr/lib/snapper/installation-helper --step 5 --root-prefix=/mnt --snapshot-type pre --description before\ update --userdata "important=yes" --cleanup number"
[bash] ShellCommand.cc(shellcommand):78 reading failed
[bash] ShellCommand.cc(shellcommand):78 terminate called after throwing an instance of 'snapper::IOErrorException'
[bash] ShellCommand.cc(shellcommand):78   what():  open failed path:/mnt//.snapshots errno:2 (No such file or directory)
[Ruby] yast2/fs_snapshot.rb:323 Snapshot could not be created: /usr/lib/snapper/installation-helper --step 5 --root-prefix=/mnt --snapshot-type pre --description before\ update --userdata "important=yes" --cleanup number returned: {"exit"=>134, "stderr"=>"reading failed\nterminate called after throwing an instance of 'snapper::IOErrorException'\n  what():  open failed path:/mnt//.snapshots errno:2 (No such file or directory)\n", "stdout"=>""}
[Ruby] yast/builtins.rb:586 tostring builtin called on wrong type Class
[Ruby] yast/wfm.rb:253 Client /mounts/mp_0001/usr/share/YaST2/clients/inst_update_partition.rb failed with 'Filesystem snapshot could not be created.' (Yast2::SnapshotCreationFailed).

It seems the expectation is btrfs == snapshots.

Did all of our fresh btrfs installs have a /.snapshots? If yes, the assumption might be valid, except for bug#1048338.
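
The failing call path, reconstructed from the log above (a sketch only: the command string, the result hash format and the exception name are taken from the log; the surrounding Ruby is illustrative):

require "yast"
require "yast2/fs_snapshot"

# Run installation-helper via SCR, as fs_snapshot.rb does according to the log.
cmd = "/usr/lib/snapper/installation-helper --step 5 --root-prefix=/mnt " \
      "--snapshot-type pre --description before\\ update " \
      "--userdata \"important=yes\" --cleanup number"
out = Yast::SCR.Execute(Yast::Path.new(".target.bash_output"), cmd)

# exit 134 = SIGABRT: snapper aborted on the missing /mnt/.snapshots.
raise Yast2::SnapshotCreationFailed if out["exit"] != 0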
Comment 1 Stefan Hundhammer 2019-03-06 10:08:29 UTC
This "installation-helper" call tries to create a pre-update snapshot. It shouldn't do that in the first place. That this fails because there is no .snapshots directory is only a consequence of that.

Now I wonder why it even wants to create that snapshot.
Comment 2 Ancor Gonzalez Sosa 2019-03-13 07:07:43 UTC
We definitely need the installation logs to debug this scenario.

https://en.opensuse.org/openSUSE:Report_a_YaST_bug
Comment 4 Ancor Gonzalez Sosa 2019-03-15 14:50:24 UTC
(In reply to Stefan Hundhammer from comment #1)
> This "installation-helper" call tries to create a pre-update snapshot. It
> shouldn't do that in the first place. That this fails because there is no
> .snapshots directory is only a consequence of that.
> 
> Now I wonder why it even wants to create that snapshot.

From the attached logs:

Checking if Snapper is configured: "/usr/bin/snapper --no-dbus --root=%{root} list-configs | /usr/bin/grep "^root " >/dev/null" returned: {"exit"=>0, "stderr"=>"", "stdout"=>""}

Since the exit code of that command is zero, the installer concludes Snapper is configured and a snapshot can/must be performed.
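
In code, the check boils down to something like this (a sketch; the command string and the result hash format are taken from the log entry above, the Ruby around them is illustrative):

require "yast"

# "Is Snapper configured?": a zero exit code of this pipeline is taken to
# mean that a usable "root" config exists.
cmd = "/usr/bin/snapper --no-dbus --root=/mnt list-configs | " \
      "/usr/bin/grep \"^root \" >/dev/null"
out = Yast::SCR.Execute(Yast::Path.new(".target.bash_output"), cmd)
snapper_configured = out["exit"].zero? # true here, despite the missing /mnt/.snapshots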
Comment 5 Ancor Gonzalez Sosa 2019-03-15 15:02:30 UTC
(In reply to Ancor Gonzalez Sosa from comment #4)
> (In reply to Stefan Hundhammer from comment #1)
> > This "installation-helper" call tries to create a pre-update snapshot. It
> > shouldn't do that in the first place. That this fails because there is no
> > .snapshots directory is only a consequence of that.
> > 
> > Now I wonder why it even wants to create that snapshot.
> 
> From the attached logs:
> 
> Checking if Snapper is configured: "/usr/bin/snapper --no-dbus
> --root=%{root} list-configs | /usr/bin/grep "^root " >/dev/null" returned:
> {"exit"=>0, "stderr"=>"", "stdout"=>""}
> 
> Since the exit code of that command is zero, the installer concludes Snapper
> is configured and a snapshot can/must be performed.

Needless to say, %{root} is substituted by /mnt when the command is executed.

Olaf, since your system is so special, would you mind pasting the full output of this command?

/usr/bin/snapper --no-dbus --root=/mnt list-configs

Executed in the installation media, after the system to upgrade has been mounted.

====

If that's too hard, I guess it would be enough to just execute this in any of the systems that are installed into that filesystem:

/usr/bin/snapper --no-dbus list-configs

====

Or alternatively, do something like this with a rescue system:

...activate the volume...
mount -t btrfs /dev/sd240_crypt_lvm/sd240_btrfs /mnt
/usr/bin/snapper --no-dbus --root=/mnt list-configs

The first alternative would be the best, but whatever works for you... I guess you get the idea.
Comment 6 Olaf Hering 2019-03-18 10:52:26 UTC
0:esprimo:~ # /usr/bin/snapper --no-dbus --root=/mnt list-configs
Config | Subvolume
-------+----------
root   | /
0:esprimo:~ #
Comment 7 Olaf Hering 2019-03-18 10:55:06 UTC
0:esprimo:~ # cat /mnt/etc/snapper/configs/root 

# subvolume to snapshot
SUBVOLUME="/"

# filesystem type
FSTYPE="btrfs"

# users and groups allowed to work with config
ALLOW_USERS=""
ALLOW_GROUPS=""

# start comparing pre- and post-snapshot in background after creating
# post-snapshot
BACKGROUND_COMPARISON="yes"


# run daily number cleanup
NUMBER_CLEANUP="yes"

# limit for number cleanup
NUMBER_MIN_AGE="1800"
NUMBER_LIMIT="50"


# create hourly snapshots
TIMELINE_CREATE="yes"

# cleanup hourly snapshots after some time
TIMELINE_CLEANUP="yes"

# limits for timeline cleanup
TIMELINE_MIN_AGE="1800"
TIMELINE_LIMIT_HOURLY="10"
TIMELINE_LIMIT_DAILY="10"
TIMELINE_LIMIT_MONTHLY="10"
TIMELINE_LIMIT_YEARLY="10"


# cleanup empty pre-post-pairs
EMPTY_PRE_POST_CLEANUP="yes"

# limits for empty pre-post-pair cleanup
EMPTY_PRE_POST_MIN_AGE="1800"

0:esprimo:~ # ls -l /mnt/etc/snapper/configs/root 
-rw-r--r-- 1 root root 800 Mar 22  2016 /mnt/etc/snapper/configs/root

I'm sure I never used snapper. Perhaps this config was included in earlier packages?
Comment 8 Arvin Schnell 2019-03-19 10:57:08 UTC
(In reply to Ancor Gonzalez Sosa from comment #4)

> Since the exit code of that command is zero, the installer concludes Snapper
> is configured and a snapshot can/must be performed.

Which normally is fine. But the 'list-configs' command does not verify
that the configs are actually sound/available. Here that is not the
case.

(In reply to Olaf Hering from comment #7)

> I'm sure I never used snapper, perhaps a config as included in earlier pkgs?

The config file is not included in an RPM. Likely it was generated during
installation by YaST (using snapper).
Comment 9 Ancor Gonzalez Sosa 2019-03-19 11:59:39 UTC
(In reply to Arvin Schnell from comment #8)
> (In reply to Ancor Gonzalez Sosa from comment #4)
> 
> > Since the exit code of that command is zero, the installer concludes Snapper
> > is configured and a snapshot can/must be performed.
> 
> Which normally is fine. But the 'list-configs' command does not verify
> that the configs are actually sound/available. Here that is not the
> case.

And how can YaST check that?
Comment 10 Arvin Schnell 2019-03-19 16:13:50 UTC
(In reply to Ancor Gonzalez Sosa from comment #9)

> And how can YaST check that?

So far there is no snapper command to do that. Adding one might look
trivial at first but likely is not. Apart from checking mount points
one would also have to check mount flags (e.g. read-only, which btrfs
also sets during certain errors). Additionally, ACLs and SELinux might
make the checks more complicated.
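
A minimal check that would at least catch the failure mode of this report might look like this (an illustrative sketch, not an existing snapper or YaST API; as said above, it ignores mount flags, ACLs and SELinux):

require "yast"

# Illustrative only: refine the list-configs verdict by additionally
# requiring that the .snapshots directory exists under the target root.
def snapper_usable?(root_prefix)
  cmd = "/usr/bin/snapper --no-dbus --root=#{root_prefix} list-configs | " \
        "/usr/bin/grep \"^root \" >/dev/null"
  configured = Yast::SCR.Execute(Yast::Path.new(".target.bash_output"), cmd)["exit"].zero?
  configured && File.directory?(File.join(root_prefix, ".snapshots"))
end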

I suggest informing the user about the failure to create the snapshot
and letting the user decide whether to continue or abort.
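
On the YaST side that could look roughly like this (a sketch; the exception name and message are taken from the log in the description, Popup.YesNo is the standard YaST confirmation dialog, and create_pre_update_snapshot is an illustrative stand-in for the installation-helper call):

require "yast"
require "yast2/fs_snapshot"
Yast.import "Popup"

begin
  create_pre_update_snapshot # illustrative stand-in, not an existing API
rescue Yast2::SnapshotCreationFailed
  msg = "Filesystem snapshot could not be created.\n" \
        "Continue with the upgrade anyway?"
  raise unless Yast::Popup.YesNo(msg) # abort only if the user declines
end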
Comment 11 Ancor Gonzalez Sosa 2019-03-19 17:11:46 UTC
As a workaround, just delete the bogus snapper configuration; the upgrade process should then not try to create a snapshot and everything should work.
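
Concretely, something like this from the installation system, assuming snapper's delete-config copes with the missing /.snapshots (if it does not, removing /etc/snapper/configs/root and the "root" entry from SNAPPER_CONFIGS in /etc/sysconfig/snapper by hand should have the same effect):

/usr/bin/snapper --no-dbus --root=/mnt -c root delete-config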

For a better long-term solution, I have created a Trello card so that a more robust fix can be implemented as other tasks and priorities permit.