Bug 1174285 - /.snapshots/grub-snapshot.cfg does not contain entries for snapshots created with transactional-update
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Basesystem
Version: Current
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: Ignaz Forster
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-19 15:50 UTC by Andreas Prittwitz
Modified: 2020-09-13 12:28 UTC
CC List: 5 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
/.snapshots/grub-snapshot.cfg (186 bytes, text/plain)
2020-07-19 15:55 UTC, Andreas Prittwitz
/boot/grub2/grub.cfg (11.45 KB, text/plain)
2020-07-19 15:58 UTC, Andreas Prittwitz
/var/log/snapper.log snippet starting with the update from the comment before (18.93 KB, text/plain)
2020-07-20 16:43 UTC, Andreas Prittwitz
snapper log snippet stating "not read-only snapshots" (11.67 KB, text/plain)
2020-07-21 14:12 UTC, Andreas Prittwitz
support config tarball (2.68 MB, application/x-xz-compressed-tar)
2020-07-25 14:39 UTC, Andreas Prittwitz

Description Andreas Prittwitz 2020-07-19 15:50:28 UTC
This is on openSUSE Tumbleweed 20200716, using transactional-update on a _read-write_ btrfs filesystem, _not_ a transactional-server installation.

When updating the system with 

transactional-update -i cleanup dup

any new snapshot created by transactional-update is missing in /.snapshots/grub-snapshot.cfg, making it impossible to boot from a previously working snapshot.

When a snapshot is created manually with 

snapper create 

it correctly appears in that file and is bootable, so snapper is basically working correctly.

I can add an entry for an existing transactional-update snapshot by editing /.snapshots/grub-snapshot.cfg by hand. The manually added snapshot then shows up in the grub2 menu as a bootable snapshot and can be selected and booted.

When entering the following at the grub2 command line

ls /.snapshots

all available snapshots are seen, it is just that these snapshots do not get added to /.snapshots/grub-snapshot.cfg.

I cannot imagine that this behaviour is intentional.
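
The symptom described above can be checked without rebooting. The following is a minimal sketch, not part of snapper or transactional-update: the function name missing_snapshots is made up for illustration, and it assumes that an entry for snapshot N mentions the path "/.snapshots/N/" somewhere in grub-snapshot.cfg (the exact file format may differ).

```shell
#!/bin/sh
# List snapshot numbers found under a snapshots directory that are not
# mentioned in its grub-snapshot.cfg.
# ASSUMPTION: an entry for snapshot N contains the string "/.snapshots/N/".
missing_snapshots() {
    snap_dir=${1:-/.snapshots}
    cfg="$snap_dir/grub-snapshot.cfg"
    for d in "$snap_dir"/*/; do
        [ -d "$d" ] || continue
        num=$(basename "$d")
        # Only consider purely numeric directory names (snapshot numbers).
        case $num in
            ''|*[!0-9]*) continue ;;
        esac
        # Fixed-string search; a missing cfg file counts as "no entry".
        if ! grep -qF "/.snapshots/$num/" "$cfg" 2>/dev/null; then
            echo "$num"
        fi
    done
}
```

Calling missing_snapshots with no argument inspects /.snapshots; every number it prints is a snapshot directory without a matching entry in the file.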
Comment 1 Andreas Prittwitz 2020-07-19 15:55:05 UTC
Created attachment 839834 [details]
/.snapshots/grub-snapshot.cfg

/.snapshots> ls -l
total 4
drwxr-xr-x 1 root root  66 Jul 11 14:50 44
drwxr-xr-x 1 root root  66 Jul 18 19:15 45
-rw-r----- 1 root root 184 Jul 19 16:46 grub-snapshot.cfg
Comment 2 Andreas Prittwitz 2020-07-19 15:58:07 UTC
Created attachment 839835 [details]
/boot/grub2/grub.cfg
Comment 3 Andreas Prittwitz 2020-07-20 16:13:32 UTC
Today I made an update with

transactional-update -i cleanup dup

from 20200716 to 20200717 and observed the following.

- A new snapshot is created
- The new snapshot gets an entry with the correct number in

/.snapshots/grub-snapshot.cfg 

- After transactional-update is finished and the user is asked to reboot
the entry in

/.snapshots/grub-snapshot.cfg

is removed again, and the file then contains only the lines in the attached grub-snapshot.cfg
Comment 4 Andreas Prittwitz 2020-07-20 16:43:35 UTC
Created attachment 839865 [details]
/var/log/snapper.log snippet starting with the update from the comment before
Comment 5 Andreas Prittwitz 2020-07-20 16:49:21 UTC
snapper ls
  # | Type   | Pre # | Date                          | User | Cleanup | Description            | Userdata
----+--------+-------+-------------------------------+------+---------+------------------------+---------
 0  | single |       |                               | root |         | current                |         
44  | single |       | Sat 11 Jul 2020 14:41:26 CEST | root | number  | Snapshot Update of #43 |         
45  | single |       | Sat 18 Jul 2020 19:03:26 CEST | root |         | Snapshot Update of #44 |         
47* | single |       | Mon 20 Jul 2020 17:58:04 CEST | root |         | Snapshot Update of #45 |
Comment 6 Andreas Prittwitz 2020-07-21 14:12:43 UTC
Created attachment 839891 [details]
snapper log snippet stating "not read-only snapshots"

Here is another snippet from /var/log/snapper.log.

It mentions the snapshots created with transactional-update (47 and 45) but also has a

"not read-only snapshots"

concerning these two snapshots even though when I query these snapshots with

btrfs property list -ts

it says

btrfs property list -ts /.snapshots/45/
ro                  read-only status of a subvolume
btrfs property list -ts /.snapshots/47/
ro                  read-only status of a subvolume
Comment 7 Andreas Prittwitz 2020-07-23 16:54:28 UTC
I tested further by creating a snapshot with

snapper create

which created a read-only snapshot #55
and 

snapper create --read-write

which created - well - a read-write snapshot #56.

Snapshot #55 has an entry in

/.snapshots/grub-snapshot.cfg

and #56 does not, which seems to be correct.

Still, I do not understand why snapshots created with transactional-update first get an entry created in /.snapshots/grub-snapshot.cfg and then deleted again, when tu finishes.
Comment 8 Andreas Prittwitz 2020-07-25 14:39:40 UTC
Created attachment 840041 [details]
support config tarball
Comment 9 Ignaz Forster 2020-08-19 14:21:13 UTC
Confirming.

On a regular (successful) run transactional-update will call
* snapper create --print-number --userdata "transactional-update-in-progress=yes"
* btrfs subvolume set-default ${BTRFS_ID} ${SNAPSHOT_DIR}
* snapper modify -u "transactional-update-in-progress=" ${SNAPSHOT_ID}

Setting the new default subvolume in the second step seems to cause the snapshot entry in /.snapshots/grub-snapshot.cfg to disappear on the next snapper call - but only on read-write systems.

Arvin, is this intentional behaviour?
Comment 10 Arvin Schnell 2020-08-24 05:25:35 UTC
The file grub-snapshot.cfg is not generated by snapper directly but
by the script /usr/lib/snapper/plugins/grub. So the question goes to
Michael Chang.
Comment 11 Michael Chang 2020-08-24 10:43:21 UTC
(In reply to Andreas Prittwitz from comment #6)
> Created attachment 839891 [details]
> snapper log snippet stating "not read-only snapshots"

[snip]

> btrfs property list -ts /.snapshots/45/
> ro                  read-only status of a subvolume
> btrfs property list -ts /.snapshots/47/
> ro                  read-only status of a subvolume

It looks like you have to use 'get' to check the read-only property of a subvolume, whereas the 'list' subcommand only lists the properties available for an object, which can then be queried with 'get'. In addition, the path to be queried should be /.snapshots/<num>/snapshot, e.g.:

> btrfs property get -ts /.snapshots/45/snapshot

Would you please check the properties again? Thanks in advance.
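
To run that check for every snapshot at once, something like the following sketch could be used. The function name snapshot_ro_status is hypothetical; it relies only on `btrfs property get -ts <path> ro` printing a line of the form "ro=true" or "ro=false", as seen in this report.

```shell
#!/bin/sh
# Print "<snapshot number> <ro value>" for each <dir>/<num>/snapshot subvolume.
snapshot_ro_status() {
    snap_dir=${1:-/.snapshots}
    for sub in "$snap_dir"/*/snapshot; do
        [ -d "$sub" ] || continue
        num=$(basename "$(dirname "$sub")")
        # Query only the "ro" property of the subvolume.
        value=$(btrfs property get -ts "$sub" ro 2>/dev/null)
        # Strip the "ro=" prefix; report "unknown" if the query failed.
        value=${value#ro=}
        echo "$num ${value:-unknown}"
    done
}
```

Running snapshot_ro_status as root on the affected system should show at a glance which snapshots are read-write and would therefore be skipped by a plugin that lists only read-only snapshots.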
Comment 12 Andreas Prittwitz 2020-08-24 17:43:12 UTC
Sure. This is what I get:

btrfs property get -ts /.snapshots/65/snapshot
ro=false

btrfs property get -ts /.snapshots/66/snapshot
ro=false
Comment 13 Michael Chang 2020-08-25 04:55:52 UTC
(In reply to Andreas Prittwitz from comment #12)
> Sure. This is what I get:
> 
> btrfs property get -ts /.snapshots/65/snapshot
> ro=false
> 
> btrfs property get -ts /.snapshots/66/snapshot
> ro=false

So the newly created snapshots, which are later set as the new default, are NOT read-only, perhaps because the btrfs root filesystem was initially mounted read-write.

I was told to explicitly exclude read-write snapshots from booting (see bug 878528, comment 19) and thus implemented the grub snapper plugin to list only read-only snapshots. I think this requires more clarification, as it seems to be a design decision.

Hi Ignaz and Thorsten,

Would you please shed some light on this (apparent) design decision? Is it still applicable here? Thanks in advance.
Comment 14 Thorsten Kukuk 2020-08-25 05:22:03 UTC
(In reply to Michael Chang from comment #13)

> Would you please shed some light on this (apparent) design decision? Is it
> still applicable here? Thanks in advance.

transactional-update for a read-write root filesystem was always only for debugging, not for production.
Since it is not clear if and how long we can continue to "support" this, I wouldn't change anything here.
Comment 15 Andreas Prittwitz 2020-09-13 12:24:14 UTC
Hello guys, thanks for your input.

Firstly, let me tell you that I think that transactional-update is an outstanding piece of software. To me it is one of the best things since sliced bread and I would regret it if it came to the point where it would no longer be usable on a read-write system.

Having an "atomically-upgradable-hybrid-system" at hand - doing a dup with transactional-update, but still being able to install software with zypper or Yast the traditional way - is something that is hard to outmatch in terms of safety and usability.

Secondly, being someone who is installing Tumbleweed with transactional-update on all (read-write)-machines of friends and relatives for over 1.5 years now, I can say that I have never heard any complaints about transactional-update not working from anybody.

With that being said, let me get back to topic to try to solve this with your help.

1. I am not authorized to view bug #878528 and thus do not know what it is about.

2.a Quote from Thorsten Kukuk:
"transactional-update for a read-write root filesystem was always only for debugging, not for production."

This is new to me.

According to https://kubic.opensuse.org/documentation/transactional-update-guide/tu-setup.html

"3.1. Read-only file system

transactional-update is typically used on a read-only root file system, even though it also _supports_ regular read-write systems."

and man transactional-update says:

"Standalone Commands
       rollback [number]
Sets the default root file system. On a read-only system the root file system is set directly using btrfs. On _read-write_ systems snapper(8) rollback is called."
           
which rather indicates that transactional-update's usage on read-write systems is encouraged as it should be, especially as there was work put into it to make the rollback feature work on read-write systems.

2.b Quote from Thorsten Kukuk:
"Since it is not clear if and how long we can continue to "support" this, I wouldn't change anything here."

Could you elaborate further, please? It is not clear to me what your statement means for the intended future of transactional-update in general and its future on read-write systems in particular. 
Also, leaving /usr/lib/snapper/plugins/grub as it is right now, to me seems questionable at best, see 4.

3. I am not a dev but I have some proposals to make that may pique your interest by providing possible solutions for keeping transactional-update working on read-write systems.
    
    3.a Proposal #1
    
    Change /usr/etc/transactional-update.conf and /usr/lib/snapper/plugins/grub as follows:
    
    Add a new variable to /usr/etc/transactional-update.conf like this
    
    # This is for experienced users only who want to use transactional-update on a read-write root filesystem and 
    # know what they are doing. 
    # Set the system-role variable accordingly. It is either transactional-server (read-only root) or read-write root for others
    # Default is "1" for transactional-server on read-only root
    # Valid values: 0 1
    #TRANSACTIONAL_SERVER=1

    Change lines 138-141 in /usr/lib/snapper/plugins/grub as follows
    
    # skip missing directories; skip read-write snapshots only on a tu-server
    if [ ! -d "${s_dir}" ] || { [ -w "$snapshot" ] && [ "$TRANSACTIONAL_SERVER" = "1" ]; }; then
       continue
    fi
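
The intended condition of Proposal #1 can be tried out in isolation with a small helper. This is only a sketch: the function name should_skip and its third argument are hypothetical, and the real plugin would still have to read TRANSACTIONAL_SERVER from the configuration file. Note that the comparison needs spaces around `=`; written as `"$x"="1"`, the test sees a single non-empty string and is always true.

```shell
#!/bin/sh
# should_skip INFO_DIR SNAPSHOT_PATH TRANSACTIONAL_SERVER
# Returns 0 (skip the snapshot) when the directory is missing, or when the
# snapshot is writable and the system is flagged as a transactional server.
# Returns 1 (list the snapshot) otherwise.
should_skip() {
    s_dir=$1
    snapshot=$2
    transactional_server=${3:-1}   # hypothetical flag, defaulting to 1
    [ -d "$s_dir" ] || return 0
    if [ -w "$snapshot" ] && [ "$transactional_server" = "1" ]; then
        return 0
    fi
    return 1
}
```

With the flag set to 0, a writable snapshot is no longer skipped, which is the behaviour the proposal asks for on read-write systems.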

    3.b Proposal #2
    
    The same as 3.a minus the change in /usr/etc/transactional-update.conf by querying the system-role in 
    /usr/lib/snapper/plugins/grub.
    
    E.g. for lines 138-141 (SYSTEM_ROLE stands in for however the system role would actually be queried)
    if [ ! -d "${s_dir}" ] || { [ -w "$snapshot" ] && [ "$SYSTEM_ROLE" = "transactional-server" ]; }; then
        continue
    fi
    
By choosing one of the two proposals above (or a combination of the two) one more problem is addressed, see 4.

4. According to man snapper, snapper allows creating read-write snapshots. With the current state of /usr/lib/snapper/plugins/grub, users who decide to create read-write snapshots do not have them entered in /.snapshots/grub-snapshot.cfg and thus cannot boot from them.

Afaik, it is not stated anywhere that read-write snapshots are generally not to be included in /.snapshots/grub-snapshot.cfg and only read-only snapshots are, or that no one should be booting from a read-write snapshot. In the current state of the grub plugin, read-write snapshots are unusable and good for nothing.

There also is no disadvantage for an experienced user in creating a read-write snapshot and being able to boot from it.

If there were any disadvantages or damage to be expected because of creating/using/booting from read-write snapshots, then the correct way would be to disable this feature in snapper or at least warn about it.
Having /usr/lib/snapper/plugins/grub in a state that generally cripples and restricts snapper's functionality for all users, even those not using transactional-update, is not the right approach.

Generally speaking, I do not think that this is the way it is supposed to work. The proposals I made can help in this regard, too.

5. Proposal #3

If the above proposals are not to anybody's liking, then what about setting all older snapshots ro after transactional-update has finished on a read-write system and before /usr/lib/snapper/plugins/grub is called, so that these older snapshots show up in grub-snapshot.cfg?

After a transactional dup/up has taken place there is no need to have the new default snapshot in /.snapshots/grub-snapshot.cfg, so it can stay read-write, but the older ones should be listed there, because these are the ones a user will be trying to reboot from in case the new default snapshot does not work.
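
Proposal #3 could be sketched roughly as follows. This is an illustration only, not how transactional-update works today: the function name mark_older_readonly is made up, the number of the new default snapshot is passed in by the caller, and whether the grub plugin regenerates grub-snapshot.cfg afterwards is an assumption that would need checking. It uses `btrfs property set -ts <path> ro true` to flip the read-only flag.

```shell
#!/bin/sh
# mark_older_readonly CURRENT_NUMBER [SNAPSHOTS_DIR]
# Set the "ro" property on every snapshot except the one given as
# CURRENT_NUMBER (the new default snapshot, which stays read-write).
mark_older_readonly() {
    current=$1
    snap_dir=${2:-/.snapshots}
    for sub in "$snap_dir"/*/snapshot; do
        [ -d "$sub" ] || continue
        num=$(basename "$(dirname "$sub")")
        if [ "$num" = "$current" ]; then
            continue
        fi
        btrfs property set -ts "$sub" ro true
    done
}
```

For the snapper listing in comment 5, the call would be mark_older_readonly 47, leaving #47 writable and marking #44 and #45 read-only.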

Thanks for your time and good work.