Bugzilla – Bug 1221405
plasma5-workspace -> plasma6-workspace upgrade kills running session
Last modified: 2024-03-25 13:42:31 UTC
When upgrading to TW snapshot 20240311, plasma5-workspace will be replaced by plasma6-workspace due to Provides/Obsoletes.

The issue is caused by plasma5-workspace's %preun script:

    %systemd_user_preun plasma-gmenudbusmenuproxy.service plasma-kcminit-phase1.service ...

which expands to

    if [ $1 -eq 0 ] && [ -x /usr/lib/systemd/systemd-update-helper ]; then
        # Package removal, not upgrade
        /usr/lib/systemd/systemd-update-helper remove-user-units plasma-gmenudbusmenuproxy.service plasma-kcminit-phase1.service plasma-kcminit.service \
            plasma-krunner.service plasma-ksmserver.service plasma-ksplash-ready.service plasma-plasmashell.service \
            plasma-xembedsniproxy.service plasma-baloorunner.service plasma-restoresession.service plasma-ksplash.service || :
    fi

This call results in an unconditional call to "systemctl --user disable --now" calls for *all* units for *all* sessions, terminating the session immediately.

This is critical in the case that "zypper dup" is run as part of the session, as it will be killed in the middle of the transaction, leaving the system in a partially upgraded state.

Question is how to fix this retroactively. It's not possible to remove %preun of the old plasma5-workspace package anymore. Is there a workaround we can apply in plasma6-workspace's %post, which runs before %preun of the old package?
(In reply to Fabian Vogt from comment #0)
> This call results in an unconditional call to "systemctl --user disable
> --now" calls for *all* units for *all* sessions, terminating the session
> immediately.

This crude termination is an already fixed bug, right?

> Is there a workaround we can apply in plasma6-workspace's %post, which runs
> before %preun of the old package?

May add a drop-in with RefuseManualStop=yes to the scope unit containing the updating zypper? (Perhaps the very same one that executes the scriptlets.)
(In reply to Michal Koutný from comment #1)
> (In reply to Fabian Vogt from comment #0)
> > This call results in an unconditional call to "systemctl --user disable
> > --now" calls for *all* units for *all* sessions, terminating the session
> > immediately.
>
> This crude termination is an already fixed bug, right?

No, how? That's what this bug report is about.

> > Is there a workaround we can apply in plasma6-workspace's %post, which runs
> > before %preun of the old package?
>
> May add a drop-in with RefuseManualStop=yes to the scope unit containing the
> updating zypper? (Perhaps the very same one that executes the scriptlets.)

Do you mean actually to the scope unit which is closest to the current process? Not sure whether there is any, it's possible that an application was launched in a way that it didn't get one. Applications usually get one, but there are ways which bypass this...

Or do you mean plasma6-workspace could set RefuseManualStop=yes on the service files shipped by the new package and do a systemd-update-helper user-reload in %post to make it effective before the %preun? That might work, but I'm not sure whether that breaks anything. Worth a try though. It could also be added as a transient dropin in /run and deleted again in %posttrans.
(In reply to Fabian Vogt from comment #2)
> No, how? That's what this bug report is about.

I understand that stopping all units of all sessions (per your explanation) is a mistaken action and not all should have been stopped.
And I thought this bug is about saving users who would still have the old version of the package with this excessive action. (And that the new package has a better targeted action in its %postun.)

Maybe I misunderstood the problem here.

> Do you mean actually to the scope unit which is closest to the current
> process?

Yes. Assuming the current process is in the same unit as zypper.

> Not sure whether there is any, it's possible that an application
> was launched in a way that it didn't get one. Applications usually get one,
> but there are ways which bypass this...

Every process is in a unit. It should work with any application or non-application unit.

> Or do you mean plasma6-workspace could set RefuseManualStop=yes on the
> service files shipped by the new package and do a systemd-update-helper
> user-reload in %post to make it effective before the %preun?

No, nothing permanent. I meant a runtime drop-in placed in new:%post before old:%postun. I didn't think too much about cleanup (reboot obviously). But %posttrans as you write could manage to do that.
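
For illustration, a rough sketch of the runtime drop-in idea discussed above. This is not the scriptlet from any submitted request. The function names and the drop-in file name are made up for this example, and the directory is passed as a parameter so the helpers can be tried outside of a package transaction. For user units, the runtime location would presumably be /run/systemd/user, and which units to guard is left to the caller.

```shell
#!/bin/sh
# Sketch only; DROPIN_NAME is a hypothetical file name chosen for this example.
DROPIN_NAME="zz-upgrade-guard.conf"

# Create "<base>/<unit>.d/<DROPIN_NAME>" containing RefuseManualStop=yes
# for every unit given after the base directory.
add_stop_guard() {
    base="$1"; shift
    for unit in "$@"; do
        mkdir -p "$base/$unit.d" || return 1
        printf '[Unit]\nRefuseManualStop=yes\n' > "$base/$unit.d/$DROPIN_NAME" || return 1
    done
}

# Undo add_stop_guard. rmdir only succeeds on an empty directory, so
# drop-ins created by anything else are left alone.
remove_stop_guard() {
    base="$1"; shift
    for unit in "$@"; do
        rm -f "$base/$unit.d/$DROPIN_NAME"
        rmdir "$base/$unit.d" 2>/dev/null || :
    done
}
```

In a spec file, the first helper would run in the new package's %post (followed by a reload of the user managers, e.g. the "systemd-update-helper user-reload" mentioned earlier) and the second one in %posttrans. Whether the guard takes effect in time depends on the reload reaching every running user manager before the old %preun runs.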
(In reply to Michal Koutný from comment #1)
> (In reply to Fabian Vogt from comment #0)
> > This call results in an unconditional call to "systemctl --user disable
> > --now" calls for *all* units for *all* sessions, terminating the session
> > immediately.
>
> This crude termination is an already fixed bug, right?

It was not fixed yesterday around 6PM when I updated Tumbleweed. The desktop session running the update got terminated and I had to continue from the framebuffer console because KDE/Plasma was left in an unusable state by the crashed update.
Have my sympathies.

What I meant by "crude termination" is fixed -- the situation will not happen with another update of plasma6-workspace? I.e. the flawed scriptlet is only in plasma5-workspace?
*** Bug 1221524 has been marked as a duplicate of this bug. ***
(In reply to Fabian Vogt from comment #0)
> This call results in an unconditional call to "systemctl --user disable
> --now" calls for *all* units for *all* sessions, terminating the session
> immediately.

So one of the previously listed services contains the process leader of the terminated session ?

How is plasma6-workspace supposed to replace plasma5-workspace ?

Do the 2 versions share the exact same set of units and therefore starting plasma6-workspace is basically a nop as the services are already running when plasma6-workspace is started ? (but that would mean that the old processes from plasma5 would be running while the files from plasma6 are installed...).

Or does it exist a smarter migration path where the session is migrated somehow to the set of processes belonging to the new version of plasma-workspace ?
(In reply to Michal Koutný from comment #3)
> (In reply to Fabian Vogt from comment #2)
> > No, how? That's what this bug report is about.
>
> I understand that stopping all units of all sessions (per your explanation)
> is a mistaken action and not all should have been stopped.
> And I thought this bug is about saving users who would still have the old
> version of the package with this excessive action. (And that the new package
> has a better targeted action in its %postun.)

Yes, it is. It affects all users on Leap and TW < 20240311.

> Maybe I misunderstood the problem here.
>
> > Do you mean actually to the scope unit which is closest to the current
> > process?
>
> Yes. Assuming the current process is in the same unit as zypper.
>
> > Not sure whether there is any, it's possible that an application
> > was launched in a way that it didn't get one. Applications usually get one,
> > but there are ways which bypass this...
>
> Every process is in a unit. It should work with any application or
> non-application unit.

Yes. Depending how and where zypper was launched, it could be either a scope unit created by plasma, a child of some app-*.service autostart unit or some other service which can launch applications.

> > Or do you mean plasma6-workspace could set RefuseManualStop=yes on the
> > service files shipped by the new package and do a systemd-update-helper
> > user-reload in %post to make it effective before the %preun?
>
> No, nothing permanent. I meant a runtime drop-in placed in new:%post before
> old:%postun. I didn't think too much about cleanup (reboot obviously). But
> %posttrans as you write could manage to do that.

Perfect. I got that to work and submitted it as https://build.opensuse.org/request/show/1158726.

(In reply to Michal Koutný from comment #5)
> Have my sympathies.
>
> What I meant by "crude termination" is fixed -- the situation will not
> happen with another another update of plasma6-workspace? I.e. the flawed
> scriptlet is only in plasma5-workspace?

Yes.

(In reply to Franck Bui from comment #7)
> (In reply to Fabian Vogt from comment #0)
> > This call results in an unconditional call to "systemctl --user disable
> > --now" calls for *all* units for *all* sessions, terminating the session
> > immediately.
>
> So one of the previously listed services contains the process leader of the
> terminated session ?

Yes.

> How is plasma6-workspace supposed to replace plasma5-workspace ?
>
> Do the 2 versions share the exact same set of units and therefore starting
> plasma6-workspace is basically a nop as the services are already running
> when plasma6-workspace is started ? (but that would mean that the old
> processes from plasma5 would be running while the files from plasma6 are
> installed...).

Correct.

> Or does it exist a smarter migration path where the session is migrated
> somehow to the set of processes belonging to the new version of
> plasma-workspace ?

No. Plasma 5 stays running until logout or reboot. This works fine, as files are not overwritten but replaced, thus running processes are unaffected by the upgrade. Obviously it's recommended to do as little as possible during/after an upgrade, but it's supposed to work.
While this bug report is only about the specific issue with plasma5 -> plasma6, it's a deeper issue with the systemd macros and package renames:

    # rpm --eval "%{systemd_user_preun foo.service}"

    if [ $1 -eq 0 ] && [ -x /usr/lib/systemd/systemd-update-helper ]; then
        # Package removal, not upgrade
        /usr/lib/systemd/systemd-update-helper remove-user-units foo.service || :
    fi

    # rpm --eval "%{systemd_preun foo.service}"
    :
    if [ $1 -eq 0 ] && [ -x /usr/lib/systemd/systemd-update-helper ]; then
        # Package removal, not upgrade
        /usr/lib/systemd/systemd-update-helper remove-system-units foo.service || :
    fi

In the context of %preun, "$1" shows how many of the current package name are installed after this part of the transaction. In the case of package renames, this is 0 even though it's an upgrade from a user and packager PoV.

What could be done instead is to check in %postun whether the .service file still exists and if not, run the removal actions.

For Plasma units it would probably make sense to just remove the %preun scripts completely, but I'm not sure whether that's correct for Plasma or in general.
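
A minimal sketch of the "check in %postun whether the .service file still exists" idea, for illustration only. The helper name units_gone is hypothetical, and the unit directory is a parameter so the logic can be tried without rpm. It is an assumption, not something verified here, that the removal actions still behave sensibly once the unit file is gone.

```shell
#!/bin/sh
# Print every unit from the argument list whose file no longer exists in
# unit_dir. After a package rename the replacement package still ships the
# files, so nothing is printed; after a real removal the units are listed.
units_gone() {
    unit_dir="$1"; shift
    for unit in "$@"; do
        [ -e "$unit_dir/$unit" ] || printf '%s\n' "$unit"
    done
}

# Possible use in a %postun scriptlet (untested assumption):
#   gone=$(units_gone /usr/lib/systemd/user foo.service)
#   [ -z "$gone" ] || /usr/lib/systemd/systemd-update-helper remove-user-units $gone || :
```

The point of this variant is that the decision no longer depends on "$1", which reports 0 for a rename, but on whether some installed package still provides the unit file.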
Would DISABLE_STOP_ON_REMOVAL="yes" in /etc/sysconfig/services mitigate this?
(In reply to Fabian Vogt from comment #9)
> For Plasma units it would probably make sense to just remove the %preun
> scripts completely, but I'm not sure whether that's correct for Plasma or in
> general.

For actual services, stopping on removal make sense to me, even if there is a package rename. I think the problem here is that these are not real services, they are parts of a session and KDE abuses the service mechanism, presumably to get separate systemd cgroups for these processes. (Though one could argue that the issue is with systemd not allowing to subdivide a session scope.)

So these "services" should probably not be stopped, just like an update of Plasma shouldn't restart them.
(In reply to Aaron Puchert from comment #10)
> Would DISABLE_STOP_ON_REMOVAL="yes" in /etc/sysconfig/services mitigate this?

IIRC only DISABLE_RESTART_ON_UPDATE is still used by %service_del_postun.

(In reply to Aaron Puchert from comment #11)
> (In reply to Fabian Vogt from comment #9)
> > For Plasma units it would probably make sense to just remove the %preun
> > scripts completely, but I'm not sure whether that's correct for Plasma or in
> > general.
>
> For actual services, stopping on removal make sense to me, even if there is
> a package rename.

Why? I can't think of a single case where this makes sense. A package rename is simply not removal and should be as close to a noop as possible.

> I think the problem here is that these are not real
> services, they are parts of a session and KDE abuses the service mechanism,

Plasma follows https://systemd.io/DESKTOP_ENVIRONMENTS/, like other DEs. Previously the whole session was effectively part of display-manager.service.

> presumably to get separate systemd cgroups for these processes. (Though one
> could argue that the issue is with systemd not allowing to subdivide a
> session scope.)

systemd allows scope creation, Plasma already does that for launching of applications.

> So these "services" should probably not be stopped, just like an update of
> Plasma shouldn't restart them.
(In reply to Fabian Vogt from comment #12)
> (In reply to Aaron Puchert from comment #10)
> > Would DISABLE_STOP_ON_REMOVAL="yes" in /etc/sysconfig/services mitigate this?
>
> IIRC only DISABLE_RESTART_ON_UPDATE is still used by %service_del_postun.

I can't find any use for it, so that's probably true. Was just wondering why the setting is still there.

> (In reply to Aaron Puchert from comment #11)
> > (In reply to Fabian Vogt from comment #9)
> > > For Plasma units it would probably make sense to just remove the %preun
> > > scripts completely, but I'm not sure whether that's correct for Plasma or in
> > > general.
> >
> > For actual services, stopping on removal make sense to me, even if there is
> > a package rename.
>
> Why? I can't think of a single case where this makes sense. A package rename
> is simply not removal and should be as close to a noop as possible.

I'll admit, there are lots of scenarios one might be thinking about. What I was thinking about is where the service is also renamed along with the package.

> > I think the problem here is that these are not real
> > services, they are parts of a session and KDE abuses the service mechanism,
>
> Plasma follows https://systemd.io/DESKTOP_ENVIRONMENTS/, like other DEs.

Yeah, I'm aware of that. But the fact that there is now an official document for it doesn't change my view that systemd services are a bad fit for most parts of a session. The first section already points out an obvious limitation of this, because there can only be one user@.service per user, and hence only one session. (Not saying that I need multiple graphical sessions, but it's telling that this is impossible with such a design.)

This is slightly off-topic, I'm just saying that these are not services in any conventional sense. Or if they are, then "service" has just become a meaningless word. (If everything is a service, then what does being a service mean?)

So whatever we do for services has no bearing here, and should be reevaluated separately.

> Previously the whole session was effectively part of display-manager.service.

That must have been a long time ago. In my earliest memories, the whole session was under a session-*.scope. That should have been the case since the introduction of systemd-logind. At least parts of the session still are there.

> > presumably to get separate systemd cgroups for these processes. (Though one
> > could argue that the issue is with systemd not allowing to subdivide a
> > session scope.)
>
> systemd allows scope creation, Plasma already does that for launching of
> applications.

It does, but not under session-*.scope. Generally I think scopes can't be nested.
(In reply to Aaron Puchert from comment #13)
> (In reply to Fabian Vogt from comment #12)
> > (In reply to Aaron Puchert from comment #10)
> > > Would DISABLE_STOP_ON_REMOVAL="yes" in /etc/sysconfig/services mitigate this?
> >
> > IIRC only DISABLE_RESTART_ON_UPDATE is still used by %service_del_postun.
>
> I can't find any use for it, so that's probably true. Was just wondering why
> the setting is still there.
>
> > (In reply to Aaron Puchert from comment #11)
> > > (In reply to Fabian Vogt from comment #9)
> > > > For Plasma units it would probably make sense to just remove the %preun
> > > > scripts completely, but I'm not sure whether that's correct for Plasma or in
> > > > general.
> > >
> > > For actual services, stopping on removal make sense to me, even if there is
> > > a package rename.
> >
> > Why? I can't think of a single case where this makes sense. A package rename
> > is simply not removal and should be as close to a noop as possible.
>
> I'll admit, there are lots of scenarios one might be thinking about. What I
> was thinking about is where the service is also renamed along with the
> package.

Right, that's a different beast. Here services moved (unmodified) between packages.

> > > I think the problem here is that these are not real
> > > services, they are parts of a session and KDE abuses the service mechanism,
> >
> > Plasma follows https://systemd.io/DESKTOP_ENVIRONMENTS/, like other DEs.
>
> Yeah, I'm aware of that. But the fact that there is now an official document
> for it doesn't change my view that systemd services are a bad fit for most
> parts of a session. The first section already points out an obvious
> limitation of this, because there can only be one user@.service per user,
> and hence only one session. (Not saying that I need multiple graphical
> sessions, but it's telling that this is impossible with such a design.)

That was already broken by the introduction of a single session dbus (/run/user/UID/bus). In the past, Plasma always started a session-specific dbus-daemon but that caused other issues. (Also kind of OT here)

> This is slightly off-topic, I'm just saying that these are not services in
> any conventional sense. Or if they are, then "service" has just become a
> meaningless word. (If everything is a service, then what does being a
> service mean?)
>
> So whatever we do for services has no bearing here, and should be
> reevaluated separately.
>
> > Previously the whole session was effectively part of display-manager.service.
>
> That must have been a long time ago. In my earliest memories, the whole
> session was under a session-*.scope. That should have been the case since
> the introduction of systemd-logind. At least parts of the session still are
> there.

This is still the case, but the scope unit's main process is managed by the display-manager.

> > > presumably to get separate systemd cgroups for these processes. (Though one
> > > could argue that the issue is with systemd not allowing to subdivide a
> > > session scope.)
> >
> > systemd allows scope creation, Plasma already does that for launching of
> > applications.
>
> It does, but not under session-*.scope. Generally I think scopes can't be
> nested.

Note that systemd system and systemd user sides are different. On the system side, there are session-SESSIONID.scope and user@UID.service. On the user side, there are plasma-plasmashell.service, app-yakuake@autostart.service, app-org.kde.konsole-814d1bdc13124a13981dfbf562b77606.scope, etc.
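
To see which of these units a given process (for example the zypper running the upgrade) ended up in, one option is to look at its cgroup path. The helper below is a small sketch with a made-up name. It assumes the unified cgroup hierarchy (a single "0::/..." line in /proc/PID/cgroup) and that the process sits directly in the unit's cgroup rather than in a delegated sub-cgroup.

```shell
#!/bin/sh
# Print the last path component of a cgroup v2 entry as found in
# /proc/<pid>/cgroup, which under the assumptions above is the unit name.
unit_of_cgroup_line() {
    path="${1#0::}"               # strip the "0::" prefix of the unified hierarchy
    printf '%s\n' "${path##*/}"   # keep only the last path component
}

# For the current shell (the result depends on how it was started):
#   unit_of_cgroup_line "$(cat /proc/self/cgroup)"
```

A shell started from a terminal emulator launched by Plasma would typically report an app-*.scope, while one started on a text console would report a session-*.scope.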
Workaround is in TW for some time now, so moving the bug over to systemd to hopefully find a way to address the root cause.