Bug 802525 - dbus hung on rtkit upgrade
Summary: dbus hung on rtkit upgrade
Status: RESOLVED FIXED
Duplicates: 802762 803364 842199 842205 845062
Alias: None
Product: openSUSE 13.1
Classification: openSUSE
Component: Basesystem
Version: RC 1
Hardware: x86-64 SUSE Other
Priority: P2 - High, Severity: Major
Target Milestone: RC 1
Assignee: E-mail List
QA Contact: E-mail List
URL:
Whiteboard: GOLD
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-07 07:07 UTC by Bernhard Wiedemann
Modified: 2014-05-26 15:28 UTC
CC List: 17 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
coolo: SHIP_STOPPER-


Attachments
processes.txt (914 bytes, text/plain), 2013-02-13 03:00 UTC, Hans Petter Jansson
trace.txt (2.55 KB, text/plain), 2013-02-13 03:05 UTC, Hans Petter Jansson
dbus-fall-back-to-old-run-directory.patch (9.10 KB, patch), 2013-02-21 19:22 UTC, Hans Petter Jansson
dbus-fall-back-to-old-run-directory.patch (9.37 KB, patch), 2013-02-22 17:16 UTC, Hans Petter Jansson
dbus-fall-back-to-old-run-directory.patch (11.70 KB, patch), 2013-07-10 01:26 UTC, Hans Petter Jansson

Description Bernhard Wiedemann 2013-02-07 07:07:43 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0

dbus hung on rtkit upgrade from Beta

Reproducible: Always

Steps to Reproduce:
1. have 12.3-Beta
2. run zypper up

Actual Results:  
installation of rtkit blocked forever in its %post script,
with ps axf showing a dbus-send --system --type=method_call --dest=org.freedesktop.DBus / org.freedesktop.DBus.ReloadConfig
and firefox would not start either then.
rcdbus restart made it unblock

Expected Results:  
upgrade should be possible without user-interaction
Comment 1 Rich Coe 2013-02-08 17:31:29 UTC
*** Bug 802762 has been marked as a duplicate of this bug. ***
Comment 2 Bernhard Wiedemann 2013-02-09 20:51:38 UTC
I found interesting openQA results:
http://openqa.opensuse.org/results/openSUSE-12.3-NET-i586-Build0034-12.1gnome32zdup

shows that upgrade from 12.1 with zypper dup also is affected now
in addition to 12.2 and 12.3-Beta

and
http://openqa.opensuse.org/results/openSUSE-DVD-i586-Build0370-12.1gnome32zdup

showing that it worked with Factory repos on 2013-02-02 with 
rtkit-0.11_git201205151338-2.3.i586
dbus-1-1.6.8-1.2.i586

so I wonder what changed recently... last thing in dbus-1 changelog is
dbus-move-everything-to-run.patch
Comment 3 Robert Milasan 2013-02-10 10:19:27 UTC
The only way I can see this being an issue with that patch is if, when updating/upgrading rtkit, the chroot env doesn't mount /run, which should happen.

Normally there shouldn't be any issue, as /var/run is bind mounted to /run anyway.
Comment 4 Greg Freemyer 2013-02-10 15:40:00 UTC
I had rtkit %post hang during a 12.2 => 12.3 RC1 upgrade test via zypper dup.

Ctrl-C after the hang produced the output below.
==
Installing: rtkit-0.11_git201205151338-3.1.1 ....................................<100%>[|]

Installing: rtkit-0.11_git201205151338-3.1.1 ......................................[error]
Installation of rtkit-0.11_git201205151338-3.1.1 failed:
(with --nodeps --force) Error: Subprocess failed. Error: RPM failed: warning: %post(rtkit-0.11_git201205151338-3.1.1.i586) scriptlet failed, signal 2
==

I don't know if it is related, but after a reboot to try and allow the above to finish, NetworkManager would not start.  For the typical user, that could leave them with a dead machine.  (I haven't started to troubleshoot that yet).
Comment 5 Stanislav Brabec 2013-02-12 16:07:42 UTC
I just reproduced the same problem.

Had openSUSE 12.3 beta 1, called zypper dup today, and the update process hangs on dbus-send. Called in live system, not chroot.
Comment 6 Robert Milasan 2013-02-12 16:53:57 UTC
Well, because of this I don't believe the move to /run is the actual issue, but we can drop the patch from dbus and... see what happens.
Comment 7 Hans Petter Jansson 2013-02-12 21:10:24 UTC
I've yet to reproduce this.

If it can be reproduced reliably on a live system, we might be able to install just the single package with RPM under strace and see what it's blocking on. With debug symbols installed, attaching gdb to the hanging dbus-send process might also be useful.
Comment 8 Hans Petter Jansson 2013-02-13 00:47:50 UTC
I was running a zypper dup from an earlier Factory version to current, and unfortunately, gnome-shell locked itself in the meantime. It now hangs when I try to authenticate. Console login hangs for about half a minute, then fails.

This could be due to the same bug, or something different.

I'm going to do the dup again and make sure I retain a serviceable terminal to debug on.
Comment 9 Hans Petter Jansson 2013-02-13 02:58:21 UTC
I managed to reproduce the hang just now. Attaching list of running dbus processes, and a gdb backtrace of hanging dbus-send. It seems to be hanging on system bus acquisition.
Comment 10 Hans Petter Jansson 2013-02-13 03:00:24 UTC
Created attachment 524392 [details]
processes.txt

List of running D-Bus processes.
Comment 11 Hans Petter Jansson 2013-02-13 03:05:16 UTC
Created attachment 524393 [details]
trace.txt

Backtrace of stuck dbus-send.
Comment 12 Scott Reeves 2013-02-13 07:42:29 UTC
Reproduced the dbus-send hang twice starting with a 12.2 VM then zypper dup'ing to factory. Rebuilt dbus-1 with the "move-everything-to-run" patch disabled, added it to the repo, ran zypper dup and the upgrade completed. Appears quite likely that patch is triggering the hang. Needs some dbus investigation.
Comment 13 Robert Milasan 2013-02-13 10:05:08 UTC
OK, let's drop the patch then, but we would still need a fix for the /var/run bind issue. It seems some apps are started before the actual binding of /run to /var/run happens.

Tried some tests with adding After=var-run.mount in systemd, but didn't help.

Frederic, what do you think, how should we fix the /var/run issue?
Comment 14 Kamil Dziedzic 2013-02-16 16:09:18 UTC
My update to 12.3 just stopped on rtkit too.
Advice from bug report 'rcdbus restart made it unblock' doesn't work for me.

#rcdbus restart
redirecting to systemctl  restart dbus
Failed to issue method call: Unit name dbus is not valid.

:/ Any idea how can I go further?:)
Comment 15 Kamil Dziedzic 2013-02-16 16:29:03 UTC
Installing: rtkit-0.11_git201205151338-3.1.1 ..........................................................................................................................................<100%>[|]
<ctrl+c>
Installing: rtkit-0.11_git201205151338-3.1.1 ............................................................................................................................................[error]
Installation of rtkit-0.11_git201205151338-3.1.1 failed:
(with --nodeps --force) Error: Subprocess failed. Error: RPM failed: warning: %post(rtkit-0.11_git201205151338-3.1.1.x86_64) scriptlet failed, signal 2


Abort, retry, ignore? [a/r/i] (a): r
Installing: rtkit-0.11_git201205151338-3.1.1 .............................................................................................................................................[done]
Comment 16 Greg Freemyer 2013-02-16 21:47:38 UTC
(In reply to comment #14)
> My update to 12.3 just stopped on rtkit too.
> Advice from bug report 'rcdbus restart made it unblock' doesn't work for me.
> 
> #rcdbus restart
> redirecting to systemctl  restart dbus
> Failed to issue method call: Unit name dbus is not valid.
> 
> :/ Any idea how can I go further?:)

I just rebooted and reran "zypper dup".  In my case I was also hit with the network not starting bug, so I had to run "NetworkManager" from the kde konsole to get my network online, then run "zypper dup".
Comment 17 Stephan Kulow 2013-02-17 16:03:31 UTC
this bug is ugly enough (experienced it myself) to call it a ship stopper. Most likely everyone doing zypper dup will notice it
Comment 18 Frederic Crozat 2013-02-18 13:02:29 UTC
First, dbus system service should never be restarted (apps won't support it and most of them will crash).

Second, I'm afraid we have an inconsistency until dbus is restarted, because the dbus library will try to connect to the system bus using the new address for the unix socket, while the system bus will still be listening only on the old socket path (I might be wrong..).

If we can't do better, we should revert the move-everything-patch (and get /var/run replaced with a symlink, but for 13.1).
Comment 19 Robert Milasan 2013-02-18 13:05:15 UTC
I agree with Frederic, let's revert the patch to fix the issue for 12.3 and we shall see what will happen in 13.1.
Comment 20 Frederic Crozat 2013-02-18 13:11:36 UTC
please note that removing the patch will cause the same blocking behavior for people upgrading from dbus with the patch to dbus without the patch (ie RC1 to later).
Comment 21 Robert Milasan 2013-02-18 13:13:38 UTC
OK then what do we do? I've already submitted the change, but I can always stop it.
Comment 22 Frederic Crozat 2013-02-18 13:18:56 UTC
we do it, it is just a warning for RC1 => RC2, that's all :)
Comment 23 Robert Milasan 2013-02-18 13:46:46 UTC
BTW, can't we add to dbus.service and/or dbus.socket some After=var-run.mount var-lock.mount ?
Comment 24 Hans Petter Jansson 2013-02-18 14:46:13 UTC
(In reply to comment #18)

> Second, I'm afraid we have an inconsistency until dbus is restarted, because
> dbus library will try to connect to system bus using the new address for the
> unix socket used by system bus and system bus will still be only listening to
> the old socket path (I might be wrong..).

netstat confirms this. The old dbus-daemon is still listening on /var/run/dbus/system_bus_socket (both before and after the new dbus package is installed), while packages compiled with the new dbus try to talk to it at /run/dbus/system_bus_socket.

I could probably come up with a patch that makes clients try to connect to both /run/dbus/system_bus_socket and /var/run/dbus/system_bus_socket as a stopgap. It'd have to be in for as long as we want to support a dist-upgrade across this change. Would that be acceptable?
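
A minimal sketch of the stopgap proposed above, written as a shell helper rather than as the libdbus change itself. It probes each candidate socket with dbus-send under a hard timeout and reports the first one that answers. The function names, the 5-second limit and the GetId probe are assumptions made for illustration; the real patch lives inside libdbus and is not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper names; not part of dbus-1 or its patch.

# Probe one socket path: succeed only if a bus daemon answers on it.
# `timeout` guards against a connection attempt that never returns.
probe_bus() {
    DBUS_SYSTEM_BUS_ADDRESS="unix:path=$1" \
        timeout 5 dbus-send --system --print-reply --reply-timeout=2000 \
        --dest=org.freedesktop.DBus /org/freedesktop/DBus \
        org.freedesktop.DBus.GetId >/dev/null 2>&1
}

# Print the first candidate path that passes the probe; fail if none does.
# PROBE may be set to the name of a different check function.
find_system_bus() {
    for path in "$@"; do
        if "${PROBE:-probe_bus}" "$path"; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Example call (new location first, old location as fallback):
#   find_system_bus /run/dbus/system_bus_socket /var/run/dbus/system_bus_socket
```

Probing in a fixed order keeps the new /run location preferred, so the fallback only matters while an old daemon is still running.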
Comment 25 Frederic Crozat 2013-02-18 14:53:01 UTC
(In reply to comment #23)
> BTW, can't we add to dbus.service and/or dbus.socket some After=var-run.mount
> var-lock.mount ?

this is not needed:  dbus.socket has "After=sysinit.target" ; sysinit.target has "After=local-fs.target" and var-run.mount has "Before=local-fs.target", so we are guaranteed the socket is created by systemd after /var/run has been bind mounted.
Comment 26 Robert Milasan 2013-02-18 20:40:07 UTC
Yes, but that's for dbus.service, what about dbus.socket:

[Unit]
Description=D-Bus System Message Bus Socket

[Socket]
ListenStream=/var/run/dbus/system_bus_socket

Isn't this created by systemd?

rmilasan@coolcat:/lib/systemd/system> sudo lsof |grep system_bus_socket
systemd       1             root   28u     unix 0xffff88021fbcbb40      0t0      13796 /var/run/dbus/system_bus_socket
Comment 27 Frederic Crozat 2013-02-19 07:26:41 UTC
(In reply to comment #26)
> Yes, but thats for dbus.service, what about dbus.socket:
> 
> [Unit]
> Description=D-Bus System Message Bus Socket
> 
> [Socket]
> ListenStream=/var/run/dbus/system_bus_socket
> 
> Isn't this created by systemd?

It is, but when you check systemctl show dbus.socket | grep After=, you'll notice sysinit.target which does pull var-run.mount (through local-fs.target).

So we are safe here.
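
The ordering chain described above can be checked one link at a time. A small sketch, assuming a systemctl that supports `show -p`; the helper name `ordered_after` is made up for this example, and the unit names are the ones quoted in this report.

```shell
#!/bin/sh
# Succeeds if unit $1 lists unit $2 in its After= property.
# The property line looks like "After=a.target b.mount ...", so it is
# split on '=' and spaces and each name is compared as a whole line.
ordered_after() {
    systemctl show -p After "$1" | tr ' =' '\n\n' | grep -qxF -- "$2"
}

# Example calls, one per link of the chain:
#   ordered_after dbus.socket sysinit.target
#   ordered_after sysinit.target local-fs.target
#   ordered_after local-fs.target var-run.mount
```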
Comment 28 Robert Milasan 2013-02-19 08:27:42 UTC
OK, thanks Fredeic.

@Hans: if you can do that, what you wrote in comment 24, it wouldn't be bad in my opinion.
Comment 29 Robert Milasan 2013-02-19 13:48:50 UTC
BTW, Frederic (sorry for writing your name the wrong way), how can we find out what was created in /var/run before it was bound to /run? I've checked dbus and you were right, checked cups and again the same, but the error still happens.
Comment 30 Frederic Crozat 2013-02-19 14:50:01 UTC
(In reply to comment #29)
> BTW, Frederic (sorry for writing your name the wrong way), how can we find out
> what was created in /var/run before it was bound to /run? I've checked
> dbus and you were right, checked cups and again the same, but the error still
> happens.

We need to move the bind mount to another directory to access the one underneath and check what is left there (maybe with mount --move but I'm not an expert here). Of course, a good test would be to wipe /var/run (on disk) before booting..
Comment 31 Bernhard Wiedemann 2013-02-19 15:00:22 UTC
This is an autogenerated message for OBS integration:
This bug (802525) was mentioned in
https://build.opensuse.org/request/show/155816 Factory / dbus-1
Comment 32 Robert Milasan 2013-02-19 15:02:00 UTC
Well I have the small impression that /var/run is being used within the initrd
part. This I found due to dmraid which was creating some lock file in /var/run.
Now dmraid is patched, so that's out of the question, but seems like there is
something else doing this.
Comment 33 Hans Petter Jansson 2013-02-19 23:28:41 UTC
I wrote a patch to make D-Bus clients try the sockets in both /run and /var/run. The patch itself works, but it doesn't solve the problem. dbus-daemon cannot be reached on either socket.

I noticed something, though: It looks like dbus-send is still working fine after upgrading to the dbus package that moves /var/run/dbus to /run/dbus. However, it stops working after I issue:

/bin/systemctl daemon-reload

The rtkit %post section indirectly calls this, and that seems to be where things start hanging.

Apologies if I'm behind the curve on this. From reading the bug, I have the impression we're still not clear on what exactly is causing the problem.
Comment 34 Stephan Kulow 2013-02-20 16:39:15 UTC
am I correct that we can remove the ship stopper flag now? the bug is fixed as reported, right?
Comment 35 Hans Petter Jansson 2013-02-20 19:15:05 UTC
(In reply to comment #34)
> am I correct that we can remove the ship stopper flag now? the bug is fixed as
> reported, right?

Confirmed -- with the dbus-1 packages in openSUSE:Factory that have the patch reverted, the problem no longer occurs for me.
Comment 36 Hans Petter Jansson 2013-02-20 19:24:34 UTC
The cause of the hang seems to be related to how systemd watches sockets. If you install the "broken" dbus-1 package and then edit /usr/lib/systemd/system/dbus.socket to refer to the old socket in /var/run/ and not /run/, the hang will not occur.

So I guess the problem is that systemd expects the updated D-Bus daemon to listen on the /run/ socket, but it isn't, since it hasn't been -- and can't be -- restarted.

I can't say what an ideal fix for this would be, but if we could get the .socket file to refer to /var/run/ until D-Bus is restarted/the system is rebooted, and then changed that to /run/, that'd probably work. A symlink that gets replaced on startup would probably work too.
Comment 37 Frederic Crozat 2013-02-21 08:17:24 UTC
(In reply to comment #36)

> I can't say what an ideal fix for this would be, but if we could get the
> .socket file to refer to /var/run/ until D-Bus is restarted/the system is
> rebooted, and then changed that to /run/, that'd probably work. A symlink that
> gets replaced on startup would probably work too.

I'm not sure symlinks will play nice with sockets. You should shadow the system .socket by putting a file in /run/systemd/system but I think this directory will be wiped out when you run systemctl daemon-reload (which is required for systemd to reload config files).
Comment 38 Hans Petter Jansson 2013-02-21 19:07:42 UTC
(In reply to comment #37)

> I'm not sure symlinks will play nice with sockets. You should shadow the system
> .socket by putting a file in /run/systemd/system but I think this directory will
> be wiped out when you run systemctl daemon-reload (which is required for
> systemd to reload config files).

Thanks! That got me on the right track. Experimentally, /run/systemd/system/dbus.socket will persist across systemctl daemon-reload. Combined with my libdbus patch to fall back to /var/run/dbus/system_bus_socket, this makes "zypper install rtkit" and dbus-send not hang.

The dbus-1.spec file will need something like this in the %post section:

%post
# Temporarily override the socket path systemd sees, so the running D-Bus process won't be clobbered.
/bin/mkdir -p /run/systemd/system
/usr/bin/sed 's#ListenStream=/run/dbus/system_bus_socket#ListenStream=/var/run/dbus/system_bus_socket#' < /usr/lib/systemd/system/dbus.socket > /run/systemd/system/dbus.socket
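
To see what this override produces without touching a live system, the same substitution can be run against a scratch copy. This is a sketch only: the temporary directory stands in for the real paths, and the sample unit file is an assumption modelled on the dbus.socket quoted in comment 26, with the /run path the new package is expected to ship.

```shell
#!/bin/sh
# Scratch tree standing in for /usr/lib/systemd/system and /run/systemd/system.
workdir=$(mktemp -d)
mkdir -p "$workdir/usr/lib/systemd/system" "$workdir/run/systemd/system"

# Assumed content of the packaged unit after the move to /run.
cat > "$workdir/usr/lib/systemd/system/dbus.socket" <<'EOF'
[Unit]
Description=D-Bus System Message Bus Socket

[Socket]
ListenStream=/run/dbus/system_bus_socket
EOF

# Same substitution as in the %post snippet, applied to the scratch copy.
sed 's#ListenStream=/run/dbus/system_bus_socket#ListenStream=/var/run/dbus/system_bus_socket#' \
    < "$workdir/usr/lib/systemd/system/dbus.socket" \
    > "$workdir/run/systemd/system/dbus.socket"

# Show which socket path systemd would be told to listen on.
grep 'ListenStream=' "$workdir/run/systemd/system/dbus.socket"
```

Only the ListenStream line changes; the rest of the unit is copied through as-is.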
Comment 39 Hans Petter Jansson 2013-02-21 19:22:49 UTC
Created attachment 525839 [details]
dbus-fall-back-to-old-run-directory.patch

Patch required to make libdbus consumers fall back to connecting to /var/run/dbus/system_bus_socket.

It's a little kludgy, but works ok in practice. A slightly improved version might alternate between /run and /var/run with a falloff, until either responds.
Comment 40 Hans Petter Jansson 2013-02-21 23:45:47 UTC
Frederic, Robert: Is it worth making another try, or is it too late in the process? Should I submit it?
Comment 41 Frederic Crozat 2013-02-22 09:49:30 UTC
(In reply to comment #40)
> Frederic, Robert: Is it worth making another try, or is it too late in the
> process? Should I submit it?

I'd say go for it, but Coolo might disagree.
Comment 42 Robert Milasan 2013-02-22 09:52:40 UTC
Yes, I would go with this too and I would prefer to fully move to /run with dbus if this patch can do what we need.
Comment 43 Hans Petter Jansson 2013-02-22 17:15:05 UTC
Submitted with sr#156134.

I tested it by installing the dbus packages I built, locking them in with zypper al, and doing a dup. The upgrade ran to completion, but I was unable to boot afterwards. The error seemed to be caused by something else, though (modprobe was crashing the first time, and later on I got what looked like a kernel warning telling me to upgrade my BIOS).

Just upgrading dbus and rebooting worked fine.
Comment 44 Hans Petter Jansson 2013-02-22 17:16:31 UTC
Created attachment 526127 [details]
dbus-fall-back-to-old-run-directory.patch

I amended the patch to alternate between the potential socket paths with a falloff. This is the patch that was actually submitted.
Comment 45 Hans Petter Jansson 2013-02-25 18:34:14 UTC
I think the risk is fairly small. Stephan, what do you think? Should this go in?
Comment 46 Stephan Kulow 2013-02-25 18:55:45 UTC
if you three agree and tested it, I might take the risk if we include it in RC2
Comment 47 Robert Milasan 2013-02-26 09:20:56 UTC
I agree and I believe it should be added (the patch of course).
Comment 48 Hans Petter Jansson 2013-02-26 15:51:51 UTC
Resubmitted as sr#156443 - I had entered the wrong bug number in the submit request.
Comment 49 Frederic Crozat 2013-02-27 08:37:24 UTC
I tested hpj's package while upgrading a 12.1 to 12.3 (factory-snapshot + updates) and there was no hang.

I say go for it.
Comment 50 Bernhard Wiedemann 2013-03-02 12:00:08 UTC
This is an autogenerated message for OBS integration:
This bug (802525) was mentioned in
https://build.opensuse.org/request/show/157099 Factory / dbus-1
Comment 51 Robert Milasan 2013-03-14 09:44:10 UTC
I think we can close this as RESOLVED/FIXED. If the problem occurs again, please re-open.
Comment 52 Dominique Leuenberger 2013-03-29 20:04:17 UTC
*** Bug 803364 has been marked as a duplicate of this bug. ***
Comment 53 Forgotten User DV81ZEWZkN 2013-06-22 12:08:21 UTC
(In reply to comment #38)
> The dbus-1.spec file will need something like this in the %post section:
> 
> %post
> # Temporarily override the socket path systemd sees, so the running D-Bus
> process won't be clobbered.
> /bin/mkdir -p /run/systemd/system
> /usr/bin/sed
> 's#ListenStream=/run/dbus/system_bus_socket#ListenStream=/var/run/dbus/system_bus_socket#'
> < /usr/lib/systemd/system/dbus.socket > /run/systemd/system/dbus.socket

This now creates a similar situation, at least with current Factory. Try to install/upgrade systemd and dbus in the same run, and then e.g. the logind service, polkit, etc. hang.
Do we really need this in the %post now? Upgrades work correctly with the original ListenStream.
Comment 54 Bernhard Wiedemann 2013-07-05 07:36:36 UTC
Upgrading with zypper dup from 12.3 to current Factory
still hung in rtkit. rcdbus restart helped
Comment 55 Frederic Crozat 2013-07-05 12:54:15 UTC
confirming. I've checked the dbus-send call and it was using "/run/dbus/system_bus_socket", not the socket in "/var/run/dbus/system_bus_socket".

I'm wondering if we should put hpj fallback patch, otherwise, we will never be able to migrate properly..
Comment 56 Forgotten User DV81ZEWZkN 2013-07-05 13:00:09 UTC
(In reply to comment #55)
> I'm wondering if we should put hpj fallback patch, otherwise, we will never be
> able to migrate properly..

That may be true, however, that will also create hangs for everyone on > 12.3
Comment 57 Frederic Crozat 2013-07-05 13:15:13 UTC
(In reply to comment #56)
> (In reply to comment #55)
> > I'm wondering if we should put hpj fallback patch, otherwise, we will never be
> > able to migrate properly..
> 
> That may be true, however, that will also create hangs for everyone on > 12.3

hmm, no, the point was to use /run/dbus/.. and if it doesn't respond, fallback to /var/run/dbus/...
Comment 58 Forgotten User DV81ZEWZkN 2013-07-05 13:19:51 UTC
(In reply to comment #57)
> hmm, no, the point was to use /run/dbus/.. and if it doesn't respond, fallback
> to /var/run/dbus/...

Well the patch is already/still applied in Factory. Cheating in spec file with replacing ListenStream was removed recently, due to issue i mentioned
Comment 59 Frederic Crozat 2013-07-05 13:49:01 UTC
Hans, could you have a look at it ? Thanks !
Comment 60 Hans Petter Jansson 2013-07-09 01:07:53 UTC
I see the problem dup'ing from 12.3 to Factory. I'm looking into it.
Comment 61 Hans Petter Jansson 2013-07-09 20:08:53 UTC
Dup'ing from 12.3 to Factory, it looks like the rtkit upgrade is hanging again. dbus-daemon is back to listening on the socket in /var/run for some reason, but dbus-send is correctly trying to connect to the socket in /run.

Hrvoje: Are you seeing the same? Any idea what's causing it?

Since systemd is not listening on the socket in /run on behalf of dbus-daemon either, the connect() call is hanging in dbus-send. I can probably make it time out the connect() call in addition to the post-connect bus registration...
Comment 62 Forgotten User DV81ZEWZkN 2013-07-09 20:29:16 UTC
(In reply to comment #61)
> Dup'ing from 12.3 to Factory, it looks like the rtkit upgrade is hanging again.
> dbus-daemon is back to listening on the socket in /var/run for some reason, but
> dbus-send is correctly trying to connect to the socket in /run.
> 
> Hrvoje: Are you seeing the same? Any idea what's causing it?

What I was seeing for some time in O:F (but not since the ListenStream sed was removed from the spec) was that, at least when both the dbus and systemd packages were upgraded/installed, the whole system hung (polkit, upower, etc. all dead), with no salvation from rcdbus, only a restart via SysRq. I haven't checked, but it could also be that it happened with dbus upgrades alone, not just in combination with systemd.

Can't say what happens on 12.3 -> O:F at this point, but i trust all of you the hang has returned. But by just adding back the sed hack, we would have upgraders to O:F/13.1 not hang, and constant hangs for O:F/13.1 users each time when they would upgrade dbus (+ systemd).

> Since systemd is not listening on the socket in /run on behalf of dbus-daemon
> either, the connect() call is hanging in dbus-send. I can probably make it time
> out the connect() call in addition to the post-connect bus registration...

Not sure if this is a (pretty) solution, but maybe we can have a conditional %post, which would check for the existence of xyz (which we would need to ship as a 12.3 update), and then execute the sed command?
Comment 63 Hans Petter Jansson 2013-07-10 01:26:24 UTC
Created attachment 547343 [details]
dbus-fall-back-to-old-run-directory.patch

Revised patch. This seems to fix the problem for me, but I'd appreciate it if someone else would test it too.
Comment 64 Hans Petter Jansson 2013-07-10 01:28:30 UTC
With the new patch, we may be able to upgrade without the .spec file's sed job. I'll test upgrading from 12.2 later.

Also, there's a package with the new patch and changelog ready to go in home:hpjansson:bnc802525/dbus-1 .

Hrvoje: Would you mind testing this?
Comment 65 Forgotten User DV81ZEWZkN 2013-07-10 01:39:16 UTC
I'll test it ;-)
Comment 66 Hans Petter Jansson 2013-08-01 13:41:57 UTC
Hrvoje: Any luck? :)
Comment 67 Forgotten User DV81ZEWZkN 2013-08-01 13:59:00 UTC
(In reply to comment #66)
> Hrvoje: Any luck? :)

Sorry for delay, i'll test it asap - hopefully today. Though i can only test situation on factory, it would still need validation for 12.3 -> factory upgrade
Comment 68 Forgotten User DV81ZEWZkN 2013-08-01 23:02:17 UTC
OK, at least the problem I described *does not* happen with your patch. As said, still left to check 12.3 -> O:F upgrade.

I can test downgrade dbus to 12.3 and then to your patched package if that would be safe...
Comment 69 Hans Petter Jansson 2013-08-02 09:13:42 UTC
Thanks for testing. I'm traveling, so I probably won't be able to test pre-12.3 -> O:F until August 10th at the earliest.

Note that it has to be pre-12.3 GM, so probably best to test with 12.2 or 12.3 beta 1.
Comment 70 Dirk Weber 2013-09-08 14:06:09 UTC
I just did a zypper dup of a fully updated 12.3 to factory of today (13.1M4"+").

The upgrade stalled after installation of rtkit when 1749 of 2203 packages were upgraded.

I could continue the upgrade after rebooting the system.

It means the error still exists in factory.
Comment 71 Rich Coe 2013-09-20 21:21:39 UTC
I did a zypper dup of fully updated 12.3 to factory and 
it's hung on  rtkit-0.11_git201205151338-4.4
Comment 72 Rich Coe 2013-09-20 21:22:45 UTC
The hung process:
dbus-send --system --type=method_call --dest=org.freedesktop.DBus / org.freedesktop.DBus.ReloadConfig
Comment 73 Bernhard Wiedemann 2013-09-25 16:00:34 UTC
This is an autogenerated message for OBS integration:
This bug (802525) was mentioned in
https://build.opensuse.org/request/show/200593 Factory / dbus-1
Comment 74 Forgotten User DV81ZEWZkN 2013-09-25 19:13:11 UTC
With the latest dbus (and factory) i now get:

Sep 25 21:01:58 shumarija systemd[1]: Failed to open private bus connection: Failed to connect to socket /run/dbus/system_bus_socket: Transport endpoint is already connected

and basically unusable system -> please revert (or fix ;-) ASAP!

How it worked 2 months ago, and now it doesn't, i have no idea...
Comment 75 Hans Petter Jansson 2013-09-26 22:51:14 UTC
The patch from comment #63 is not yet in. I propose we wait and see if that fixes the problem. I don't think anyone's been able to test it in an actual upgrade setting yet.

I haven't seen the messages from comment #74 before. That could be a different manifestation of the same bug, or a different bug altogether.
Comment 76 Scott Reeves 2013-09-27 20:28:03 UTC
*** Bug 842199 has been marked as a duplicate of this bug. ***
Comment 77 Hans Petter Jansson 2013-10-02 11:20:11 UTC
(In reply to comment #74)
> With the latest dbus (and factory) i now get:
> 
> Sep 25 21:01:58 shumarija systemd[1]: Failed to open private bus connection:
> Failed to connect to socket /run/dbus/system_bus_socket: Transport endpoint is
> already connected
> 
> and basically unusable system -> please revert (or fix ;-) ASAP!
> 
> How it worked 2 months ago, and now it doesn't, i have no idea...

Were you testing with the D-Bus from home:hpjansson:bnc802525/dbus-1 ? I haven't synced that branch with Factory for a while, so maybe it's just outdated?

What did you do to trigger the issue?
Comment 78 Forgotten User DV81ZEWZkN 2013-10-02 17:28:19 UTC
The issue from comment 74 was triggered by installing dbus from base:system (with the amended change), also see http://lists.opensuse.org/opensuse-factory/2013-09/msg00702.html
Comment 79 Hans Petter Jansson 2013-10-03 19:46:32 UTC
I see. Thanks for backing it out. I'm out of ideas for how to work around this design issue, at least for now.
Comment 80 Bernhard Wiedemann 2013-10-04 17:00:14 UTC
This is an autogenerated message for OBS integration:
This bug (802525) was mentioned in
https://build.opensuse.org/request/show/202167 12.3 / dbus-1-x11+dbus-1
Comment 81 Marcus Meissner 2013-10-06 10:53:06 UTC
can someone review the submission from Hrvoje in https://build.opensuse.org/request/show/202167 please?

is this a partial solution for this problem here?
Comment 82 Forgotten User DV81ZEWZkN 2013-10-06 11:52:27 UTC
@Marcus,
my two sr's (the one for 12.3, and the one to Base:System) are doing pretty much what was done in sr#157099 (the ListenStream swap in %post). The difference is that I added one file in the 12.3 sr, which makes the B:S package aware that the swap should be executed (that the old run path is used there), as there are issues with upgrades for those who already have the path at /run (those with 13.1 and newer) if the swap is executed unconditionally.
Comment 83 Ancor Gonzalez Sosa 2013-10-07 14:28:32 UTC
Just wanted you to know that I was also hit by this bug when trying a direct 'zypper dup' from 12.2 to 13.1 on real hardware.

I interrupted the hanged post-script, uninstalled rtkit (which meant uninstalling pulseaudio and no other dependency, in my case) and run "zypper dup" again.

The upgrade succeeded with no other blocking issue, although the system was not really usable while upgrading (impossible to open new user sessions, for example). After rebooting everything worked fine and I was able to install pulseaudio (and rtkit as a dependency) with no hassle.
Comment 84 Bernhard Wiedemann 2013-10-07 15:00:16 UTC
This is an autogenerated message for OBS integration:
This bug (802525) was mentioned in
https://build.opensuse.org/request/show/202540 Factory / dbus-1
Comment 85 Ancor Gonzalez Sosa 2013-10-11 13:13:56 UTC
Tested in 13.1RC1 (rtkit-0.11_git201205151338-5.2.1) on several systems and the bug is still there. rtkit simply hangs during its post-script. If systemd is restarted in the meantime, the script manages to continue and the upgrade succeeds.

That bug will hit everybody running 'zypper dup' to upgrade from 12.X to 13.1RC1.
Comment 86 Forgotten User DV81ZEWZkN 2013-10-11 13:17:34 UTC
(In reply to comment #85)
> That bug will hit everybody running 'zypper dup' to upgrade from 12.X to
> 13.1RC1.

It will happen until DBus update is published for 12.3. Or people can manually create /var/lib/old_run_path file
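
The conditional swap described in comment 82, keyed on the marker file named above, could look roughly like the following. This is a sketch under assumptions: the actual spec changes are not shown in this report, `maybe_override_socket` is a made-up name, and the MARKER, SRC and DST overrides exist only so the logic can be tried outside a real package installation.

```shell
#!/bin/sh
# Hypothetical %post helper; defaults are the paths mentioned in this report.
maybe_override_socket() {
    marker=${MARKER:-/var/lib/old_run_path}
    src=${SRC:-/usr/lib/systemd/system/dbus.socket}
    dst=${DST:-/run/systemd/system/dbus.socket}

    # No marker: the running daemon already uses /run, so leave the unit alone.
    [ -e "$marker" ] || return 0

    # Marker present: shadow the packaged unit so systemd keeps the old path
    # until the daemon is restarted or the system is rebooted.
    mkdir -p "$(dirname "$dst")"
    sed 's#ListenStream=/run/dbus/system_bus_socket#ListenStream=/var/run/dbus/system_bus_socket#' \
        < "$src" > "$dst"
}
```

Keeping the swap behind the marker avoids running it on systems that already moved to /run, which is where the unconditional version caused hangs.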
Comment 87 Ancor Gonzalez Sosa 2013-10-11 13:20:11 UTC
*** Bug 845062 has been marked as a duplicate of this bug. ***
Comment 88 Swamp Workflow Management 2013-10-11 19:04:23 UTC
openSUSE-RU-2013:1544-1: An update that has one recommended fix can now be installed.

Category: recommended (moderate)
Bug References: 802525
CVE References: 
Sources used:
openSUSE 12.3 (src):    dbus-1-1.6.8-2.10.1, dbus-1-x11-1.6.8-2.10.1
Comment 89 Klaus Kämpf 2013-11-04 08:11:17 UTC
*** Bug 842205 has been marked as a duplicate of this bug. ***
Comment 90 Axel Braun 2013-11-04 08:14:01 UTC
On the weekend I did a new installation of 12.3 to test the upgrade to 13.1
That failed as well, but for a different reason. This dbus-Problem did not occur
Comment 91 Ancor Gonzalez Sosa 2013-11-04 08:27:38 UTC
I have been testing 'zypper dup' on a regular basis (on different machines) since Nov 10th and I'm pretty sure that the bug is solved now. Since Axel has also reported it as fixed, I'm closing the bug.
Comment 92 Forgotten User DV81ZEWZkN 2014-05-17 17:58:47 UTC
so with https://build.opensuse.org/request/show/234182 we will have this back.
@Fridrich is this really wanted? also quite a few packages rely on DBus' socket in /run...
Comment 93 Frederic Crozat 2014-05-19 10:43:45 UTC
(In reply to comment #92)
> so with https://build.opensuse.org/request/show/234182 we will have this back.
> @Fridrich is this really wanted? also quite a few packages rely on DBus' socket
> in /run...

Yes, it is wanted: systemd has changed back dbus socket from /run to /var/run to follow the D-Bus specification per the book where /var/run is hardcoded. In the future, when we switch to kdbus, /var/run/ will still be used. Therefore, dbus should be an exception to the move to /run.
Comment 94 Marcus Meissner 2014-05-19 11:05:21 UTC
funny that this is coming from the very same crowd that mandated the /run move in the beginning
Comment 95 Robert Milasan 2014-05-19 11:07:02 UTC
(In reply to comment #94)
> funny that this is coming from the very same crowd that mandated the /run move
> in the beginning

Yes, this is utterly stupid. Why have /run, when anyways we will use /var/run. Dumb!
Comment 96 Stephan Kulow 2014-05-19 11:16:45 UTC
kdbus is free to use /var/run, we have a symlink for it. But I don't see the value of putting everything there actively as we do have /run
Comment 97 Forgotten User DV81ZEWZkN 2014-05-19 11:18:26 UTC
(In reply to comment #93)
> Yes, it is wanted: systemd has changed back dbus socket from /run to /var/run
> to follow the D-Bus specification per the book where /var/run is hardcoded.

but it is not hardcoded. systemd should allow @system_bus_default_address@ configure switch, and not guess where the location is!
Comment 98 Stephan Kulow 2014-05-19 11:28:16 UTC
kdbus is the kernel and it has to hardcode *something*, it can't take it from random libraries. But be it: hardcode /var/run if it can't be kernel config, /var/run has to work anyway
Comment 99 Frederic Crozat 2014-05-19 12:38:05 UTC
http://dbus.freedesktop.org/doc/dbus-specification.html :

" The address of the system message bus is given in the DBUS_SYSTEM_BUS_ADDRESS environment variable. If that variable is not set, applications should try to connect to the well-known address unix:path=/var/run/dbus/system_bus_socket. [2] "

where [2] states:
"[2] The D-Bus reference implementation actually honors the $(localstatedir) configure option for this address, on both client and server side. "

The fact that the system socket path can be changed on dbus-1 is just an implementation detail which doesn't ensure other implementations (like the mono one) will use this same value.

Therefore the reason to switch back the system socket to /var/run/dbus...
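
The lookup rule quoted from the specification can be written down in a couple of lines. A minimal sketch: `system_bus_address` is a made-up helper name, and treating an empty variable the same as an unset one is a choice of this example rather than something the quoted text states.

```shell
#!/bin/sh
# Print the address a client following the quoted rule would connect to:
# the environment variable if set, otherwise the well-known default.
system_bus_address() {
    printf '%s\n' "${DBUS_SYSTEM_BUS_ADDRESS:-unix:path=/var/run/dbus/system_bus_socket}"
}
```

A client that only implements this rule never looks at /run/dbus/system_bus_socket unless the variable points it there.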
Comment 100 Stephan Kulow 2014-05-19 12:57:22 UTC
you keep repeating the same thing - and it still doesn't make sense. if mono accesses /var/run/dbus it will get dbus, no matter how we configure dbus. So why would it make sense to not have all our software switch to /run and have a /var/run symlink for software that is not ours - or that's not worth patching?
Comment 101 Robert Milasan 2014-05-19 13:13:37 UTC
I was asking about the symlink part for quite a long time and never got an answer. Don't know why we implement what upstream does, whether it is dbus, systemd or whatever, but still have to screw with some little thing and make it totally problematic, like the "/var/run already mounted" crap that still happens even nowadays.