Bugzilla – Bug 117248
shutdown starts after resume from ram
Last modified: 2007-06-05 11:20:31 UTC
with both RC1 and beta-4 I have a strange new behaviour with suspend to ram with beta-3 it seemed worked fine): now after the 2nd resume seems to be finished and display is ok, suddenly a shutdown gets started. why ?? in /var/log/messages I always (it's reproducable) some messages of "gconfd". is gconfd switching to runlevel 0 ? suspend-to-disk does not trigger that shutdown and I don't see any gconfd messages. after 4 times "powersave -U" with no problem/shutdown the 2nd "powersave -u" again triggered a shutdown. this is _very_ annoying_ because it kills all my open apps. right now I can't reliably suspend/resume to ram with SUSE 10.0 :-( which other information do you need ? Sep 14 21:25:42 harald kernel: ACPI: PCI interrupt for device 0000:00:07.2 disabled Sep 15 10:14:50 harald kernel: Stopping tasks: ===================================================================| Sep 15 10:14:50 harald kernel: ACPI: PCI interrupt for device 0000:00:0b.1 disabled Sep 15 10:14:50 harald kernel: ACPI: PCI interrupt for device 0000:00:0b.0 disabled Sep 15 10:14:50 harald kernel: ACPI: PCI interrupt for device 0000:00:00.1 disabled Sep 15 10:14:50 harald kernel: Back to C! Sep 15 10:14:50 harald kernel: ACPI: PCI Interrupt 0000:00:00.1[B] -> Link [LNKB] -> GSI 11 (level, low) -> IRQ 11 Sep 15 10:14:50 harald kernel: PCI: Setting latency timer of device 0000:00:00.1 to 64 Sep 15 10:14:50 harald kernel: ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11 Sep 15 10:14:50 harald kernel: ACPI: PCI Interrupt 0000:00:0b.1[B] -> Link [LNKB] -> GSI 11 (level, low) -> IRQ 11 Sep 15 10:14:50 harald kernel: eth0: link down Sep 15 10:14:50 harald kernel: Restarting tasks... done Sep 15 10:14:50 harald kernel: USB Universal Host Controller Interface driver v2.3 Sep 15 10:14:50 harald kernel: ACPI: PCI Interrupt 0000:00:07.2[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11 Sep 15 10:14:50 harald kernel: uhci_hcd 0000:00:07.2: UHCI Host Controller Sep 15 10:14:50 harald kernel: uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1 Sep 15 10:14:50 harald kernel: uhci_hcd 0000:00:07.2: irq 11, io base 0x0000f840 Sep 15 10:14:50 harald kernel: hub 1-0:1.0: USB hub found Sep 15 10:14:50 harald kernel: hub 1-0:1.0: 2 ports detected Sep 15 10:14:50 harald kernel: Initializing USB Mass Storage driver... Sep 15 10:14:50 harald kernel: usbcore: registered new driver usb-storage Sep 15 10:14:50 harald kernel: USB Mass Storage support registered. Sep 15 10:14:51 harald kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x41E1 Sep 15 10:14:51 harald su: (to root) root on none Sep 15 10:15:04 harald su: (to root) root on none Sep 15 10:15:07 harald gconfd (root-31459): starting (version 2.11.92), pid 31459 user 'root' Sep 15 10:15:07 harald gconfd (root-31459): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.mandatory" to a read-only configuration source at po\ sition 0 Sep 15 10:15:07 harald gconfd (root-31459): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1 Sep 15 10:15:07 harald gconfd (root-31459): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.defaults" to a read-only configuration source at pos\ ition 2 Sep 15 10:15:10 harald init: Switching to runlevel: 0 Sep 15 10:15:13 harald hidd[4705]: Exit Sep 15 10:15:13 harald sdpd[4658]: terminating... Sep 15 10:15:13 harald hcid[4653]: Exit. Sep 15 10:15:14 harald smartd[4941]: smartd received signal 15: Terminated Sep 15 10:15:14 harald smartd[4941]: smartd is exiting (exit status 0) Sep 15 10:15:15 harald smpppd[5054]: terminating on signal 15 Sep 15 10:15:15 harald sshd[5042]: Received signal 15; terminating. Sep 15 10:15:19 harald kernel: Kernel logging (proc) stopped. Sep 15 10:15:19 harald kernel: Kernel log daemon terminating. Sep 15 10:15:20 harald exiting on signal 15 Sep 15 14:06:15 harald syslogd 1.4.1: restart.
ACPI. Please set the DEBUG level to 31 in /etc/sysconfig/powersave/common, restart powersaved (maybe rotate /var/log/messages first), try it again. If it happens again, "grep powersave /var/log/messages > /tmp/messages" and attach /tmp/messages and /var/log/acpid.log
this may be a fallout of the "button events are reported late" bug. Timo, Thomas, do you have a pointer to that? Maybe the button event is reported some 25 seconds late, at which time powersaved of course no longer ignores it (we ignore button events directly after resume until resume cleanup is finished).
Created attachment 50047 [details] syslogd and acpid log files these are complete log files for the following session: boot system powerdown -u close LCD open LCD (this issues resume) powerdown -u close LCD open LCD (this issues resume) after X11 is up again, the unwanted halt takes place. before (not included in this log, I did a clean reboot after that first test) I've tried to resume with power-button 2-3 times (LCD always open) but this time it did not trigger the shutdown:-( at least in beta-4 I'm sure I had this shutdowns too with only "powerdown -u" and resume with power-button (not touching the LCD close/open switch). does this help ? any more details needed ? thanks!
(In reply to comment #2) > this may be a fallout of the "button events are reported late" bug. Timo, > Thomas, do you have a pointer to that? Maybe the button event is reported some > 25 seconds late, at which time powersaved of course no longer ignores it (we > ignore button events directly after resume until resume cleanup is finished). if this is a timing issue, the following information might be important: I'm using a Toshiba Portege 3480 notebook, which uses a P-III 600 MHz with only 192 MB ram, so that's not a fast machine on modern scales. BUT the suspend/resume from/to ram takes much less than those 25 seconds mentioned above...
(In reply to comment #4) > if this is a timing issue, the following information might be important: > > I'm using a Toshiba Portege 3480 notebook, which uses a P-III 600 MHz with only > 192 MB ram, so that's not a fast machine on modern scales. this probably does not matter in this case, see below > BUT the suspend/resume from/to ram takes much less than those 25 seconds > mentioned above... the problem is: Many machine generate a "button/power" event on resume from suspend to RAM, regardless of them being resumed by lid open or powerbutton or a simple keypress. Because of this, we ignore button/power events directly after suspend, as long as the restore stuff / module reloading etc is not finished yet. There is another bug which i referred to, that causes acpi buttons to be reported late (always or sometimes randomly). If your button event from suspend now happens to be delayed for ~1 minute (example), then the resume will be finished. I timed the 25 seconds from your first log, from the time the usb modules were loaded to the time the system went "init 0". we could simply ignore the first button event after resume, no matter how long it takes, but this would break those machines that do not generate such a button event on resume and thus would also be wrong. A work around for now might be to unload the button module before suspend and reload after resume (although the delayed event may still be delivered, i am not sure) or to simply tell powersaved to ignore the power button, unfortunately you cannot use it to shutdown the machine then :-( I will look into the verbose log (don't forget to reset the debug level to something <=15 or it will spam your logs very badly :-) but i think that my guess of the reason is pretty accurate.
ok. Holger, we have the following: from acpid: [Thu Sep 15 18:14:52 2005] received event "button/power PWRF 00000080 00000002" [Thu Sep 15 18:14:52 2005] notifying client 5172[0:0] [Thu Sep 15 18:14:52 2005] notifying client 5211[0:0] [Thu Sep 15 18:14:52 2005] completed event "button/power PWRF 00000080 00000002" from powersaved (excerpt): Sep 15 18:14:14 compaq [powersave]: DIAG (ignore_buttons:1538) ignore_buttons is set to: 1 Sep 15 18:14:14 compaq [powersave]: Info (sleep_wrapper:1420) Set machine into suspend2disk/ suspend2ram mode Sep 15 18:14:52 compaq [powersave]: Info (sleep_wrapper:1436) Back from suspend2ram inside sleep_wrapper Sep 15 18:14:53 compaq [powersave]: Debug (start:390) Select waiting: 36 secs and 52 millisecs Sep 15 18:14:53 compaq [powersave]: Info (start:440) Handling dbus request on fd: 7 Sep 15 18:14:53 compaq [powersave]: Info (checkScriptReturn:1334) SCRIPT returned:, Name: restore_after_suspend_to_ram, Return value: 0, Comment: restore_after_suspend2ram finished Sep 15 18:14:53 compaq [powersave]: DIAG (ignore_buttons:1538) ignore_buttons is set to: 0 Sep 15 18:14:53 compaq [powersave]: Debug (start:390) Select waiting: 36 secs and 52 millisecs Sep 15 18:14:53 compaq [powersave]: Info (start:440) Handling dbus request on fd: 7 Sep 15 18:14:53 compaq [powersave]: Debug (start:390) Select waiting: 36 secs and 52 millisecs Sep 15 18:14:53 compaq [powersave]: Debug (start:492) hw FD 6 is set, handling it ... Sep 15 18:14:53 compaq [powersave]: DIAG (handleHWEventRequest:129) ACPI Event: 'button/power PWRF 00000080 00000002' ignore buttons: no Sep 15 18:14:53 compaq [powersave]: Debug (handleHWEventRequest:142) type: button/power, dev_name: PWRF, port: 00000080, count: 00000002 Sep 15 18:14:53 compaq [powersave]: Info (handleHWEventRequest:194) button power event occured I think this leads to a logical problem and a race condition, i will check the code and then come back with a conclusion. The ~20 seconds i saw are simply due to first trying to log out KDE/GNOME/whatever and after that just plain calling shutdown
(In reply to comment #5) > the problem is: Many machine generate a "button/power" event on resume from > suspend to RAM, regardless of them being resumed by lid open or powerbutton or a > simple keypress. Because of this, we ignore button/power events directly after > suspend, as long as the restore stuff / module reloading etc is not finished yet. > > There is another bug which i referred to, that causes acpi buttons to be > reported late (always or sometimes randomly). If your button event from suspend > now happens to be delayed for ~1 minute (example), then the resume will be > finished. I timed the 25 seconds from your first log, from the time the usb > modules were loaded to the time the system went "init 0". > > we could simply ignore the first button event after resume, no matter how long > it takes, but this would break those machines that do not generate such a button > event on resume and thus would also be wrong. > > A work around for now might be to unload the button module before suspend and > reload after resume (although the delayed event may still be delivered, i am not > sure) or to simply tell powersaved to ignore the power button, unfortunately you > cannot use it to shutdown the machine then :-( yep, obviously not the right way to go. thanks for the details! a quick help for me might be to set EVENT_BUTTON_POWER="suspend_to_ram" this is what I'd like as default anyway, but didn't look for it until now. with "suspend_to_ram" unstead of "wm_shutdown" I have to resume twice as worst case, but I won't loose my whole session any more... one more questions about powersave/events: where can I find information which process/script/config files controls what happens for those events, e.g. what gets started for e.g. "wm_shutdown" in the first case, where can I change that action ? I'm asking because right now my X11 display flashes once when I close/open the LCD. in the evens file I now found EVENT_BUTTON_LID_CLOSED="screen_saver" but right now there seems to be no screen saver running ?! I'd like to trigger a "xlock -mode blank" for EVENT_BUTTON_LID_CLOSED (or better for the action "screen_saver") before SUSE 10 I only could use APM on this old Toshiba, and there I used APMD_LOCK_MODE="blank" in powermanagement -- that's what I'd like to have again with powersaved and ACPI. btw, I'm using xfce2 (or before fvwm2), KDE is not an option, esp. not with 192MB ram (but neither with 1GB ;-) > I will look into the verbose log (don't forget to reset the debug level to > something <=15 or it will spam your logs very badly :-) but i think that my > guess of the reason is pretty accurate. thanks!
(In reply to comment #6 > I think this leads to a logical problem and a race condition, i will check the > code and then come back with a conclusion. > > The ~20 seconds i saw are simply due to first trying to log out > KDE/GNOME/whatever and after that just plain calling shutdown one more timing detail: if this unwanted shutdown gets issued, then it's _always_ started almost immedieately (only 1, maybe 2 seconds) after the resume is complete and X11 looks ok to me. at least there is no "random timeout" after which one more button event gets delivered...
it is a race condition: we have a select waiting on 2 filedescriptors: - acpid - dbus if select falls through, it checks first for dbus, then for acpid. If enough dbus messages are in the queue, then the "button ignore" flag is reset before the acpid event arrives. I'm building a test package for you to test right now, it should appear shortly on ftp.suse.com:/pub/people/seife/powersave/SUSE-10.0-test/10.0-i386/powersave-0.10.15-5.i586.rpm (there is more than one version in this directory, make sure you use the latest :-). Please reassign "wm_shutdown" or "notify" or something like that to the power button, so that you actually see if it is still happening (it should not ;-) ). Thanks for reporting and helping with debugging this
Created attachment 50062 [details] handle acpid events before dbus events Holger, just for reference the patch i put into the package. If we get a success report, i'll commit this to 10.0 branch and HEAD.
Ok, looks good to me. For futher versions, this problem might disappear anyway because of glib integration. Thank you both for debugging.
(In reply to comment #10) > Created an attachment (id=50062) [edit] > handle acpid events before dbus events > > Holger, just for reference the patch i put into the package. If we get a > success report, i'll commit this to 10.0 branch and HEAD. SUCCESS!! at least for the first 10+ resumes with both power-switch and LCD close/open I never got a shutdown anymore (I changed EVENT_BUTTON_POWER and restarted powersaved before;) and to my surprise, now the xlock gets activated too for the first time! great, almost... ;-) starting xlock without KDE takes a long time, see my new report & fix #117434 thanks!
ok, i'll check this fix into CVS. We will wait for other pending fixes and will probably issue an online update once GM is out.