Bugzilla – Full Text Bug Listing
| Summary: | suspend-to-disk without resume / error -12 suspending | | |
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Kevin Ivory <Ivory> |
| Component: | Mobile Devices | Assignee: | Pavel Machek <pavel> |
| Status: | RESOLVED WONTFIX | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | | |
| Priority: | P5 - None | CC: | hare, jens-novell |
| Version: | RC 1 | | |
| Target Milestone: | --- | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| Found By: | Other | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
Attachments:
hwinfo.txt
powersave_logs.tar.gz
messages.gz
messages.gz
This might help/provide better diagnostics.
suspend.jpg (picture before reboot)

Description
Kevin Ivory
2005-08-29 17:18:06 UTC
Created attachment 48047 [details]
hwinfo.txt
Created attachment 48048 [details]
powersave_logs.tar.gz
try setting SUSPEND2DISK_SHUTDOWN_MODE="reboot" in /etc/sysconfig/powersave/sleep and try again. The machine will not shut down but reboot after suspend. If this works (resumes), then we have a disk shutdown problem (not all data is written out before powerdown).

the system did not resume. I called powersave_logs after that once more. Is it needed?

I'd need kernel messages just before the shutdown... Serial console or digital camera...

maybe we could try to inspect the swap partition from the rescue system after "suspend", try the following:
- suspend (with SHUTDOWN_MODE="platform" or "shutdown", not "reboot")
- boot the rescue system from CD
- try "swapon /dev/hda1"
if this swapon succeeds, something during suspend went wrong (the image was not correctly / completely written). If this swapon fails, something during resume goes wrong (resume from initramfs).

I will collect all suggestions and act this evening. (I am taking a null modem cable home as well).

my notebook doesn't have a serial interface and I didn't have a camera available. I am sorry I cannot provide the info at the moment since I installed Beta 4 - I was thinking, either the problem will still be there or not. Well, I didn't think there might be a different problem altogether. Now the system stops with what looks like a kernel oops (but doesn't say so). These are some lines that I copied by hand:
swsusp: Need to copy 28628 pages
Error -12 suspending
--- [ cut here ]-----
kernel BUG at kernel/power/swsusp.c:905!
invalid operand: 0000 [#1]
....

swsusp failed with -ENOMEM, and then something went wrong in error handling path. Are you sure you had swap enabled before trying to suspend? I really need the messages above that to know _why_ it failed with -ENOMEM.

I don't know if this is related, but I had swsusp problems on a Pentium-M Centrino notebook with suse 9.3 as well. It would suspend ok, but on resume it would fail with "Stopping tasks: =" and then tell me it could not stop one task (kseriod), and then continue booting normally. With 10.0 RC1 both suspend and resume work, but it takes very long to resume. Most of the time is spent after the resume started, with a blank screen, and a constantly lit harddisk LED (I guess) swapping the running tasks back into RAM. But I had nothing big running, just a default KDE desktop with a couple konq windows.
suspend: 42sec altogether
resume: 60 sec altogether
This works, but is not much faster than booting ...

(In reply to comment #10)
> I don't know if this is related, but I had swsusp problems on a Pentium-M
> Centrino notebook with suse 9.3 as well. It would suspend ok, but on resume it
> would fail with "Stopping tasks: =" and then tell me it could not stop one
> task (kseriod), and then continue booting normally.

this was (fixed) bug #74170, not sure if it is publicly accessible but should be fixed for 10.0

> With 10.0 RC1 both suspend and resume work, but it takes very long to resume.
> Most of the time is spent after the resume started, with a blank screen, and a
> constantly lit harddisk LED (I guess) swapping the running tasks back into
> RAM. But I had nothing big running, just a default KDE desktop with a couple
> konq windows.

I have seen this, too. Also suspending takes a relatively long time in "freeing memory" so it might be a swapping performance regression, but has nothing to do with this bug in general.

> This works, but is not much faster than booting ...

So it works => different bug :-)

i created bug #116313 for the possible swap performance problem

I have my notebook at work today. The "kernel BUG" message is still there in RC-1.
swap was active (at least it was shown in "free")
All messages available before [cut here] line:
Stopping tasks: ===...
Freeing memory... done (25768 pages freed)
ACPI: PCI interrupt for device 0000:01:04.0 disabled
ACPI: PCI interrupt for device 0000:01:01.0 disabled
ACPI: PCI interrupt for device 0000:00:1f.5 disabled
ACPI: PCI interrupt for device 0000:00:1d.7 disabled
ACPI-0212: *** Warning: Device is not power manageable
swsusp: Need to copy 28717 pages
Error -12 suspending
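For readers unfamiliar with kernel return codes: the "-12" in the last line is a negated errno value. A minimal Python sketch, using only the standard `errno` and `os` modules, shows how to look up its symbolic name. The helper name `describe_kernel_error` is made up for illustration, and the number-to-name mapping is whatever the local platform defines (on Linux, 12 is ENOMEM).

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Map a negative kernel-style return code (e.g. -12) to its errno name."""
    number = abs(code)
    # errno.errorcode maps errno numbers to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# The code reported by swsusp in the log above.
print(describe_kernel_error(-12))
```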
Created attachment 49748 [details]
messages.gz
Several ACPI messages are also visible via syslog. Attaching /var/log/messages
(.gz)
As mentioned: I have the notebook here today - any other info needed is easily
obtained (sorry, no serial port available)
Created attachment 49749 [details]
messages.gz
attachment #49748 [details] was some old file, this one should be correct

forgot to click the "info provided" button

ACPI: PCI interrupt for device 0000:00:1d.7 disabled
ACPI-0212: *** Warning: Device is not power manageable
This is the key. Find out what device causes this message (probably using printk).
the device info is in the attachment hwinfo.txt, but here is output of lspci -v:
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI
Controller (rev 03) (prog-if 20 [EHCI])
Subsystem: Mitac: Unknown device 8080
Flags: medium devsel, IRQ 7
Memory at febff000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port
I am not so sure if this is the problem (i have those Warnings, too but suspend works fine). Try unloading ehci_hcd manually before suspend - if it works with unloaded ehci_hcd, add it to SUSPEND2*_UNLOAD_MODULES in /etc/sysconfig/powersave/sleep. You should read the comment above those variables for instructions.

after unloading ehci_hcd manually before suspend, the oops or kernel bug or whatever still persists as above - but missing the two lines mentioned in comment #17

So you still get "Error -12 suspending"? The thing below is kernel BUG(). Similar to oops, but it happens because kernel explicitly tests for condition. BUG() happens in swsusp_suspend, right?

Hmm, that BUG_ON is actually wrong; if arch_suspend fails, I could understand nr_copy_pages not matching. Can you enable DEBUG in swsusp.c and see what happens? It fails somewhere in suspend_prepare_image...

Created attachment 49896 [details]
This might help/provide better diagnostics.
BTW do you have enough swap space available? (cat /proc/swaps)? Do you feel
like having enough memory for your workload? Are you using highmem?
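The swap half of this question can be checked mechanically. The sketch below is an illustration only, not part of powersave or swsusp: it parses text in the `/proc/swaps` layout (sizes in KiB) and compares the free swap against the number of pages swsusp said it needed to copy. The function names, the 4 KiB page size, and the sample device line are assumptions; the page count is the one from the log earlier in this report.

```python
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, as on 32-bit x86


def free_swap_kib(proc_swaps_text: str) -> int:
    """Sum (Size - Used) over all entries of a /proc/swaps style listing."""
    total = 0
    for line in proc_swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or malformed lines
        size_kib, used_kib = int(fields[2]), int(fields[3])
        total += size_kib - used_kib
    return total


def image_fits(pages_to_copy: int, proc_swaps_text: str) -> bool:
    """True if the pages to copy would fit into the currently free swap."""
    return pages_to_copy * PAGE_SIZE_KIB <= free_swap_kib(proc_swaps_text)


# Illustrative sample only; on a live system use open("/proc/swaps").read().
SAMPLE = (
    "Filename    Type       Size    Used  Priority\n"
    "/dev/hda1   partition  514040  0     -1\n"
)

print(free_swap_kib(SAMPLE), image_fits(28717, SAMPLE))
```

This only covers the swap side; it says nothing about whether enough RAM is free while the image is being prepared.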
Compiling a new kernel will take some time (long list of todos at work today). For the other infos: Swap is fine, all tests are done with a fresh system - nothing special running. (KDE defaults + 2 konsoles). Memory is 223768, swap is 514040 (used 0 in /proc/swaps). I don't know if I am running highmem. This is a default SUSE 10.0 RC-1 install. How do I find out?

With the patch, the kernel bug is gone. Now I have the original behaviour from comment #1 back. With lots of debug output shortly before reboot. I'll have to find someone with a digital camera tomorrow.

Created attachment 49991 [details]
suspend.jpg (picture before reboot)
here is the quite unfocused screenshot. With a little goodwill almost everything
is decipherable ;-)
can you try it with init=/bin/bash as described in http://www.susewiki.org/index.php?title=ACPI_suspend ? It may still be some device driver although i somehow doubt it :-(

The system seems to write data to the swap okay. Are you sure you have just one swap partition, resume=XXX on command line, etc? Is your swap signature damaged after reboot? Are you booting right kernel?

Now I see progress and success!
ad comment #27: yes, only one swap partition; correct resume line. booting right kernel: actually: no; the default was still the official RC-1 kernel, for the suspend-patched kernel manual interaction was needed. Now the suspend-patched kernel is my default.
ad comment #26: the init=/bin/bash method works, after ensuring that suspend-patched kernel is default both the commandline "echo disk > /sys/power/state" and the KDE version (right mouse click on the KPowersave plug-button, click on "Suspend to Disk") work!
Now, how do we continue? (what part of that suspend patch is really needed and will it be in the 10.0 release)

Well, the patch only fixed error handling, AFAICT. Can you revert it to see if bug reappears? I think you somehow fixed the core problem in the meantime.

Reverting will bring us back to comment #13. That is the official RC-1 version with default kernel and shows me the kernel BUG. [My notebook is at home and I will bring it to work tomorrow.]

"I think you somehow fixed the core problem in the meantime." The patch really should not matter, outside of error handling.

You are correct. Back to the original default kernel of RC-1 and suspend to disk / resume works as intended.

So it somehow fixed itself :-(.
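Two of the configuration checks asked about in this thread (exactly one swap partition, and a resume= entry on the kernel command line that points at it) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a tool shipped with SUSE: the function names and both sample strings are hypothetical, and device names are compared literally as text.

```python
from typing import List, Optional


def resume_device(cmdline: str) -> Optional[str]:
    """Return the value of the resume= kernel parameter, or None if absent."""
    for token in cmdline.split():
        if token.startswith("resume="):
            return token[len("resume="):]
    return None


def swap_devices(proc_swaps_text: str) -> List[str]:
    """Return the device names listed in a /proc/swaps style listing."""
    lines = proc_swaps_text.splitlines()[1:]  # skip the header row
    return [line.split()[0] for line in lines if line.split()]


def check_resume_setup(cmdline: str, proc_swaps_text: str) -> List[str]:
    """Collect human-readable problems; an empty list means nothing obvious."""
    problems = []
    device = resume_device(cmdline)
    swaps = swap_devices(proc_swaps_text)
    if device is None:
        problems.append("no resume= parameter on the kernel command line")
    elif device not in swaps:
        problems.append(f"resume device {device} is not an active swap device")
    if len(swaps) != 1:
        problems.append(f"expected exactly one swap device, found {len(swaps)}")
    return problems


# Hypothetical samples; on a live system read /proc/cmdline and /proc/swaps.
SAMPLE_CMDLINE = "root=/dev/hda2 resume=/dev/hda1 splash=silent"
SAMPLE_SWAPS = (
    "Filename    Type       Size    Used  Priority\n"
    "/dev/hda1   partition  514040  0     -1\n"
)

print(check_resume_setup(SAMPLE_CMDLINE, SAMPLE_SWAPS))
```

A check like this would not have caught the cause that surfaced here, namely booting a different kernel than intended; it only covers the swap and resume= questions.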