Bug 306983 - [grub] Resume hangs after suspend to ram
Summary: [grub] Resume hangs after suspend to ram
Status: RESOLVED WORKSFORME
Alias: None
Product: openSUSE 10.3
Classification: openSUSE
Component: Kernel
Version: Beta 2
Hardware: x86 openSUSE 10.3
Importance: P5 - None : Normal with 1 vote
Target Milestone: ---
Assignee: Forgotten User ZhJd0F0L3x
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-31 23:48 UTC by Jens Nixdorf
Modified: 2007-12-18 12:06 UTC
CC List: 6 users

See Also:
Found By: Beta-Customer
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
last pm-suspend log (5.64 KB, text/x-log)
2007-08-31 23:51 UTC, Jens Nixdorf
var/log/messages from Gericom Ego (46.04 KB, text/plain)
2007-11-08 22:11 UTC, Detlef Grittner
pm-suspend.log of Gericom EGO (6.07 KB, text/plain)
2007-11-08 22:12 UTC, Detlef Grittner
pm.suspend log of Lyndon Kroker (7.62 KB, text/plain)
2007-11-22 11:21 UTC, Lyndon Kroker
LILO bootloader configuration file (1018 bytes, text/plain)
2007-11-23 20:45 UTC, Detlef Grittner
GRUB: /proc/cmdline (107 bytes, text/plain)
2007-12-09 23:01 UTC, Detlef Grittner
LILO: /proc/cmdline (128 bytes, text/plain)
2007-12-09 23:02 UTC, Detlef Grittner
pm-suspend.log with working resume (kernel 2.6.22.13-0.3) @ Toshiba A100 (8.33 KB, text/plain)
2007-12-11 16:46 UTC, Jens Nixdorf

Description Jens Nixdorf 2007-08-31 23:48:50 UTC
This bug seems similar to closed bug 293662, but it is new. After installing OSS 10.3 B2 from scratch (from the DVD ISO), suspend to RAM worked well, including resume. Two days later (29 August 2007) I updated from the factory and non-oss factory repositories via smart. After this update (no new kernel), the system no longer resumes from s2ram correctly:

The screen stays blank but the backlight is switched on, the CPU fan spins up to its highest speed, and the hard disk is active. After a few seconds, hard disk activity stops, along with all other activity. The system hangs, and the only way to get it working again is to switch it off the hard way (a long press of the power button).

The next boot log shows many filesystem errors, so the filesystems were obviously still open when the system hung.

s2ram -n says:

Machine matched entry 237:
    sys_vendor   = 'TOSHIBA'
    sys_product  = 'Satellite A100'
    sys_version  = ''
    bios_version = ''
Fixes: 0x3  S3_BIOS S3_MODE
This machine can be identified by:
    sys_vendor   = "TOSHIBA"
    sys_product  = "Satellite A100"
    sys_version  = "PSAA9E-01Z01DGR"
    bios_version = "5.90   "
Comment 1 Jens Nixdorf 2007-08-31 23:51:45 UTC
Created attachment 161344
last pm-suspend log
Comment 2 Frank Seidel 2007-09-02 09:00:38 UTC
What does "rpm -q suspend" show you?
Comment 3 Jens Nixdorf 2007-09-02 09:35:27 UTC
rpm -q suspend says:

suspend-0.50.20070731-12

Comment 4 Jens Nixdorf 2007-09-03 22:50:51 UTC
After an update from factory today, suspend was also updated to:

suspend-0.50.20070731-15

But it doesn't change anything about the bad behaviour.
Comment 5 Jens Nixdorf 2007-09-06 17:12:49 UTC
No improvement in Beta 3 with suspend-0.69.9-2 after today's update.
Comment 6 Jens Nixdorf 2007-09-06 23:16:34 UTC
Hi,

I did some more investigation today: when booting into a plain shell (init=/bin/bash), the system resumes after each of the following commands:

s2ram -f -p -m
s2ram -f -p -s
s2ram -f -p 

In all these cases the screen stays blank after the resume, but the system is running. For example, a blindly typed "reboot" works. In all other cases the whole machine is dead; not even the Caps Lock LED works.

So I then tried this under X11 in an xterm:

s2ram --force --vbe_post

and this works too. So I wanted to write this into a config file, but where is it? In /etc/pm there is not a single file, only three directories (config.d, power.d, sleep.d). So I created a new file /etc/pm/config and wrote S2RAM_OPTS="-f -p" into it, but this does not seem to be recognized.

Also, in the older location /etc/powersave (or /etc/sysconfig/powersave) there is no file "sleep", which used to hold the suspend configuration. I created one by hand with the following two lines:

SUSPEND2RAM_FORCE="yes"
SUSPEND2RAM_VBE_POST="yes"

but this had no effect either.

regards and happy bug-hunting, Jens
Comment 7 Jens Nixdorf 2007-09-06 23:19:05 UTC
Addendum to my last post: if suspended by hand from an xterm in X11, the machine resumes correctly, meaning "it's all visible" ;)

regards, Jens 
Comment 9 Frank Seidel 2007-09-07 09:41:32 UTC
you can put this in any file below /etc/pm/config.d/
e.g. as
/etc/pm/config.d/myconfig
with
S2RAM_OPTS="-f -p"
in it.
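
A minimal sketch of the suggestion above, to be run from a root shell. The file name myconfig and the S2RAM_OPTS line come from this comment; the helper name write_s2ram_opts and its optional directory argument are made up here for illustration and are not part of pm-utils.

```shell
# Hypothetical helper (not part of pm-utils): write S2RAM_OPTS into a file
# below the pm-utils config directory.
#   $1 = options to pass to s2ram, e.g. "-f -p"
#   $2 = config directory (defaults to /etc/pm/config.d, which needs root)
write_s2ram_opts() {
    opts=$1
    dir=${2:-/etc/pm/config.d}
    mkdir -p "$dir" || return 1
    # Overwrites any existing myconfig in that directory.
    printf 'S2RAM_OPTS="%s"\n' "$opts" > "$dir/myconfig"
}

# Example (as root):
#   write_s2ram_opts "-f -p"
#   cat /etc/pm/config.d/myconfig
```

Keeping the directory as a parameter makes it possible to try the helper against a scratch directory before touching /etc/pm/config.d.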
Comment 10 Jens Nixdorf 2007-09-07 10:45:54 UTC
It works! Thank you! 

But there are still at least two questions open:

1. Why does it now need these options to resume correctly, even though it worked without them before (with Beta 2 "out of the ISO")?

2. Why is the German documentation (http://de.opensuse.org/pm-utils) so misleading? It says that there is a main config file "/etc/pm/config" and that only additional config files go below "/etc/pm/config.d" (I have now seen the English docs; they seem to be more precise).

regards, Jens
Comment 11 Forgotten User ZhJd0F0L3x 2007-09-07 11:57:35 UTC
Yes, the English docs still mention /etc/pm/config; I will fix this soon.
Comment 12 Jens Nixdorf 2007-09-14 13:30:03 UTC
Now, after a new update to kernel 2.6.22.5-16 and suspend 0.69.9-5, it is getting worse (or better, depending on your perspective): my system no longer resumes with the options given in /etc/pm/..., but it resumes again without any options. OK, it is good to have it working again without manual tweaks, but at some point my filesystem will be gone because of these constant faulty resumes.
Comment 13 Frank Seidel 2007-09-15 11:11:19 UTC
This reads as if it is now working (suspend + resume), yes?
So I'll close this bug.
Comment 14 Jens Nixdorf 2007-09-15 13:48:14 UTC
I'm not sure it has really been resolved, because at least one person (Gerald Pfeifer) reported on opensuse-factory@opensuse.org that suspend has not been working on his system since the same date my system started working again.
Comment 15 Jens Nixdorf 2007-09-29 17:31:16 UTC
With an update to kernel 2.6.22.5-29 and suspend-0.69.9-14 (RC1) I have the same behaviour as before: the machine does not resume anymore. I have therefore reopened this bug, because this part of the power management does not seem very solid. Later today I will test which parameters make resume work again.
Comment 16 Jens Nixdorf 2007-10-05 09:22:40 UTC
With the final release of OSS 10.3 (kernel 2.6.22.5-31 and suspend-0.69.9-15), resume still does not work.
Comment 17 Lyndon Kroker 2007-10-14 06:57:38 UTC
I am having the same problem.  My laptop is a Stamp 223.  This was sold in the USA by Linux Certified (LC2100).  The general specs can be found here:

http://www.linuxcertified.com/linux-laptop-lc2100.html

For as long as I have been using openSUSE on this laptop suspend to ram has always worked well using "s2ram -f".  It was working well as late as 10.3 Beta 3.  Now, when I try to suspend to ram using 10.3, the laptop locks up hard and the power button must be held for four seconds to get the computer to power down.

Troubleshooting was done in a minimal environment (init=/bin/bash) as per the instructions on the openSUSE wiki (http://en.opensuse.org/S2ram).  When I suspend the laptop, all seems to go well.  However, when I push the power button to recover from the suspend the laptop locks up and remains completely unresponsive.  It won't respond to blindly typed commands or the CAPS LOCK key.  Additionally, the cooling fan comes on at full speed.

Suspend to disk works pretty well except for the wireless network connection which does not automatically reconnect to the previously used network.
Comment 18 Pavel Machek 2007-10-23 22:24:14 UTC
Okay, try nohz=off, it seems to help sometimes.
Comment 19 Pavel Machek 2007-10-29 09:07:13 UTC
Did nohz=off work?

Can you try recent mainline kernel?
Comment 20 Jens Nixdorf 2007-10-29 14:34:20 UTC
Where should I put this nohz=off? At the GRUB prompt or somewhere in /etc/pm/...?
Comment 21 Forgotten User ZhJd0F0L3x 2007-11-02 13:21:33 UTC
At the GRUB prompt. It is a kernel command-line parameter.
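
Since nohz=off is a kernel command-line parameter, one way to confirm that it actually reached the kernel after booting is to look for it in /proc/cmdline. A minimal sketch; the function name has_kernel_param is an assumption made for this example, not an existing tool.

```shell
# Hypothetical helper: succeed if a parameter appears as a whole word
# on a kernel command line.
#   $1 = parameter to look for, e.g. nohz=off
#   $2 = command line text (defaults to the running kernel's /proc/cmdline)
has_kernel_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # Pad with spaces so that vga=0 does not match vga=0x317.
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   has_kernel_param nohz=off && echo "nohz=off is active"
```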
Comment 22 Jens Nixdorf 2007-11-07 15:49:18 UTC
Sorry for the delay. At the moment my system (kernel 2.6.22.9-0.4-default, suspend 0.69.9-15) seems to resume again, this time WITHOUT any options in /etc/pm. So right now testing nohz=off does not make sense. I will try it after the next kernel update. But don't set this bug to "resolved" too early; wait for a few kernel updates.
Comment 23 Detlef Grittner 2007-11-08 22:11:45 UTC
Created attachment 182723
var/log/messages from Gericom Ego
Comment 24 Detlef Grittner 2007-11-08 22:12:22 UTC
Created attachment 182725
pm-suspend.log of Gericom EGO
Comment 25 Detlef Grittner 2007-11-08 22:13:07 UTC
I have a similar problem here on a Gericom EGO (MS1003). 
Suspend and Resume only work by chance, i.e. only in about 1 of 3 cases the resume is successful. Actually "s2ram -f" works, the other options do not make a real difference in the statistical distribution of successful and failed resumes.

I have tried this with the latest official kernel. I will attach the pm-suspend.log and a complete /var/log/messages file. The latter one includes everything beginning with an installation until the first failed resume and a forced reboot.

Just to give you a point of reference: With Ubuntu 7.10 suspend and resume work all the time on this machine. 
Comment 26 Pavel Machek 2007-11-09 18:01:46 UTC
Can you file a separate bug for the Gericom?

You said you tried with latest kernel, does it work okay there?
Comment 27 Detlef Grittner 2007-11-11 16:10:55 UTC
Just for clarification: Do you want me to open a new bug or report to this one?


I have some news about the problem itself:
It doesn't work with the latest kernel.
At least sort of, because I have made the following observation:
For testing purposes I have a dual installation of OpenSUSE 10.3 and Ubuntu 7.10 on the machine. 
Using the Grub loader provided with OpenSUSE I can reproduce the problem, when I do a suspend to disk and after the resume try to suspend to RAM and resume. The latter one fails in most cases and after that suspend to RAM and resume doesn't even work after a reboot.

Then I changed to the Grub loader of Ubuntu and simply integrated the menu.lst entries of Suse 10.3 into that of Ubuntu. Now suspend to RAM, to disk, resumes from both states work on Suse 10.3 as well, I didn't have any failure during my experiments.

I can observe at least two differences between the Grub versions: Ubuntu doesn't change the loading process when there is a suspend to disk, whereas the Suse loader manipulates at least the menu.lst for skipping the menu selection during startup. When I use the Ubuntu loader, then Suse writes into its own menu.lst with no effect of course. Ubuntu uses a text based Grub menu and Suse has a graphical one. 

Comment 28 Jens Nixdorf 2007-11-12 17:34:36 UTC
Since the last kernel update to 2.6.22.12-0.1 a few days ago, my system resumes correctly without any entries below /etc/pm and without nohz=off at the GRUB prompt. Only once (out of 10 times) did KDE hang after resume, but that is another matter.
Comment 29 Jens Nixdorf 2007-11-13 16:20:18 UTC
Today resume fails again. No reaction from the keyboard (not even the Caps Lock LED when Caps Lock is pressed), the screen stays blank, the backlight is on, and the CPU fan is running high.

Nothing has changed on my system since my posting yesterday, so it seems suspend/resume is still unstable.

 
Comment 30 Dirk Engel 2007-11-14 08:23:00 UTC
Same problem here (2.6.22.12-0.1). Resume from s2ram fails approximately one time in five, in exactly the way described above.
Comment 31 Pavel Machek 2007-11-15 16:51:12 UTC
Dirk, Jens: can you repeat your tests 10 times or so, so that the results are reproducible? Good things to try are:

1) nohz=off highres=off

2) minimal system (init=/bin/bash)


Comment 32 Dirk Engel 2007-11-15 19:34:32 UTC
Pavel,
I tried variant 1 (nohz=off highres=off) but tested only twice, since resume failed immediately and I had to switch off the system both times.
Comment 33 Jens Nixdorf 2007-11-15 20:31:46 UTC
Same here as Dirk already described: with nohz=off and highres=off the system does not resume. After suspend the Caps Lock LED blinks once, the fan starts running, and the screen stays blank with the backlight enabled; then it is "dead". I only tried three times, and resume failed every time. I don't want to test it more often, because I have to switch the machine off hard, and that is not very good for my filesystems.

Unfortunately init=/bin/bash does not work on my system anymore (it worked some time ago in Beta 2). When I try to suspend I get this error message:

Switching from vt1 to vt1
/proc/sys/kernel/acpi_video_flags does not exist; you need a kernel >=2.6.16.
switching back to vt1
Comment 34 Forgotten User ZhJd0F0L3x 2007-11-20 15:55:01 UTC
with init=/bin/bash, you need to

mount /proc
mount /sys
s2ram $YOUROPTIONS
Comment 35 Detlef Grittner 2007-11-20 22:54:44 UTC
I have replaced the bootloader GRUB with LILO. The result is astonishing. Since then I have not had a single failure on suspend to disk or RAM and resume.
As parameter a simple "-f" for s2ram is sufficient.

When changing back to GRUB the old behavior with hanging resume reappeared.

Maybe you can make sense of this strange behavior of OpenSUSE on the Gericom EGO (MSI 1003). 
Comment 36 Frank Seidel 2007-11-21 05:17:58 UTC
Detlef: could you explain in short how you setup your grub? is this a chainloading one, where first you get the grub you setup via ubuntu which then chainloads to the one from suse?

Jens: could you retry with init=/bin/bash as Stefan showed you? .. and post your pm-suspend.log again?
Comment 37 Lyndon Kroker 2007-11-22 11:17:59 UTC
I just tried switching from GRUB to LILO and had the same pleasant result as Detlef.  Suspend to RAM is now working very well with "s2ram -f".  I have added a small file called suspend to /etc/pm/config.d.  In the file I put:

S2RAM_OPTS="-f"

Now I can suspend perfectly as a normal user.  I haven't done too many suspends yet but I suspect that I no longer have the problem and suspend is working very well just as it did before openSUSE 10.3.  If I encounter any stability issues, I will let you know.

I have attached my pm-suspend.log file.  Note that I was able to include a log file for GRUB.  The last few times I tried to suspend using GRUB, no log file was created.  Not even a blank file of zero bytes.

My initial installation of GRUB was the default one from 10.3 (ie: I didn't make any changes or modifications).
Comment 38 Lyndon Kroker 2007-11-22 11:21:08 UTC
Created attachment 184378
pm.suspend log of Lyndon Kroker
Comment 39 Lyndon Kroker 2007-11-22 13:26:16 UTC
Sorry, there was a typo on my previous comment.  It should say: "Note that I was *not* able to include a log file for GRUB".

Getting tired!
Comment 40 Jens Nixdorf 2007-11-22 16:11:47 UTC
Sorry for the delay. Today I found time to check this with init=/bin/bash:

s2ram without options -> failed completely
s2ram -f -> failed completely
s2ram -f -p -> resumes, keyboard is working, but display is staying blank
s2ram -f -p -s -> resumes, keyboard is working, but display is staying blank
s2ram -f -p -m -> resumes, keyboard is working, but display is staying blank
s2ram -f -p -m -s -> works!

with init=/bin/bash and nohz=off highres=off NOTHING works.

If it only works with "-f -p -m -s" in this minimal environment (init=/bin/bash), why does it work approx. 5 out of 10 times in the standard environment, even though there are no options in /etc/pm/...?

Normally I would just put these options into a config file in /etc/pm/..., but I fear that with the next kernel update the same procedure will start again, as has happened at least three times since 10.3 Beta 2.

Which pm-suspend.log do you want? As it is at the moment, without any options in /etc/pm/...? After a failed resume? After a working resume?
Comment 41 Detlef Grittner 2007-11-22 21:32:14 UTC
Frank: For this test I have installed OpenSUSE 10.3 with the out the box GRUB setup. There are no additional bootloaders besides Windows, but it is chainloaded by the GRUB provided by the OpenSUSE installation.

Comment 42 Forgotten User ZhJd0F0L3x 2007-11-23 10:45:42 UTC
For all those where switching to LILO helped: do you still have the graphical boot menu? Maybe the gfxmenu is killing something for suspend. Although i cannot imagine how it could do that... :-(
Comment 43 Detlef Grittner 2007-11-23 20:45:25 UTC
Created attachment 184592
LILO bootloader configuration file

I have attached the lilo.conf from my notebook. The graphical boot menu is used.
Comment 44 Forgotten User ZhJd0F0L3x 2007-11-28 10:17:45 UTC
ok, in this case i am totally out of ideas on where that problem might come from, sorry :-(
Comment 45 Pavel Machek 2007-11-30 22:41:36 UTC
Lilo vs. grub decides whether s2ram works or not...? That's seriously strange.

Can you dump /proc/cmdline in both cases?
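
One way to make this comparison is to save /proc/cmdline once under each bootloader and then list the parameters that appear in only one of the two copies. A rough sketch; the function name cmdline_diff and the file names in the usage comment are placeholders chosen for this example.

```shell
# Hypothetical helper: print kernel parameters that appear in only one of
# two saved copies of /proc/cmdline.
#   $1, $2 = files holding a copy of /proc/cmdline from each boot
cmdline_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # One parameter per line, sorted, so that comm can compare the lists.
    tr -s ' ' '\n' < "$1" | sed '/^$/d' | sort -u > "$a"
    tr -s ' ' '\n' < "$2" | sed '/^$/d' | sort -u > "$b"
    # comm -3 hides lines common to both files; parameters found only in
    # the second file are printed with a leading tab.
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Example:
#   cat /proc/cmdline > cmdline-grub.txt    # after booting via GRUB
#   cat /proc/cmdline > cmdline-lilo.txt    # after booting via LILO
#   cmdline_diff cmdline-grub.txt cmdline-lilo.txt
```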
Comment 46 Dirk Engel 2007-12-03 09:02:54 UTC
:-(
Tried to change from grub to lilo to fix this s2ram issue. But now I get nothing more than a blinking cursor when I turn on my notebook. I also tried to switch back to grub with the repair option of my openSUSE 10.3RC1 but got errors only.
I know that's not the topic here but maybe you have some hint.
Thanks
Comment 47 Pavel Machek 2007-12-03 09:23:47 UTC
Dirk: you may be able to boot the rescue cd with root=your/real/root, then rerun lilo.
Comment 48 Pavel Machek 2007-12-07 10:38:15 UTC
(any progress?)
Comment 49 Detlef Grittner 2007-12-09 23:01:32 UTC
Created attachment 186561
GRUB: /proc/cmdline
Comment 50 Detlef Grittner 2007-12-09 23:02:00 UTC
Created attachment 186562
LILO: /proc/cmdline
Comment 51 Pavel Machek 2007-12-10 08:12:57 UTC
Hmmm, in the grub configuration you are using vesafb, while you are using plain vgacon in the lilo case. Change the grub configuration to vga=0 (or just remove the vga parameter) and it should start to work.

Stefan, this machine probably needs "NOFB".
Comment 52 Dirk Engel 2007-12-10 09:54:55 UTC
Pavel: I am on grub again (thank you), lilo somehow does not work for me. My cmdline states vga=0x317 too, which worked wonderfully with openSUSE 10.2, so why should vga=0 be necessary now? Nevertheless, I will try it of course. Does s2ram provide any way to go into suspend mode without switching the console?
Comment 53 Dirk Engel 2007-12-10 19:04:12 UTC
So, I tried grub with vga=0 and without vga= at all, but the problem persists. Resume still fails about one time in 4 to 5. Same for echo "mem" > /sys/power/state.
BTW, there is one difference from Jens's initial problem description that I forgot to mention so far: the backlight stays off when the system hangs.
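
Writing "mem" to /sys/power/state bypasses s2ram and asks the kernel to suspend directly. A small sketch of a guard for that test, assuming the usual sysfs layout where /sys/power/state lists the supported states separated by spaces; the helper name supports_sleep_state is invented for this example.

```shell
# Hypothetical helper: succeed if the kernel lists a given sleep state.
#   $1 = state name, e.g. mem
#   $2 = state file (defaults to /sys/power/state)
supports_sleep_state() {
    state=$1
    file=${2:-/sys/power/state}
    [ -r "$file" ] || return 1
    # The file holds a space-separated list such as "standby mem disk".
    for s in $(cat "$file"); do
        [ "$s" = "$state" ] && return 0
    done
    return 1
}

# Example (as root; this suspends the machine immediately):
#   supports_sleep_state mem && echo mem > /sys/power/state
```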
Comment 54 Detlef Grittner 2007-12-10 20:16:36 UTC
Although it is true that LILO loads the kernel image on a plain VGA console, it switches to vga=0x317 before the init scripts run, i.e. after showing a VGA console with "loading OS...", the system switches to the green SUSE load screen with the progress bar. I think the line "vga = 0x317" in my LILO.conf (already attached) causes this.

Comment 55 Forgotten User ZhJd0F0L3x 2007-12-11 12:28:05 UTC
Steffen, can you enlighten us on comment #54, please?
Comment 56 Steffen Winterfeldt 2007-12-11 12:36:12 UTC
Which part of it is dark?
Comment 57 Jens Nixdorf 2007-12-11 16:43:14 UTC
New kernel, new behaviour (again): 

With kernel 2.6.22.13-0.3 and suspend-0.69.9-15 it seems to work here. No options are set in /etc/pm. I tried suspend to RAM five times from the "real OS", not from "init=/bin/bash", and it always resumed correctly. But don't halloo till you're out of the wood: following Murphy, it will stop working as soon as I click the "commit" button for this comment...

I will attach a new pm-suspend.log.

regards, Jens
Comment 58 Jens Nixdorf 2007-12-11 16:46:59 UTC
Created attachment 186951
pm-suspend.log with working resume (kernel 2.6.22.13-0.3) @ Toshiba A100
Comment 59 Forgotten User ZhJd0F0L3x 2007-12-11 17:08:26 UTC
(In reply to comment #56 from Steffen Winterfeldt)
> Which part of it is dark?

How does LILO apparently boot with vesafb, even though /proc/cmdline does not show any vga=$foo entry?
Comment 60 Steffen Winterfeldt 2007-12-11 17:38:28 UTC
Oh, that's trivial. The vga mode is not passed via cmdline (but written
into kernel setup block by the boot loader). That it shows up in cmdline
with (e.g.) grub is just a gimmick. (Note that 'vga' and 'append' are
separate options in lilo.conf.)
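
Because the video mode may not show up in /proc/cmdline, a bootloader-independent check is to ask the kernel which framebuffer drivers it has registered; on Linux these are listed in /proc/fb. A small sketch, where list_framebuffers is a made-up helper name and the "0 VESA VGA" line is the entry expected for vesafb:

```shell
# Hypothetical helper: print the framebuffer drivers the kernel has registered.
#   $1 = file to read (defaults to /proc/fb)
# Prints one driver name per line, or nothing if no framebuffer is registered.
list_framebuffers() {
    file=${1:-/proc/fb}
    [ -r "$file" ] || return 0
    # Each line looks like "<index> <driver name>", e.g. "0 VESA VGA".
    sed 's/^[0-9][0-9]* *//' "$file"
}

# Example:
#   list_framebuffers    # empty output suggests plain vgacon
```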
Comment 61 Dirk Engel 2007-12-13 21:04:32 UTC
I've figured out that my problem depends on nvidia driver. Hangs only occur if a GLX application is active; without one, s2ram works fine with grub.
Comment 62 Jens Nixdorf 2007-12-16 00:50:48 UTC
Some days later, still with kernel 2.6.22.13-0.3 and suspend-0.69.9-15, it fails again. Old behaviour, blank screen, keyboard dead, fan is running high. It works approx. 10 times vs 1 failure.

regards, Jens
Comment 63 Forgotten User ZhJd0F0L3x 2007-12-18 12:03:12 UTC
(In reply to comment #61 from Dirk Engel)
> I've figured out that my problem depends on nvidia driver.

NVIDIA driver -> WONT/CANTFIX, complain to NVIDIA
Comment 64 Forgotten User ZhJd0F0L3x 2007-12-18 12:06:24 UTC
(In reply to comment #62 from Jens Nixdorf)
> Some days later, still with kernel 2.6.22.13-0.3 and suspend-0.69.9-15, it
> fails again. Old behaviour, blank screen, keyboard dead, fan is running high.
> It works approx. 10 times vs 1 failure.

fglrx -> WONT/CANTFIX. Complain to ATI.

If there is anybody without proprietary drivers still seeing this bug, please reopen (or maybe better: just file a new one, since this one is already hard to understand and extract the relevant information from).

For those with proprietary drivers: please reproduce without (and then file a new bug), or complain to the driver authors.

Thanks for your cooperation.