Bug 113296

Summary: savagefb breaks APM suspend
Product: [openSUSE] SUSE LINUX 10.0
Component: X.Org
Status: RESOLVED FIXED
Severity: Normal
Priority: P5 - None
Version: Beta 3
Target Milestone: ---
Hardware: Other
OS: All
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Reporter: Thorsten Kukuk <kukuk>
Assignee: Stefan Dirsch <sndirsch>
QA Contact: E-mail List <qa-bugs>
CC: behlert, koenig, liml
Attachments: Picture of Kernel OOPs

Description Thorsten Kukuk 2005-08-26 12:48:59 UTC
IBM ThinkPad T22. acpi=off is necessary (and configured), otherwise the network
card does not work.
Using APM and starting suspend (in all variants, for example with apm -s or
apm -S or Fn keys or closing the lid) immediately crashes the machine
and you have to do a hard reboot.
This worked fine with all releases until 9.3.
Comment 1 Thorsten Kukuk 2005-08-26 12:59:45 UTC
Removing all modules except the minimal necessary 23 modules solves it.
Question is, which of the other modules break it. Somebody is testing it now
with binary search.
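
The binary search over the extra modules mentioned above could be sketched roughly as follows. This is only an illustration, assuming exactly one module is at fault and that modules can be loaded independently of each other; `find_culprit` and the `suspend_fails` callback (which would load the given modules on top of the known-good minimal set and report whether suspend crashes) are hypothetical names, not part of any existing tool.

```python
from typing import Callable, List, Sequence


def find_culprit(modules: Sequence[str],
                 suspend_fails: Callable[[List[str]], bool]) -> str:
    """Return the single module whose presence makes suspend fail.

    Assumption: exactly one culprit, and no dependencies between modules.
    """
    if not modules:
        raise ValueError("need at least one candidate module")
    candidates = list(modules)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Test only the first half on top of the known-good base set.
        if suspend_fails(half):
            candidates = half                     # culprit is in this half
        else:
            candidates = candidates[len(half):]   # culprit is in the rest
    return candidates[0]
```

In practice each call to `suspend_fails` means a reboot and a manual suspend attempt, so the benefit is that the number of test boots grows only logarithmically with the number of candidate modules.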
Comment 2 Stefan Behlert 2005-08-26 13:01:44 UTC
Ok. Would be good to know which module is responsible. 
Comment 3 Thorsten Kukuk 2005-08-26 14:28:45 UTC
Created attachment 47784
Picture of Kernel OOPs

The problem is the savagefb module; it oopses in
savagefb_suspend. Since most of the hardware is already shut down and I don't
have a serial console, here is a picture of the oops.
Comment 4 Thorsten Kukuk 2005-08-26 14:30:01 UTC
Removing the module does solve the problem for suspend. But only suspend to RAM
is possible; suspend to disk only produces a low-pitched beep and nothing
happens. Since there is no message, I don't know what blocks this request.
Unloading all possible modules does not help.
Comment 5 Pavel Machek 2005-08-26 21:57:20 UTC
I have seen some savagefb problems in another bugreport...

Has apm suspend-to-disk ever worked for you?
Comment 6 Pavel Machek 2005-08-26 22:00:24 UTC
[You could try vanilla kernel, to see what happens. IIRC it helped in the other
case. But my memory is *really* vague.]
Comment 7 Thorsten Kukuk 2005-08-26 22:02:29 UTC
Yes, it has always worked during the last 4 years that I have had this notebook.
Comment 8 Pavel Machek 2005-08-26 22:08:41 UTC
Okay, can you file a separate bug report for suspend-to-disk?

BTW you probably should be using swsusp; it works even with apm...
Comment 9 Pavel Machek 2005-08-26 22:32:06 UTC
*** Bug 106049 has been marked as a duplicate of this bug. ***
Comment 10 Forgotten User ZhJd0F0L3x 2005-08-27 08:41:19 UTC
using the "apm" command to trigger suspend is untested at best and unsupported
at worst.
Use
"powersave -m" to trigger "APM standby"
"powersave -u" to trigger "APM suspend"
"powersave -U" to trigger swsusp

i know that there are sometimes problems with invoking suspend via Fn-Fx
keycombos, especially as you can often define in the bios what they are actually
doing (suspend to disk or ram) but cannot determine via software what will be
invoked when issuing an APM suspend call to the BIOS.
But this worked pretty well on 9.3 on a T20 and a TP600, but i never tried
APM/BIOS suspend to disk since it is too slow to be useable and we had reliable
swsusp.
Comment 11 Forgotten User ZhJd0F0L3x 2005-08-27 08:43:10 UTC
btw: we had savagefb problems in 9.2/9.3, too, and the problem is that the
module is loaded at all. Who is loading this?

Thorsten, care to file a bug "savagefb loaded although it should not be"?
Comment 12 Thorsten Kukuk 2005-08-28 09:01:57 UTC
(In reply to comment #10)
> using the "apm" command to trigger suspend is untested at best and unsupported
> at worst.
> Use

Of course I tried them, too.

> "powersave -m" to trigger "APM standby"

Does not work as expected. But since swsusp works I don't care much about it.
More important is fixing the kernel oops.

> "powersave -u" to trigger "APM suspend"

Works fine as expected.

> "powersave -U" to trigger swsusp

Works fine, after fixing the broken config written by YaST2 (should be fixed now)
and killing knotify to allow unloading of the sound modules (bug reports exist
for both).
Comment 13 Pavel Machek 2005-08-28 13:42:15 UTC
So... what bugs are left?

powersave -m does not work, but no one cares because swsusp is better.

BTW broken suspend is *not* a critical error. It does not corrupt data. I'm not
even sure if broken APM counts as "minor".
Comment 14 Thorsten Kukuk 2005-08-28 19:12:42 UTC
That savagefb triggers a kernel OOPs on any suspend?
(In reply to comment #13)
> So... what bugs are left?

The main bug: nothing works if savagefb is loaded since it crashes
the kernel and no suspend is possible.

> powersave -m does not work, but no one cares because swsusp is better.

The other powersave methods crash the kernel, too.

> BTW broken suspend is *not* critical error. It does not corrupt data. 

Wrong. If you make a suspend call and the kernel crashes, you will lose
data or data could get corrupted.

> I'm not even sure if broken APM counts as "minor".

If something works for years and we break it, we have to fix it. Otherwise
customers will look for another distributor (maybe not so important for the box
product, but for enterprise).
Comment 15 Pavel Machek 2005-08-29 10:05:30 UTC
Is savagefb actually being used on your system?

As to severity; any suspend bug can cause as bad data loss as poweroff. That
would make pretty crappy severity ratings. So it is "normal" unless it resumes
okay and corrupts something in the process. 

I doubt customers care about APM; enterprise customers suspending their servers?
If we really want fixed APM, we need someone to work on it.
Comment 16 Thorsten Kukuk 2005-08-29 11:06:09 UTC
Pavel, I don't care about APM or broken hardware. In this case we know that the
hardware is not broken and that the problem is our kernel, where savagefb OOPses
if you try any suspend method.

About enterprise customers: enterprise customers do suspend their notebooks.
Just look at the feature document for SLES9 to see how many requests there are.
Enterprise customers do install SLES on their notebooks.

I don't know if this module is necessary or why it is loaded at all. I haven't
seen that it is in use. I don't care about this. I don't even care about this
extra thread about suspending to disk with APM, which should have been handled
in a separate bug report.

We need to fix the fact that our kernel in our default configuration OOPs if you
call any suspend method, by not loading this module or by fixing the module.
Comment 17 Pavel Machek 2005-08-29 11:19:07 UTC
Unless you have a Savage graphics card, savagefb should not have been loaded in
the first place. I'm not sure who can fix that one.
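
One rough way to see whether savagefb is loaded, and whether anything currently holds a reference to it, is to read /proc/modules, whose first three whitespace-separated fields are the module name, its size, and its use count. The sketch below is only an illustration; the helper name `module_use_counts` is hypothetical, and a use count of 0 is a hint rather than proof that the driver is idle.

```python
from typing import Dict, Optional


def module_use_counts(proc_modules_text: str) -> Dict[str, Optional[int]]:
    """Map each loaded module name to its use count.

    Expects text in the /proc/modules layout: name, size, use count, ...
    The use count is None when the kernel prints "-" instead of a number.
    """
    counts: Dict[str, Optional[int]] = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, refcount = fields[0], fields[2]
        counts[name] = int(refcount) if refcount.isdigit() else None
    return counts
```

On a running system the input would come from something like `open("/proc/modules").read()`; if `"savagefb"` is a key in the result, the module is loaded.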
Comment 18 Harald Koenig 2005-08-29 12:22:34 UTC
(In reply to comment #15)

I seem to have the same problem with my Toshiba Portege 3480 (S3 Savage MX).
see bug #113812.

> Is it actually being used on your system?

not that I'd know why -- but it got loaded.  I'm booting with "vga=normal" but
later in boot the console gets changed from 80x25 to 100x37.

renaming the kernel module and rebooting fixed the APM problem -- now "apm -s"
suspend/resume works again!!

> As to severity; any suspend bug can cause as bad data loss as poweroff. That
> would make pretty crappy severity ratings. So it is "normal" unless it resumes
> okay and corrupts something in the process. 

oh yes, this is a _severe_ bug.  first, it can trash data as Thorsten already
mentioned.
2nd, this renders my notebook useless because I can't (read: do _not_ want to;)
stop/start all apps over and over.  usually my Toshiba has uptimes of many
weeks, depending on how much stuff i "play" with;-)

> I doubt customers care about APM; enterprise customers suspending their servers?
> If we really want fixed APM, we need someone to work on it.

pls think about company notebooks too, not only big servers care!
Comment 19 Stefan Behlert 2005-08-29 13:07:06 UTC
Reassigned to X.Org-guys.
Comment 20 Stefan Dirsch 2005-08-29 13:19:58 UTC
Automatic loading of savagefb has been fixed by adding it to
/etc/hotplug/blacklist.
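
To verify that such a blacklist entry is actually in place, a small check along the following lines could be used. It is a sketch only, assuming the blacklist file holds one module name per line with `#` starting a comment; the helper name `blacklisted_modules` and the sample contents in the usage note are made up for illustration.

```python
from typing import Set


def blacklisted_modules(blacklist_text: str) -> Set[str]:
    """Return the module names listed in a hotplug-style blacklist.

    Assumption: one module name per line, '#' starts a comment.
    """
    names: Set[str] = set()
    for line in blacklist_text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry:
            names.add(entry)
    return names
```

For example, `"savagefb" in blacklisted_modules(open("/etc/hotplug/blacklist").read())` would tell whether the entry is present on a system that uses this file.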
Comment 21 Forgotten User ZhJd0F0L3x 2005-08-29 13:21:38 UTC
the problem will occur with other *fb drivers, too. Bug #113607
Comment 22 Stefan Dirsch 2005-08-29 15:39:34 UTC
After reading this bug report I have come to the conclusion that fixing the
savagefb issue should be enough. This has been done. Closing as FIXED.
Comment 23 Thorsten Kukuk 2005-08-29 19:33:41 UTC
I would say putting the driver on the blacklist is the wrong fix for the kernel
Oops. I send a correct fix for this to the kernel list.

Fixing the driver takes me (a non-kernel-developer) less time than this
bug report ...