Bug 137459

Summary: kernel oops when selecting AggressivePowersave
Product: [openSUSE] SUSE Linux 10.1
Reporter: Forgotten User OS1JNCFbCX <forgotten_OS1JNCFbCX>
Component: Kernel
Assignee: Holger Macht <hmacht>
Status: VERIFIED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: dkukawka, trenn
Version: Alpha 3
Target Milestone: ---
Hardware: i686
OS: SUSE Other
Whiteboard:
Found By: Beta-Customer
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Forgotten User OS1JNCFbCX 2005-12-07 21:15:16 UTC
I found that kpowersave has a new power management scheme, "AggressivePowersave". When I select this scheme on my ThinkPad T42p, the kernel oopses as follows:

ACPI: PCI interrupt for device 0000:00:1f.5 disabled
ACPI: PCI interrupt for device 0000:02:01.0 disabled
They asked me for state 3
------------[ cut here ]------------
kernel BUG at drivers/pci/pci.c:403!
invalid operand: 0000 [#1]
Modules linked in: sg st sd_mod sr_mod scsi_mod ath_pci aes wlan_ccmp af_packet ipt_pkttype ipt_LOG ipt_limit cpufreq_ondemand cpufreq_userspace cpufreq_powersave speedstep_centrino freq_table edd ibm_acpi button battery ac ip6t_REJECT ipt_REJECT ipt_state iptable_mangle iptable_nat ip_nat iptable_filter ip6table_mangle ip_conntrack nfnetlink ip_tables ip6table_filter ip6_tables ipv6 wlan_scan_sta pcmcia firmware_class ath_rate_sample wlan ath_hal snd_intel8x0 snd_ac97_codec e1000 snd_ac97_bus yenta_socket rsrc_nonstatic pcmcia_core snd_pcm uhci_hcd ehci_hcd snd_timer snd shpchp usbcore intel_agp soundcore pci_hotplug snd_page_alloc generic agpgart i8xx_tco i2c_i801 i2c_core tsdev joydev ide_cd cdrom dm_mod parport_pc lp parport reiserfs fan thermal processor piix ide_disk ide_core
CPU:    0
EIP:    0060:[<c01f46db>]    Tainted: P     U VLI
EFLAGS: 00010282   (2.6.14.2-2-default)
EIP is at pci_choose_state+0x4b/0x60
eax: 0000001d   ebx: 00000003   ecx: ffffffff   edx: 0006e2d2
esi: dff4c000   edi: dff4c000   ebp: 00000002   esp: f5bc1ef0
ds: 007b   es: 007b   ss: 0068
Process powersaved (pid: 3743, threadinfo=f5bc0000 task=f5b07030)
Stack: c031d622 00000003 dfc57260 dfc57000 f9d558da 00000003 dff4c044 00000000
       caef9300 00000003 c01f6403 c0252d46 dff4c108 dff4c044 00000003 caef9300
       e9574380 c0253310 dff4c044 00000001 c025340e cc8e0001 c02533e0 c024e6ca
Call Trace:
 [<f9d558da>] e1000_suspend+0x10a/0x250 [e1000]
 [<c01f6403>] pci_device_suspend+0x13/0x20
 [<c0252d46>] suspend_device+0xb6/0xc0
 [<c0253310>] dpm_runtime_suspend+0x30/0x60
 [<c025340e>] state_store+0x2e/0x50
 [<c02533e0>] state_store+0x0/0x50
 [<c024e6ca>] dev_attr_store+0x1a/0x20
 [<c018f1ad>] flush_write_buffer+0x1d/0x30
 [<c018f1fd>] sysfs_write_file+0x3d/0x70
 [<c018f1c0>] sysfs_write_file+0x0/0x70
 [<c0157859>] vfs_write+0x99/0x160
 [<c01579cc>] sys_write+0x3c/0x70
 [<c0102ddb>] sysenter_past_esp+0x54/0x79
Code: 89 f0 ff d1 85 c0 78 02 89 c3 83 fb 00 75 05 31 c0 5b 5e c3 7c 0a b8 03 00 00 00 83 fb 02 7e f1 53 68 22 d6 31 c0 e8 25 7d f2 ff <0f> 0b 93 01 3e d6 31 c0 31 c0 5e 5a 5b 5e c3 8d b6 00 00 00 00

If you need detailed hardware information drop me a note.
Comment 1 Greg Kroah-Hartman 2005-12-08 01:57:38 UTC
Is this oops at boot time, or after things have been running?

It looks like you were trying to suspend the laptop.  Without the different powersave method, does suspend work properly?
Comment 2 Forgotten User OS1JNCFbCX 2005-12-08 10:04:29 UTC
As I wrote above, this did not happen at boot but when I selected the "AggressivePowersave" scheme in kpowersave (the KDE powersave control applet). I cannot reproduce it at the moment. I did not actively invoke any suspend mode when the problem occurred.

Suspend-to-disk works without any problem on that system. I haven't recently tried suspend-to-ram, but it used to work some weeks ago. If it is important, I can try that as well.

Since the problem does not seem to be reproducible, I am not sure whether it makes sense to do anything about it.
Comment 3 Olaf Kirch 2005-12-08 12:09:13 UTC
Pavel, can you look into this please? Thanks!
Comment 4 Pavel Machek 2005-12-08 13:45:07 UTC
No, it is not suspend. Something is trying to suspend e1000 at runtime. We should not do that, really. Userspace should be fixed not to try doing runtime suspend. The interface is likely to change, and the code is not even close to working.
Comment 5 Holger Macht 2005-12-08 14:25:38 UTC
Sorry, but we already implemented it in the powersave daemon and tested it on several machines and never got an oops. And it can save up to one hour when running on battery.

The new scheme AggressivePowersave will disappear in future versions, and a new one for experimental settings will be created that also includes runtime device power management, but the user will be explicitly warned about the possible danger.

There are a few new commandline options for the powersave user binary to test the new feature. With powersave version 0.10.21 these are:

Output of 'powersave -h':
[...]
 Runtime powermanagement: (experimental)
   -i --rtpm-suspend <device class>       set devices of a device class into D3 power save mode
   -I --rtpm-resume <device class>        set devices of a device class back to D0 power mode
   -j --rtpm-device-number <x>            only suspend/resume device number <x>
   -J --rtpm-get-devices <device class>   List devices from a specific device class
   -o --rtpm-get-classes                  List available device classes
[...]

If you like, you can prevent the network interface from being shut down by removing 'lan' from DPM_DEVICES in /etc/powersave/scheme_aggressive_powersave.
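
The edit described above could be scripted roughly as follows. This is only a sketch: it assumes the scheme file uses a shell-style `DPM_DEVICES="..."` line with space-separated device classes, which this report does not confirm. The helper name `remove_dpm_device` and the class names other than 'lan' in the sample are hypothetical.

```python
import re


def remove_dpm_device(config_text, device_class):
    """Return config_text with device_class dropped from the DPM_DEVICES value.

    Assumes a shell-style line of the form DPM_DEVICES="a b c" (an assumption,
    not confirmed by the report). Other lines are returned unchanged.
    """
    def rewrite(match):
        # Keep every listed class except the one being removed.
        kept = [name for name in match.group(2).split() if name != device_class]
        return match.group(1) + '"' + " ".join(kept) + '"'

    return re.sub(r'^(DPM_DEVICES=)"([^"]*)"', rewrite, config_text,
                  flags=re.MULTILINE)


if __name__ == "__main__":
    # Hypothetical file content; only 'lan' is named in the report.
    sample = 'SOME_OTHER_SETTING="yes"\nDPM_DEVICES="lan sound"\n'
    print(remove_dpm_device(sample, "lan"))
```

To apply it for real, the text would be read from and written back to /etc/powersave/scheme_aggressive_powersave as root, and the powersave daemon would presumably need to reload its configuration.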

Additionally, many of those settings and variables have changed in powersave 0.11.0 (not public yet).
Comment 6 Holger Macht 2005-12-08 14:27:08 UTC
If the problem is reproducible and persists, we could blacklist modules/devices such as the e100, like we do for suspend to disk/ram.
Comment 7 Pavel Machek 2005-12-08 15:51:11 UTC
Kernel support for this is incomplete. It is probably missing critical locking, and the interface is incomplete/broken.

Now... it is possible that it happens to work with some drivers, as long as you stop the driver when it is idle and do not try to use it afterwards. [Could you try using the device after you turned it off on some of your test systems?]

I'm afraid that this is going to cause bug reports all around the kernel. I'd prefer us not doing this, not even as an experimental option. If you still want to do that, write some message to syslog ("Unsupported runtime power management on."). It would be nice to taint the kernel so that we don't have to deal with all the bug reports. It can cause problems even after you turn it off, or one hour after you enabled it.
Comment 8 Pavel Machek 2005-12-08 15:56:08 UTC
Oh, BTW, a sensible way to do this is to unload modules that are not in use and make sure the device is powered down (PCI_D3hot or something) in the module remove function.
Comment 9 Holger Macht 2005-12-09 08:43:10 UTC
One of our aims is exactly what you are afraid of ;-) Finding broken modules/drivers and getting them fixed. And to be honest, I can't understand why this implementation is even in the kernel and not disabled if it is so broken and risky to use.

A message in syslog would be good, indeed. And we will also warn the user with a big popup through our clients (kpowersave) that he is trying to use an experimental feature.

We tried this feature on a lot of machines and never got serious problems. We also tried to access the device after suspending and did not notice serious problems. For the sound device, for example, if it is suspended, an 'echo foo > /dev/dsp' blocks until the device is resumed. With a network device using the 8139too module, the network is unavailable while suspended, and at least NetworkManager reinitializes it on resume. Works well.

But I am aware that our small test environment cannot cover the real world. Therefore I released packages on sourceforge to also reach other systems and machines. And this bug report is the first outcome. I would propose leaving it integrated for now and waiting for further user experiences to come in. If we get too many problems, we can disable this feature by default or even remove it.
Comment 10 Forgotten User ZhJd0F0L3x 2005-12-09 08:58:23 UTC
Regarding comment #8: you cannot unload modules and reload them on demand when some "pling" wants to access the soundcard, but you can come out of runtime-suspend on demand.
We know that proper runtime power management probably belongs in the kernel. But it will take years until we get that. We get 10-20% extra battery time with what we have now, and it works on some machines. We expose the drivers' rarely used codepaths and we will discover driver bugs. This is good IMO, because we would hit the same bugs later, when the policy is put in the kernel, so if we find them now, it will all work better in the future. Also, with some experience from using this early stuff, maybe some design deficiencies can be detected and addressed before the future in-kernel architecture is set in stone.

Spewing out dangerous-looking warnings is fine, and even tainting the kernel. But how do I do it? Can I just echo to /proc/sys/kernel/tainted?
And which taint do I choose? Maybe a "forced module load" taint is appropriate since it is in the same class of "unsupportedness"?
Comment 11 Pavel Machek 2005-12-14 11:06:14 UTC
Unfortunately /proc/sys/kernel/tainted is read-only :-(.

You are not using "rarely used codepaths", you are using "codepaths that are currently not used at all". And it does not help the future much, because runtime power management will probably need to be redesigned before doing it "for real".
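
Although the taint value cannot be set from userspace here, it can still be read back and decoded. A small sketch, assuming the mainline bit assignments (bit 0 "P", bit 1 "F", bit 2 "S", bit 3 "R"); these are not taken from this report, the helper names are made up for illustration, and the "U" flag shown in the oops above is not covered by this table.

```python
# Bit positions are assumed from mainline kernel documentation,
# not from this bug report.
TAINT_BITS = [
    (0, "P", "proprietary module was loaded"),
    (1, "F", "module load was forced"),
    (2, "S", "SMP kernel on hardware not certified for SMP"),
    (3, "R", "module unload was forced"),
]


def decode_taint(mask):
    """Return a description for every known taint bit set in mask."""
    return [letter + ": " + meaning
            for bit, letter, meaning in TAINT_BITS
            if mask & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint mask as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    try:
        print(decode_taint(read_taint()))
    except OSError as error:
        # Not running on Linux, or /proc is not mounted.
        print("could not read taint mask:", error)
```

Under these assumptions, the "forced module load" taint proposed in comment 10 would correspond to bit 1.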
Comment 12 Pavel Machek 2005-12-14 11:14:18 UTC
One more comment... this should probably be developed on mainline. Patrick Mochel probably knows more about this code than me. Could you make some announcement of aggressive powersave on l-k and linux-pm@lists.osdl.org, and see what feedback it generates? This interface is also likely to change in the future. (Currently it is 0/3 IIRC, a caricature of PCI_Dx states).

Comment 13 Holger Macht 2005-12-15 10:30:53 UTC
Will do so...
Comment 14 Holger Macht 2005-12-15 15:27:27 UTC
BTW: I can reproduce the oops with the e100 module.
Comment 15 Pavel Machek 2005-12-15 18:45:04 UTC
It crashed with ipw2200... if I echo -n 3 > .../power/state. (That's equivalent to what aggressive powersave is, right?)
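
The call trace in the description suggests that the value written to power/state is passed down to the driver's suspend callback, and the "They asked me for state 3" line suggests that the PCI helper has no mapping for it. A minimal user-space model of that check, reconstructed from the log message and the oops rather than copied from kernel source; the PM_EVENT_* and PCI_D* values are assumptions, and `choose_state` and `KernelBug` are illustrative names only.

```python
# User-space model of the state check the oops points at.
# Assumed values (not taken from this report):
PM_EVENT_ON = 0
PM_EVENT_FREEZE = 1
PM_EVENT_SUSPEND = 2

PCI_D0 = 0
PCI_D3HOT = 3


class KernelBug(Exception):
    """Stands in for the kernel's BUG() macro."""


def choose_state(event):
    """Map a requested PM event to a PCI power state, or raise like BUG()."""
    if event == PM_EVENT_ON:
        return PCI_D0
    if event in (PM_EVENT_FREEZE, PM_EVENT_SUSPEND):
        return PCI_D3HOT
    # Any other value has no mapping; the real code logs and hits BUG().
    raise KernelBug("They asked me for state %d" % event)


if __name__ == "__main__":
    for requested in (0, 2, 3):
        try:
            print(requested, "->", choose_state(requested))
        except KernelBug as bug:
            print(requested, "-> BUG:", bug)
```

In this model, the 3 written to power/state is taken as an event number rather than as "D3", which would explain why a driver that passes it on unchanged ends up in the BUG path.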
Comment 16 Forgotten User ZhJd0F0L3x 2005-12-16 10:21:50 UTC
It did not crash with the ipw2200 from 10.0. The ipw2200 from 2.6.15 is seriously broken anyway, so it might well be broken from a power management POV, too.
Comment 17 Holger Macht 2006-03-03 14:34:24 UTC
Stefan, can you please check whether it still crashes with Beta6?
Comment 18 Holger Macht 2006-03-13 20:36:23 UTC
I am closing the bug for now. Please reopen if the problem persists.
Comment 19 Holger Macht 2006-03-27 09:17:47 UTC
Forgot to close.