|
Bugzilla – Full Text Bug Listing |
| Summary: | thermal management prefers to throttle even if ALLOW_THROTTLING="no" | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE Linux 10.1 | Reporter: | Dirk Mueller <dmueller> |
| Component: | Kernel | Assignee: | Thomas Renninger <trenn> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | ||
| Version: | Alpha 3plus | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | Other | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | the patch | ||
|
Description
Dirk Mueller
2005-12-12 10:04:15 UTC
Arggh, this probably was the SL10.0-RCx fix thermal passive cooling patch that also went mainline.
IIRC, I also tested the throttling case, but it seems one if/else path still is wrong ...
Dirk, you also know the code very well, could you have a look at it, I don't know what to do first at the moment and this one is really critical.
Ohh, you are the reporter..., so it worked for 10.0, but it does not for 10.1? That's strange...
> even worse, throttling is disabled in the performance scheme of powersave.
This is done by kernel, no influence possible, still default behaviour should be if the passive trip point is exceeded: first cpufreq, then throttling
and if the passive trip point is sub-ceeded: first go up with throttling again, then cpufreq.
hmm, ok, I've looked into it a bit. The good news is: there is no bug. The bad news is: That behaviour is suboptimal. The thermal code tries to throttle first and then later tries to reduce cpu frequency it seems (BTW, I was unable to find out if your ondemand-kernel-handler-always-caches-cpufreq-policy patch is still in our suse tree (couldn't figure out how to access kerncvs) or not). Anyway, I think this behaviour is suboptimal, because throttling only leads to linear heat reduction, while speed stepping leads to quadratic heat reduction. If I run compilation jobs, it before reduced the cpu frequency to 1.2GHz after a while and ran stable (without overheating) with that. Now it stays at 1.7GHz and throttles down by 93%, which leads to a system that is as slow as a 486 without turbo button, but still scratches the overheating limit all the time). I couldn't find documentation though.. is it "faster" for the cpu to reduce throttling or stepping up the speed? which method has more overhead? It seems to me it should prefer whatever can react quicker to changes. ok, there could be one bug: even though it throttles down maximum, it never actually speed steps down (even though it should, because it is above the configured trip point). will have a look at this maybe tonight. Frequency scaling must be the preferred and the first method to cool down. Tell me if you found anything, I have a machine with throttling and cpufreq ATM and could also have a look. I could reproduce this..., the behaviour changed to older kernels and there must definitley be a bug somewhere. I haven't looked more closely yet, but indeed, rereading the source confirms that it should first speed-step and later throttle. adding some more printk's.. the:
if (!cpu_has_cpufreq)
checks in acpi_thermal_cpufreq_increase/decrease (processor_thermal.c)
must be
if (cpu_has_cpufreq)
This was the change in mainline (reworked by someone) that I think causes the grief (have a closer look to changing "!"):
static int cpu_has_cpufreq(unsigned int cpu)
{
struct cpufreq_policy policy;
- if (!acpi_thermal_cpufreq_is_init)
- return -ENODEV;
- if (!cpufreq_get_policy(&policy, cpu))
+ if (!acpi_thermal_cpufreq_is_init || cpufreq_get_policy(&policy, cpu))
return -ENODEV;
return 0;
Am I right, or have i overseen something?
I am in a hurry currently, I come up with a patch tomorrow.
BTW: I am in a real hurry... not tested yet, only assumptions..., bye. indeed, that seems to work. I decided to fix the return values instead, which seems to make the stuff more logical to me (all callers of the static expect the inverse meaning) Created attachment 61508 [details]
the patch
Thanks. Patch retestet, added to stable and applied for mainline. Dirk: I forget to mention you, wasn't meant to, I just forgot it, sorry. No problem, I already posted the stuff to acpi-devel@ last night: http://sourceforge.net/mailarchive/forum.php?thread_id=9289494&forum_id=6102 Though, no reaction so far. |