Bugzilla – Bug 98178
thermal management not working
Last modified: 2005-10-06 17:05:20 UTC
with dynamic / kernel ondemand frequency scaling, acpi thermal management no longer works. the cpu does not properly slow down when the thermal_zone is beyond the passive trip point, leading to the temperature becoming critical, and an instant lockup (sometimes not even critical shutdown has time to react..). this worked fine with 9.3.
# cat trip_points
critical (S5): 94 C
passive: 80 C: tc1=8 tc2=5 tsp=600 devices=0xdfffedc0
temperature easily reaches 95°C after a few minutes of compiling and the cpu scaling does not go down. scaling governor is properly set to kernel / ondemand.
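For reference, a minimal sketch of how the trip_points output above could be picked apart, to make the headroom between the passive and critical trip points explicit. The function name and the dictionary keys are hypothetical, and the sketch assumes that tsp is given in tenths of a second, as ACPI defines _TSP.

```python
import re

def parse_trip_points(text):
    """Pull trip-point temperatures and passive-cooling parameters out of the text."""
    info = {}
    critical = re.search(r"critical \(S5\):\s*(\d+) C", text)
    if critical:
        info["critical_c"] = int(critical.group(1))
    passive = re.search(r"passive:\s*(\d+) C", text)
    if passive:
        info["passive_c"] = int(passive.group(1))
    # tc1/tc2 are the passive-cooling constants; tsp is the sampling period.
    for key, value in re.findall(r"\b(tc1|tc2|tsp)=(\d+)", text):
        info[key] = int(value)
    if "tsp" in info:
        # Assumption: tsp is expressed in tenths of a second (ACPI _TSP).
        info["tsp_seconds"] = info["tsp"] / 10
    return info

sample = "critical (S5): 94 C passive: 80 C: tc1=8 tc2=5 tsp=600 devices=0xdfffedc0"
info = parse_trip_points(sample)
# Degrees between "start throttling" and "shut down".
headroom_c = info["critical_c"] - info["passive_c"]
```

With the values reported here that leaves 14 degrees of headroom, and a passive-mode sampling period of 60 seconds.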
it works fine when using powersave from 9.3.. cc'ing trenn@suse.de
Puhhh, can I have access to this machine?
sure.. do you need physical access? I'm currently not in Nuernberg.. but if just ssh is needed, you can ping me and then you can ssh dmueller.home.suse.de
Something is wrong with your fans (are you sure they are clean and air can run through?). I couldn't find any obvious ACPI errors... With the userspace governor the machine stays at 1200 MHz and bounces around 80C (passive trip point), that's how it should be. However, passive cooling seems to be broken with current kernels and the ondemand governor (at least with the speedstep-centrino module). Venkatesh: Do you know about ondemand and passive cooling problems?
Hmmm. This seems totally strange. AFAIK there is no direct relation between cpufreq governor and passive trip point/fan control code. I will take a closer look at the passive cooling code...
the fan is working, it's just that it's a regular fan and not a jet engine ;) Anyway, I figured a bit more out. http://www.thinkwiki.org/ is an excellent resource for this. Part a) of the problem is that SaX2 disables frequency scaling on my Radeon graphics processor, which causes it to severely overheat (over 100°C after a few hours). Fixing that improves the situation. However, a "while true; do true; done" bash loop with the ondemand frequency scaling governor will still overheat the machine. this is bad. I've tried the acpi-cpufreq driver, which works once you disable powersave and load it manually (it seems powersave prefers the centrino speedstep driver even if told otherwise.. that one cannot be unloaded anymore once it is loaded). However it doesn't make a difference. Anyway, I've read a bit of kernel source and couldn't figure out where the passive cooling code is going to interact with the frequency scaling code.. it seems they don't, so this is why the overheating happens.
If I understand this issue correctly, powersaved is doing something else for active cooling apart from running the userspace cpu frequency scaling. Things work fine when powersaved is used. And once you disable powersaved completely and use the ondemand governor, the CPU overheats. Right? As you mentioned, the kernel ondemand governor only controls the CPU frequency and does nothing with respect to cooling. Can you run powersaved without the userlevel cpu frequency governor (so that the powersaved cooling component works) and then run the kernel ondemand governor? This should work fine too.
yes, when powersaved from 9.3 is used (which used userspace frequency scaling that somehow respected passive cooling). powersave from 10.0 however defaults to kernel space frequency scaling. running powersave with the kernel ondemand governor fails, because powersave then does nothing at all, and the kernel based governor is causing the machine to run at full performance all the time.
These are the possible options I see here:
  1) Use the current powersaved logic of monitoring the temperature and setting the frequency, and use it to control the max frequency for the kernel ondemand governor. Then the kernel governor can look at the utilization and change the frequency within the max-min limit set by the user.
  2) Incorporate the temperature knowledge in the ondemand governor and make it set the frequency based on temperature as well.
  3) Have a new lightweight userlevel daemon (or kernel driver) which does this temperature monitoring.
  From my point of view 2 is tougher than the other two options. Anyway, before I can comment more, I need to look at the powersave src rpm from SUSE 9.3 and understand how it does the temperature monitoring.
Sorry, both are wrong: Powersaved does not control passive cooling at all, this is all done by the kernel. You can overwrite the (passive, ...) trip points, that's all. There were some issues some kernel versions ago, but the passive cooling implementation in the kernel seems to work really great now with the userspace governor and I think the ondemand/other governors should use it as well. Teaching the ondemand governor passive cooling should be easy. The min/max limits seem to be modified by the processor_thermal.c code through pr->limit.thermal.px and pr->limit.state.px. I can't currently see how this is related to (userspace_cpufreq):
    cpu_set_freq[cpu] = freq;
    if (freq < cpu_min_freq[cpu])
        freq = cpu_min_freq[cpu];
    if (freq > cpu_max_freq[cpu])
        freq = cpu_max_freq[cpu];
    ret = __cpufreq_driver_target(&current_policy[cpu], freq, CPUFREQ_RELATION_L);
  but it seems to work. Dominik probably laid his hands on this code, we should ask him how to do it right. I could imagine it's only some lines of code to add to the ondemand governor. I can have a look at it in the beginning of next week, currently I have to take care of the SL 10.0 Beta1... Summary: I think we should go for 2), where we don't have to teach ondemand a thermal policy, but just have to make use of the already implemented stuff in processor_thermal.c.
OK. Here goes my next theory (after poking a bit into processor_thermal.c). processor_thermal.c reduces the cpu frequency both in case of userlevel and kernel ondemand policy. It seems to have a timer running every 0.1 second to watch the temperature and take some passive cooling action if required. This is what happens:
  1) With userlevel policy: the userlevel policy increases the frequency, and processor_thermal.c checks the temperature and reduces the freq. Userlevel comes back after 2 or more seconds (depending on the period) and increases the freq. But thermal.c almost immediately reduces the freq. So, most of the time we run at the lower freq and everything is fine.
  2) With kernel policy: we poll very frequently (10 or 100 times a second) and try to increase the frequency based on utilization. And thus we increase the frequency as soon as processor_thermal.c reduces the freq. In effect we run at the highest freq all the time.
  We need to have a flag in cpufreq which indicates the CPU is being managed by thermal.c and no other governor can change the frequency during this time. I don't see one now. I can prototype a patch for it. In the meantime, you can try to confirm/deny this theory: use the kernel ondemand governor and change the polling interval to something high by using the sysfs tunables. Something like this:
    cd /sys/devices/system/cpu/cpu0/cpufreq/ondemand    (this dir appears only while using the kernel ondemand governor)
    cat sampling_rate_max    (some number)
    echo (some_number - 1) > sampling_rate
  And see whether this solves the overheating.
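The theory above can be illustrated with a toy model (a sketch, not the kernel's behaviour). Assumptions: one tick is 10 ms, the load is 100% and the zone stays above the passive trip point for the whole run, the thermal check throttles to the minimum, the governor requests the maximum, and when both fire in the same tick the thermal check is processed last. The function name and parameters are hypothetical.

```python
def share_at_max(governor_period, thermal_period, duration):
    """Fraction of ticks spent at max frequency. All arguments are in ticks."""
    at_max = 0
    freq_is_max = True
    for tick in range(duration):
        if tick % governor_period == 0:
            # Governor sees full load and asks for the highest frequency.
            freq_is_max = True
        if tick % thermal_period == 0:
            # Thermal timer sees the zone is still hot and throttles.
            freq_is_max = False
        if freq_is_max:
            at_max += 1
    return at_max / duration

# 60 simulated seconds; thermal timer every 0.1 s (10 ticks) in both runs.
# Kernel-style governor sampling 100 times a second (every tick).
kernel_share = share_at_max(governor_period=1, thermal_period=10, duration=6000)
# Userlevel-style governor coming back every 2 s (200 ticks).
userspace_share = share_at_max(governor_period=200, thermal_period=10, duration=6000)
```

Under these assumptions the fast-sampling governor undoes each throttle on the very next tick, while the slow one leaves the throttled state in place between its visits.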
You're right, when I raise ondemand sampling rate over the thermal sampling rate (which seems to default to 5s for my bios), the thermal management works more or less perfectly. I guess a better way would be to dynamically lower the max frequency that can be set based on thermal decisions, so that ondemand doesn't flap between min and max all the time..
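A minimal sketch of that idea, assuming a thermal cap that moves one step per thermal check and a governor that only picks frequencies at or below the cap. The frequency table, the 80% load threshold and both function names are hypothetical and only serve the illustration.

```python
FREQS_MHZ = [600, 800, 1000, 1200, 1500, 1700]  # hypothetical P-state table

def update_cap(cap_index, temp_c, passive_c):
    """Lower the cap one step while hot, raise it one step once cool."""
    if temp_c >= passive_c:
        return max(cap_index - 1, 0)
    return min(cap_index + 1, len(FREQS_MHZ) - 1)

def governor_target(load_percent, cap_index):
    """Ondemand-like choice: max under heavy load, else min, clamped to the cap."""
    wanted = len(FREQS_MHZ) - 1 if load_percent > 80 else 0
    return FREQS_MHZ[min(wanted, cap_index)]

cap = len(FREQS_MHZ) - 1
cap = update_cap(cap, temp_c=85, passive_c=80)   # hot: cap drops one step
first = governor_target(100, cap)
cap = update_cap(cap, temp_c=85, passive_c=80)   # still hot: cap drops again
second = governor_target(100, cap)
cap = update_cap(cap, temp_c=75, passive_c=80)   # cooled: cap recovers one step
third = governor_target(100, cap)
```

Because the governor never sees more than the capped maximum, it steps down and back up gradually instead of flapping between the lowest and highest frequency.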
I tried to reproduce the problem on my laptop with a 2.6.13-rc5 kernel. But on my system passive cooling seems to work fine with the ondemand governor. On my system the original passive cooling temperature was 90C and the critical shutdown temp was 93C. I set the passive cooling temperature to 60C and the scaling_max_freq came down from 1.5G to 1.2G and moved around between 0.9-1.2-1.5 G with the temperature varying from 55 to 69C during the run. The system ran at this stage for more than 30 minutes. I did observe the CPU running at max 1.5G when the temp was 68C, but eventually it came down to 1.2G and the temp dropped off. This is a potential issue with the original passive setting of 90C and critical shutdown of 93C on my laptop. If the temp can exceed the passive temp by that much, without any reduction in freq, that can be a cause of concern. Can you point me to the SUSE 10 kernel that had the problem? I can try running that kernel on my system and check whether it behaves any differently.
ftp://ftp.suse.com/pub/people/mantel/kotd/i386/HEAD/kernel-default.rpm Sorry for the delay. Dirk, could you help Venkatesh to nail this problem down, please. I get crushed with work ... If you think this problem can be worked around reliably with ondemand config variables, maybe we should go that way first, the last Beta deadlines are near ... Trying to make this bug publicly available and to point Dominik to this one.
Venkatesh, it happens with all kernels between preview1 and beta1 (beta2 not tested yet, maybe later tonight). which laptop do you use? I also have 90 and 93°C default trip points here, and the temperature is quickly exceeded under 100% load. I believe the problem is this:
    $ cat polling_frequency
    polling frequency: 5 seconds
  sometimes the temperature is 89°C, leading to no throttling, and 5s later the temperature is > 95°C, the critical shutdown temperature. reducing the polling frequency to 1 s fixes the issue for me as well.
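A back-of-the-envelope sketch of the numbers above. It assumes the temperature rises linearly between two thermal checks and ignores how long the CPU takes to cool once throttled; the heating rate is only a lower bound, since the second reading was "more than 95", which makes the computed safe interval an upper bound. Function names are hypothetical.

```python
def worst_case_peak(passive_c, rate_c_per_s, poll_s):
    """Highest temperature reachable before the next check notices the trip point."""
    return passive_c + rate_c_per_s * poll_s

def max_safe_poll(passive_c, critical_c, rate_c_per_s):
    """Longest polling interval that still catches the rise before critical."""
    return (critical_c - passive_c) / rate_c_per_s

# 89 C to at least 95 C within one 5 s polling interval.
rate = (95 - 89) / 5

peak_5s = worst_case_peak(passive_c=90, rate_c_per_s=rate, poll_s=5)
peak_1s = worst_case_peak(passive_c=90, rate_c_per_s=rate, poll_s=1)
limit_s = max_safe_poll(passive_c=90, critical_c=93, rate_c_per_s=rate)
```

With trip points of 90 and 93 C, a 5 s interval lets the temperature run past critical before the next check, while a 1 s interval stays below it in this crude model.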
I am using my Thinkpad T40 for experiments. The temperature doesn't go so high in my case. So, I have reduced the passive cooling temp to 60C. I also think that the problem here is the polling frequency of the thermal driver. When set to 60C, I have seen the temperature growing up to 69C before the thermal driver kicks in and reduces the temp. Probably the chances of this happening are higher with the ondemand governor, as it increases the frequency at the earliest opportunity possible. I haven't yet tried the kernel above. I will do it as soon as I get some time.
tried with beta2, no difference. the behavior under full load is the following: it jumps at regular intervals between max cpu freq and min cpu freq, depending on whether the passive cooling trip point temperature is exceeded or not. previously with the userspace governor, the behaviour was to not jump to the maximum cpu freq, but to a lower one (e.g. 1500 instead of 1700). it seems with 1200MHz I never exceed the passive cooling trip point. the problem with the min/max jump is that heat seems to accumulate, so after a while the heating-up is faster than the cooling-down during min-freq periods. That means due to the extremely long polling interval of 5s it's unavoidably reaching the critical shutdown temperature.
another issue I noticed: the cpu fan is running a lot faster in the bios and during booting than after powersave has started (which loads the acpi modules). then the fan slows down and never ever spins up that fast again.
*** Bug 105981 has been marked as a duplicate of this bug. ***
ok, the overheating bug is gone, now it always runs at the lowest frequency after a while.
Could you try whether this behaviour also happens with the userspace governor? I fear it does? Is this valid?:
  - If the thermal management comes in and the freq is lowered to the lowest frequency, it always stays there, even if passive mode is left again.
  - If passive mode is entered and the frequency is only lowered to e.g. 1200 MHz (not the lowest freq) it comes up again when passive cooling mode is left.
  Could you also have a look at the /proc/acpi/thermal_zone/*/cooling_mode flag, please? Does it (maybe also?) only stay if passive cooling reaches the lowest freq?
yes, it also happens with the userspace governor. I think it only happens when it reached the lowest frequency once.
I tried to reproduce this with an AMD64 (powernow-k8) machine. There everything seems to work fine. Does this make sense? Very strange...will ask for a speedstep-centrino machine.
Just an idea, but Pentium M's are not hyperthreaded and this could be an SMP problem?
hmm? I don't have SMP either :) it seems it is working better with beta3 out of the box. no idea why. currently downloading beta4.
I cannot see that there was an improvement with beta3.
true, in some cases it still doesn't work here either. still overheating. (b3)
Hmm, this is weird. As far as I can tell this only happens on speedstep-centrino driven cpufreq machines. And also there something really strange is happening: sometimes the freq goes down, then jumps up again, even while the temp is still too high. Maybe the machine warms up/cools down quite quickly, so that the temp is not updated fast enough (we already increased the polling freq for temperature in the last Beta, maybe because of this it's running a bit better?). You could try to evaluate a difference between:
    echo 1 >/proc/acpi/thermal_zone/*/polling_frequency
    echo 10 >/proc/acpi/thermal_zone/*/polling_frequency
  Is it possible for you to let the acpi-cpufreq driver control speedstep?
    /etc/sysconfig/powersave/cpufreq
    CPUFREQD_MODULE="acpi-cpufreq"
  Does it work better? Lowering severity -> according to Venkatesh it even works on another Thinkpad and I also could not reproduce it on an AMD64. This is something machine specific and probably won't be solved for SL 10.0, sorry.
It is not machine specific I'm afraid. I'm in nuernberg right now btw, if you want to take a closer look at the machine. it just happened last night again when the laptop display was closed and it was compiling.
I'm also experiencing overheating with beta4plus on a HP pavilion zt3000, which is a pentium M 1.5Ghz centrino laptop. It's also in nuernberg at the moment if you want to look at it. /proc/acpi/thermal_zone is empty here - what module do I need to load?
Try to reload module "thermal". It helped on my machine, at least (#114692).
Come around if you like to, I can have a look at it.
I just had the 3rd thermal overheating critical shutdown in the middle of my work. sorry to be annoying, but this is by no means a non-critical bug.
This has been too hard to root-cause, due to its unpredictability. Sometimes it seems to work fine. And sometimes we have a critical shutdown. What are all the workarounds (that always avoid the issue) that we have here?
  1) Is having a thermal polling_frequency of 1s a reliable workaround?
  2) Any other configuration changes (like an ondemand or userspace polling frequency change) that can work around this issue?
Maybe the acpi-cpufreq driver also works on these machines?
    /etc/sysconfig/powersave/cpufreq
    POWERSAVE_CPUFREQD_MODULE="acpi-cpufreq"
  and thermal management works with it? OOT/Venkatesh: Your latest acpi-cpufreq patch seems to make Pentium III mobile machines work again (#104915), thanks.
I'll test acpi-cpufreq again. last time it didn't make a difference. I've already lowered polling_frequency to 2s, which didn't fix it.
Will's machine is unrelated to this bug -> There is no thermal zone defined in DSDT, there never will be any thermal management working and /proc/acpi/thermal_zone will always be empty (maybe you can configure something in userspace with the HP specific ACPI kernel module). So I have no machine again to debug this. Dirk, ready for another remote session?
sure, but you can as well come into adrian's office
Ok, maybe we can solve this tomorrow. The phenomenon that it worked had nothing to do with the update_cpufreq_policy in the ondemand governor, but with the patch I also added (yes, never try two things in one cycle...) from: http://bugzilla.kernel.org/show_bug.cgi?id=3410 (last attachment). Unfortunately it has the side effect of staying at the lowest freq if it is reached in passive cooling mode. I tried to correct the patch -> will attach. Dirk, can you try it please.
Created attachment 49094 [details] fix passive cooling mode
do you already have a kernel with that?
better with that kernel, but not yet perfect.
The whole culprit seems to be that passive cooling is never left. The polling frequency switches to _TSP in passive mode. _TSP is declared on this laptop as 600 (in 1/10 seconds), which means it's checked every minute. This is how I see the critical shutdown coming:
  1) full load -> max_freq
  2) passive mode reached -> switching to 1 minute thermal polling (nothing to influence that from userspace)
  3) low load -> low freqs
  4) cooling down -> high freqs available again
  5) passive mode flag stays -> always 1 minute thermal polling
  6) full load -> max freq
  7) critical temp is reached easily in 1 minute with max freq
  no patch yet, I give my best ...
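The sequence above can be sketched as a small state model. This is an illustration only, not the kernel code: the class, its method names and the `sticky` switch are hypothetical, with `sticky=True` standing for the behaviour described here (the passive flag is never cleared) and `sticky=False` for a zone that leaves passive mode once it has cooled down.

```python
class ThermalZone:
    def __init__(self, tsp_tenths, polling_frequency_s, sticky):
        self.tsp_tenths = tsp_tenths                    # _TSP, in 1/10 seconds
        self.polling_frequency_s = polling_frequency_s  # user-set polling interval
        self.sticky = sticky
        self.passive = False

    def check(self, temp_c, passive_c):
        """One thermal check: enter passive mode when hot, maybe leave when cool."""
        if temp_c >= passive_c:
            self.passive = True
        elif not self.sticky:
            self.passive = False

    def poll_interval_s(self):
        """Seconds until the next check: _TSP in passive mode, else the user value."""
        if self.passive:
            return self.tsp_tenths / 10
        return self.polling_frequency_s

buggy = ThermalZone(tsp_tenths=600, polling_frequency_s=5, sticky=True)
buggy.check(temp_c=85, passive_c=80)   # steps 1-2: full load, passive mode entered
buggy.check(temp_c=60, passive_c=80)   # steps 3-5: cooled down, flag stays set
buggy_interval = buggy.poll_interval_s()

leaving = ThermalZone(tsp_tenths=600, polling_frequency_s=5, sticky=False)
leaving.check(temp_c=85, passive_c=80)
leaving.check(temp_c=60, passive_c=80)
leaving_interval = leaving.poll_interval_s()
```

In the sticky case the zone is left with a 60 second polling interval even though it is cool again, which is the window in which a new burst of full load can reach the critical temperature unnoticed.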
Created attachment 49594 [details] fix that leaves passive cooling as soon as highest frequency is allowed again by passive cooling policy Be aware, that if you test this patch, you have to set a value (best 2-10) in /proc/acpi/thermal_zone/*/polling_frequency, or frequency won't be increased any more after leaving passive cooling. Starting the powersave daemon service will do all this.
the patch doesn't seem to work for me.. once the passive cooling kicked in, it never ever returns to the real max frequency, even though the machine cooled down.
Created attachment 49676 [details] thermal fix the patch I developed for my laptop (R50p) and seems to work perfectly (for me).
Will be added to rc. Thanks Dirk, for persistent debugging and helping to find a working final patch.