Bug 137573

Summary: acpi shuts down IBM thinkpad R32 right after boot up
Product: [openSUSE] SUSE LINUX 10.0
Reporter: michel munnix <michel.munnix>
Component: Kernel
Assignee: Thomas Renninger <trenn>
Status: RESOLVED WONTFIX
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
Version: Final
Target Milestone: ---
Hardware: i586
OS: Other
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: messages+/proc/acpi/thermal_zone+DSDT

Description michel munnix 2005-12-08 13:59:14 UTC
Just after reaching runlevel 3, ACPI shuts the machine down to runlevel 0.
This seems to be related to thermal zone THM1:
actual temperature: 50°C, critical trip point: 50°C.
As the actual temperature >= the critical temperature, it shuts down.
Analysing the DSDT shows that the critical temperature is 323.2 K = 50°C.
It also shows that only two actual temperature values can be generated:
323.0 K = 49.8°C and 323.5 K = 50.3°C. This seems to be a digital output from the embedded controller which is converted back to an imaginary temperature.
I suspect that the ACPI driver computes with temperatures rounded to integer degrees, which gives 49.8 == 50.0, but the ACPI specification expresses temperatures in tenths of a degree, so the result is incorrect.
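The arithmetic behind these numbers can be checked with a small Python sketch. It is only an illustration of the values quoted in this report, not the kernel's code: the 273.2 K offset is an assumption inferred from the statement that 323.2 K corresponds to 50°C, and the helper names are hypothetical.

```python
# Temperatures are kept in tenths of a kelvin, as in the DSDT values quoted above.
KELVIN_OFFSET_DK = 2732  # assumption: 273.2 K, inferred from 323.2 K = 50 C


def dk_to_celsius(dk):
    """Convert tenths of a kelvin to degrees Celsius (float)."""
    return (dk - KELVIN_OFFSET_DK) / 10


def dk_to_display(dk):
    """Round to whole degrees Celsius (valid for non-negative Celsius values)."""
    return (dk - KELVIN_OFFSET_DK + 5) // 10


CRITICAL_DK = 3232           # 323.2 K, the critical trip point from the DSDT
READINGS_DK = [3230, 3235]   # the only two readings the DSDT can produce

for reading in READINGS_DK:
    trips = reading >= CRITICAL_DK  # comparison done in tenths of a kelvin
    print(f"{reading / 10} K = {dk_to_celsius(reading)} C, "
          f"displayed as {dk_to_display(reading)} C, "
          f"critical reached: {trips}")
```

Under these assumptions both readings are displayed as 50°C, yet only the 323.5 K reading reaches the 323.2 K critical trip point when compared in tenths of a kelvin.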
As a workaround, I do
echo -n "51:0:0:0:0" >/proc/acpi/thermal_zone/THM1/trip_points
but this is not optimal, as it completely deactivates this thermal zone.
Attaching several logs.
Comment 1 michel munnix 2005-12-08 14:01:17 UTC
Created attachment 60109 [details]
messages+/proc/acpi/thermal_zone+DSDT
Comment 2 michel munnix 2005-12-09 07:54:05 UTC
I have transferred the hard disk to another R32 laptop and it behaves the same way.
After looking at the ACPI kernel code, I must admit that my assumption about it working with rounded values was incorrect. Values are only converted when they are printed or output to /proc.
Do you have a contact at IBM/Lenovo who could clarify the meaning of that thermal zone? Could it be that it is intended to work with a _PSV value of 50°C instead of a _CRT value?
Comment 3 Thomas Renninger 2005-12-09 09:23:34 UTC
I saw these strange thermal zones before (hehe, it probably was the R32).
Make sure nothing is overriding the trip points (e.g. make sure ENABLE_THERMAL_MANAGEMENT is set to off in /etc/sysconfig/powersave/thermal).
Also, don't override the thermal zone yourself. Because of rounding problems (the kernel cuts off the digits after the decimal point), the kernel will shut down the machine as soon as you write the same critical trip point to /proc/acpi/thermal_zone/*/trip_points.

(assumed values) e.g.:
exported trip point value by BIOS: 49.9 C shown in /proc as: 50
exported temperature for the thermal zone: 49.7 C shown in /proc as 50 C
echoing 50 C to /proc will exceed the exported 49.9 C critical trip point.
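The loss of precision in this round trip can be sketched in Python using the assumed values above. This is a rough illustration only and does not reproduce the kernel's actual conversion: rounding to the nearest whole degree and the 273.2 K offset are assumptions chosen so that both values are "shown as 50", and the function names are hypothetical.

```python
# All internal values are in tenths of a kelvin.
KELVIN_OFFSET_DK = 2732  # assumption: 273.2 K offset


def to_display_celsius(dk):
    """Whole degrees Celsius as they would be shown (assumed rounding)."""
    return (dk - KELVIN_OFFSET_DK + 5) // 10


def from_written_celsius(celsius):
    """Internal value produced when a whole-degree figure is written back."""
    return celsius * 10 + KELVIN_OFFSET_DK


bios_critical_dk = KELVIN_OFFSET_DK + 499  # 49.9 C exported by the BIOS (assumed value)
zone_temp_dk = KELVIN_OFFSET_DK + 497      # 49.7 C exported temperature (assumed value)

shown_critical = to_display_celsius(bios_critical_dk)
shown_temp = to_display_celsius(zone_temp_dk)
written_back_dk = from_written_celsius(shown_critical)

print("shown critical:", shown_critical, "shown temperature:", shown_temp)
print("exported critical (dK):", bios_critical_dk)
print("critical after write-back (dK):", written_back_dk)
```

The point of the sketch is that the whole-degree figure read from /proc cannot be written back without changing the trip point: the tenths are gone, so the value the kernel ends up with no longer matches what the BIOS exported.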

The machine runs fine as long as you do not touch the trip points.
See it as a kernel or a BIOS bug. I already mentioned this to the ACPI kernel developers, but as this is the only affected machine, this will probably only change when the ACPI interface switches to /sys, and that could take a while ...

Please reopen if you don't find the app writing to /proc/acpi/thermal_zone/.../trip_points