Bug 145747 - Time runs too fast on ATI chipsets with AMD Turion CPUs
Status: RESOLVED FIXED
Duplicates: 146516
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: Kernel
Version: Beta 5
Hardware: Other Other
Priority: P5 - None
Severity: Major
Target Milestone: ---
Assignee: Andreas Kleen
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-26 09:14 UTC by Thomas Renninger
Modified: 2006-03-04 22:34 UTC (History)
CC List: 4 users

See Also:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
dmesg of beta3 kernel (20.93 KB, text/plain)
2006-02-02 10:54 UTC, Bodo Bauer

Description Thomas Renninger 2006-01-26 09:14:07 UTC
watch -n1 date
reveals that time runs at about double speed.
The no_timer_check boot option seems to work around the problem on some machines.
Affected machines: Acer Ferrari 4000, HP nx6125 and probably all others with this chipset.
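
For anyone who wants a number instead of eyeballing watch -n1 date, here is a minimal Python sketch that compares the system clock against the hardware RTC. It assumes the RTC is exposed at /sys/class/rtc/rtc0/since_epoch, which may not be the case on every kernel; the function names are made up for illustration. A ratio near 2.0 would correspond to the double speed described above.

```python
import time


def rate_ratio(sys_start, sys_end, ref_start, ref_end):
    """System-clock seconds elapsed per reference second (1.0 means correct)."""
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference clock did not advance")
    return (sys_end - sys_start) / ref_elapsed


def read_rtc(path="/sys/class/rtc/rtc0/since_epoch"):
    """Read the RTC as whole seconds since the epoch (path is an assumption)."""
    with open(path) as f:
        return int(f.read())


def measure(interval=60.0):
    """Sample both clocks 'interval' system-seconds apart and return the ratio.

    The RTC only has one-second resolution, so short intervals give a
    coarse result.
    """
    rtc0, sys0 = read_rtc(), time.time()
    time.sleep(interval)
    rtc1, sys1 = read_rtc(), time.time()
    return rate_ratio(sys0, sys1, rtc0, rtc1)
```

Calling measure() on an affected machine should return roughly 2.0, and roughly 1.0 once the problem is fixed.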
Comment 1 Andreas Kleen 2006-01-26 09:23:36 UTC
I have it fixed (or rather properly worked around) in my x86-64 mainline queue; I just need to integrate the patches.
Comment 2 Thomas Renninger 2006-01-26 09:27:41 UTC
Added people with affected machines so that they can *verify*.
Comment 3 Andreas Kleen 2006-01-26 10:22:38 UTC
Checked in 

-------------------------------------------------------------------
Thu Jan 26 11:22:01 CET 2006 - ak@suse.de

- patches.arch/x86_64-apic-main-timer: Allow to run main time
  keeping from the local APIC interrupt (#145747).
- patches.arch/x86_64-apic-main-timer-ati: Automatically enable
  apicmaintimer on ATI boards (#145747).
Comment 4 Thomas Renninger 2006-02-02 10:12:40 UTC
Timer interrupts are missing now.
The machine hangs at boot; you have to press keys (generate interrupts) so that the machine goes on. There are soft lockup kernel exceptions in idle ...

This is on the Acer Ferrari, maybe it works on the nx6125.
Both had the clock running too fast and have similar chipsets. Not sure if Joe and Ihno also see this with the nx6125; Joe told me he can give it to me today, and I will verify.
Comment 5 Andreas Kleen 2006-02-02 10:17:58 UTC
With the beta3 kernel? And it worked before? And are you sure the Ferrari
has an ATI chipset? (I thought they were Nvidia.) If yes, does it work with "noapicmaintimer"? If it is not ATI, does it work with "apicmaintimer"?
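
When reporting results for these options, it helps to state exactly which of them the kernel was booted with. A minimal Python sketch, assuming the options appear as plain tokens in /proc/cmdline; the helper names and the option list are just for illustration and cover only the options mentioned in this report:

```python
# Timer-related boot options mentioned in this bug report.
TIMER_OPTIONS = (
    "no_timer_check",
    "apicmaintimer",
    "noapicmaintimer",
    "disable_timer_pin_1",
    "noapic",
)


def timer_options(cmdline):
    """Return the timer-related options present in a kernel command line."""
    tokens = cmdline.split()
    return [opt for opt in TIMER_OPTIONS if opt in tokens]


def booted_timer_options(path="/proc/cmdline"):
    """Same as timer_options(), but for the currently running kernel."""
    with open(path) as f:
        return timer_options(f.read())
```

Matching whole tokens rather than substrings keeps "apicmaintimer" from being reported when only "noapicmaintimer" was given.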

 
Comment 6 Bodo Bauer 2006-02-02 10:54:49 UTC
Created attachment 66159
dmesg of beta3 kernel
Comment 7 Andreas Kleen 2006-02-02 10:59:03 UTC
The oops will be fixed soon. And please try the options I suggested.
Comment 8 Bodo Bauer 2006-02-02 12:22:30 UTC
I tried "noapicmaintimer" and it makes things worse. The boot process hangs when the HDs are detected ('hda: lost interrupt'), and pressing keys doesn't help anymore.

Comment 9 Andreas Kleen 2006-02-02 12:30:48 UTC
Please answer all questions. In particular what version did work?

Anyway, it sounds like your machine has both IRQ 0 routing problems and broken C3. Booting without C3 might work (add "options processor max_cstate=1" to
/etc/modprobe.conf).

Comment 10 Thomas Renninger 2006-02-02 12:39:43 UTC
Yes, it definitely has an ATI chipset. I also saw the messages "Using local APIC timer interrupts" ... "timer: PIT interrupt stopeed".
I noticed that the interrupts stop (or pressing keys becomes necessary) after the fs check.
I have now realised that the need to press keys seems to vanish after some time. Not sure what caused this.
The time seems to run relatively stably (1-2 seconds per minute too slow), but with watch -n1 date it does not run as smoothly as on other machines; sometimes it jumps 2 seconds at once. I mention this only to help figure things out, it is nothing severe.

Also strange is that the timer interrupts do not increment (and i8042 is listed twice?):
           CPU0
  0:        109    IO-APIC-edge  timer
  1:       3882    IO-APIC-edge  i8042
  8:          0    IO-APIC-edge  rtc
 12:       6385    IO-APIC-edge  i8042
 14:      25664    IO-APIC-edge  ide0
 15:      10226    IO-APIC-edge  ide1
169:       3273   IO-APIC-level  acpi, ohci1394
177:          1   IO-APIC-level  yenta
209:          1   IO-APIC-level  ATI IXP
225:          0   IO-APIC-level  ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3
233:       7135   IO-APIC-level  eth0
NMI:         77
LOC:      31182
ERR:          0
MIS:          0
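
A rough way to automate this observation (IRQ 0 stuck while LOC keeps counting) is sketched below in Python. The function names are made up for illustration, and the parser assumes the /proc/interrupts column layout shown in the listing above.

```python
import time


def parse_interrupts(text):
    """Return {label: count summed over all CPUs} from /proc/interrupts text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for ln in lines[1:]:
        if ":" not in ln:
            continue
        label, _, rest = ln.partition(":")
        # Per-CPU counters come first; rows such as ERR/MIS may have fewer.
        nums = [int(f) for f in rest.split()[:ncpus] if f.isdigit()]
        counts[label.strip()] = sum(nums)
    return counts


def pit_timer_stalled(before, after):
    """True if IRQ 0 did not advance while the local APIC timer (LOC) did."""
    irq0 = after.get("0", 0) - before.get("0", 0)
    loc = after.get("LOC", 0) - before.get("LOC", 0)
    return irq0 == 0 and loc > 0


def check_live(interval=5.0, path="/proc/interrupts"):
    """Take two snapshots 'interval' seconds apart and compare them."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return pit_timer_stalled(before, after)
```

Summing over the CPU columns keeps the check usable on SMP machines, where the timer count may be spread across several CPUs.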

Got it!
Removing the ACPI modules helps! (I just read about the possible C3 problems, which makes sense; I will try that and report the results in my next message.)
Hmm, but I still get the soft lockup in processor_idle (with the ACPI modules unloaded).
Not sure, but it definitely changes something: without the ACPI modules, no key pressing is necessary anymore. There are still no timer interrupts in /proc/interrupts.
watch -n1 date now jumps 3 or more seconds at a time.

Booting with noapicmaintimer hangs at the end of disk initialisation:
hda: XY, ATA DISK drive
...
hda: lost interrupt

Just an idea: maybe this one is related to the nx6125 problem, and some place in ACPI is not 64-bit safe? That would explain why the nx6125 bit is not read correctly, and why on the Ferrari some interrupt initialisation goes wrong because some BIOS reads/writes are wrongly masked or similar.

I also encountered this unrelated one:
Feb  5 03:36:24 linux klogd: IPv6 over IPv4 tunneling driver
Feb  5 03:36:24 linux klogd: xfrm_lookup: IPv4 route is stale (obsolete=4294967295, loops=0)
Feb  5 03:36:24 linux klogd:
Feb  5 03:36:25 linux klogd: Call Trace: <ffffffff802b4e68>{xfrm_lookup+1174} <ffffffff881f3a39>{:ipv6:ip6_dst_lookup+481}
Feb  5 03:36:25 linux klogd:        <ffffffff88212fc2>{:ipv6:ip6_datagram_connect+799} <ffffffff8025ecae>{lock_sock+197}
Feb  5 03:36:25 linux klogd:        <ffffffff8025eb49>{release_sock+25} <ffffffff8025dfe8>{sys_connect+118}
Feb  5 03:36:25 linux klogd:        <ffffffff8010d6e9>{syscall_trace_enter+190} <ffffffff8010a7dc>{tracesys+209}

Shall I open another bug report for that?

Sorry, all the info is a bit messed up here; I am just writing down what I am seeing... Tell me if things need more clarification.
Comment 11 Andreas Kleen 2006-02-02 13:56:15 UTC
Did the machine work with 10.0?

If yes, we can use the same hack there ("disable_timer_pin_1"), or alternatively disable C2/C3. The problem with the 10.0 hack is that ATI recommends a BIOS workaround, and once that workaround is in, it will not work with the 10.0 hack.

I'm inclined towards just blacklisting c2/c3. It's nasty, but it's easy and safe.
Comment 12 Bodo Bauer 2006-02-02 14:00:03 UTC
Yes, 10.0 works. Not sure that I have the latest BIOS though. Will check...
Comment 13 Andreas Kleen 2006-02-02 14:14:15 UTC
Then it should work with "noapicmaintimer disable_timer_pin_1". If you add
dmidecode output, I can force that through a blacklist entry.


Comment 14 Bodo Bauer 2006-02-02 14:36:13 UTC
Nope, I tried "noapicmaintimer disable_timer_pin_1" and still get the same behaviour (hda: lost interrupt) as with just "noapicmaintimer". 


I just updated the BIOS BTW (now at 3A23), and SL10.0 still works :)
Comment 15 Andreas Kleen 2006-02-02 15:10:01 UTC
Did the beta1 kernel work?
Comment 16 Bodo Bauer 2006-02-02 15:18:18 UTC
Beta2 did work, I didn't try Beta1 on that machine.

Comment 17 Andreas Kleen 2006-02-10 12:55:13 UTC
*** Bug 146516 has been marked as a duplicate of this bug. ***
Comment 18 Andreas Kleen 2006-02-10 13:39:06 UTC
OK, I have a new plan now to attack this. It will need a bigger patch though.

Comment 20 Thomas Renninger 2006-02-14 11:34:09 UTC
Changing product to SL, hopefully others now can see it...
Comment 21 Andreas Kleen 2006-02-27 14:51:36 UTC
Should be fixed now in next KOTD which should have this changelog
entry. Can people who see this test please and provide feedback?

-------------------------------------------------------------------
Sun Feb 26 21:28:54 CET 2006 - olh@suse.de

- update to 2.6.16-rc4-git10, x86_64 updates

Comment 22 Thomas Renninger 2006-02-28 15:48:20 UTC
Works for *arne* (the workstation). Bodo/Joe/Inho, please reopen if you still see problems with Turions.
Comment 23 David Canar 2006-03-01 01:15:26 UTC
Hi, I'm using beta 5 on an HP Pavilion ze2380 (Turion using a Sempron CPU). I upgraded the kernel to 2.6.16-rc4-2 and the clock now runs 2 or 3 times faster than it should. The last kernel that worked was KOTD 2.6.16-rc4-20060218181704-default. I also tried the very latest kernel, 2.6.16-rc5-git2-20060228152604-default, but got the same results. I think it is the same bug, but I may be wrong.
Comment 24 Andreas Kleen 2006-03-01 01:23:16 UTC
It's hard to believe. There weren't any timer changes before rc5-git2. Please double check this. And are you sure you're using a 64bit kernel?

Comment 25 David Canar 2006-03-01 01:34:14 UTC
I've always used a 32bit kernel because I don't know if the Sempron processor is a 64-bit processor. I don't know much about this topic (so probably I'll say something very stupid) but I think it has a 64bit "ready" motherboard with a 32bit processor. Probably I'm completely wrong. If I would like to test a 64bit kernel do I have to upgrade the whole SUSE to a 64 bit distro or can I upgrade just the kernel?
Comment 26 Andreas Kleen 2006-03-01 01:41:11 UTC
The 32bit kernel doesn't have the fix right now. But we should probably put 
it in because it uses apic by default now. It still works with noapic, right?
Comment 27 David Canar 2006-03-01 01:52:04 UTC
Yes, it works fine using noapic
Comment 28 Andreas Kleen 2006-03-03 12:57:53 UTC
Should be fixed now. 
-------------------------------------------------------------------
Thu Mar  2 18:40:27 CET 2006 - ak@suse.de

- patches.arch/i386-fix-ati-timer: Port the x86-64 ATI timer fix over to i386
Comment 29 David Canar 2006-03-03 14:19:36 UTC
The latest kernel works fine. Thanks!
Comment 30 Jure Repinc 2006-03-04 22:34:22 UTC
I had this clock speed and slow system problem with beta 4 on an HP Compaq nx6125 laptop with a Turion 64. I have now installed beta 6 and all is fine. The installation ran at normal speed and the desktop runs fast again. So I guess I can only confirm that the problem is fixed. Thanks to all!