Bug 396220

Summary: AMD Sempron 3500/3600+ needs highres=off and nohz=off boot params
Product: [openSUSE] openSUSE 10.3 Reporter: uli geins <uli.geins>
Component: Installation Assignee: Thomas Renninger <trenn>
Status: RESOLVED WONTFIX QA Contact: Jiri Srain <jsrain>
Severity: Minor    
Priority: P5 - None CC: andreas.herrmann3, eumaster, forgotten_aFJloKvMbR, uli.geins, v.plessky
Version: Final   
Target Milestone: ---   
Hardware: All   
OS: openSUSE 10.3   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: processor.max_cstate=2 - single mode
processor.max_cstate=1
boot.msg, acpi=debug, 2.6.25.5-23-default
cat /proc/acpi/processor/*/power Kernel 2.6.21.7 openSuse10.2
dmesg kernel 2.6.21.7 openSuse10.2
dmesg kernel 2.6.25.5-1.1-pae opensuse 11
dmesg, lsmod and lspci kernel 2.6.22.18-0.2-default opensuse10.3
working opensuse10.3 only with nohz=off
opensuse 11 kernel 2.6.25.10-0.2-default without nohz=off dmesg lspci lsmod (system crash)
opensuse 11 kernel 2.6.25.10-0.2-default with nohz=off dmesg lsmod working system

Description uli geins 2008-06-01 15:35:09 UTC
Hi,
I spoke with Timo Hönig at the Linuxday 2008 about my problem, and he thinks this is a bug that I should report in Bugzilla to get help.

I have a current notebook, an MSI M670, model MS-1632, with BIOS build A1632NMS version 7.0c, which is the latest. It uses the following components:

north-bridge: nvidia c51mv
south-bridge: nvidia mcp51m

amd sempron mobile processor 3500+
samsung hybrid storage 160gb
2gb ram

The problem:
I can only install openSUSE 10.3 and 11.0 with acpi=off, because otherwise the system hangs during boot or while KDE starts.
None of the kernel parameters for ACPI and APIC that are described in the openSUSE database have any effect.
With openSUSE 10.2 and the newest Kubuntu version 8.04, which uses a 2.6.24.x kernel, it works fine. From kernel 2.6.22 up to the latest, it does not work under openSUSE.
Sorry for my bad English. Please help me get my notebook working under openSUSE.
Comment 1 uli geins 2008-06-01 19:48:02 UTC
(In reply to comment #0 from uli geins)

Comment 2 Evgeny Byrganov 2008-06-11 19:41:45 UTC
My notebook (MSI S430x, Mobile AMD Sempron(tm) Processor 3600+) also works only with acpi=off (both installation and normal boot).

With the preinstalled SLED10 SP0..SP2 it works fine, but I have no support after 60 days :( and SLED10 has no native driver for my WiFi (rt73usb).

uname -a:
2.6.22.17-0.1-default #1 SMP 2008/02/10 20:01:04 UTC i686 athlon i386 GNU/Linux

# lspci
00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
00:00.3 RAM memory: nVidia Corporation C51 Memory Controller 5 (rev a2)
00:00.4 RAM memory: nVidia Corporation C51 Memory Controller 4 (rev a2)
00:00.5 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.6 RAM memory: nVidia Corporation C51 Memory Controller 3 (rev a2)
00:00.7 RAM memory: nVidia Corporation C51 Memory Controller 2 (rev a2)
00:03.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:05.0 VGA compatible controller: nVidia Corporation MCP51 PCI-X GeForce Go 6100 (rev a2)
00:09.0 RAM memory: nVidia Corporation MCP51 Host Bridge (rev a2)
00:0a.0 ISA bridge: nVidia Corporation MCP51 LPC Bridge (rev a3)
00:0a.1 SMBus: nVidia Corporation MCP51 SMBus (rev a3)
00:0a.3 Co-processor: nVidia Corporation MCP51 PMU (rev a3)
00:0b.0 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)
00:0b.1 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)
00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1)
00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2)
00:10.1 Audio device: nVidia Corporation MCP51 High Definition Audio (rev a2)
00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control


Comment 3 Thomas Renninger 2008-06-12 12:54:17 UTC
I expect the idle=poll boot parameter helps?
If yes, please provide:
cat /proc/acpi/processor/*/power
Does processor.max_cstate=1 also help? If yes, maybe even processor.max_cstate=2 helps (this only makes sense if the power file above shows C3).
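A minimal sketch of the C3 check, assuming the power file has the layout that /proc/acpi/processor/*/power had on kernels of that era (state lines such as "C3: type[C3] ..."). The helper names list_cstates and has_c3 are hypothetical, and the sample dump is made up to match that layout:

```shell
#!/bin/sh
# list_cstates (hypothetical helper): read a power dump on stdin and print
# the name of every C-state listed in it, one per line.
list_cstates() {
    # State lines look like "   *C1:   type[C1] ..."; the "*" marks the
    # active state and is optional.
    sed -n 's/^[[:space:]]*\*\{0,1\}\(C[0-9][0-9]*\):[[:space:]]*type\[.*/\1/p'
}

# has_c3 (hypothetical helper): succeed if the dump on stdin lists C3.
has_c3() {
    list_cstates | grep -qx 'C3'
}

# Sample input in the assumed layout. On an affected machine you would pipe
# in the real file instead: cat /proc/acpi/processor/*/power | has_c3
sample='active state:            C1
max_cstate:              C8
states:
   *C1:                  type[C1] promotion[C2] demotion[--] latency[000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005]
    C3:                  type[C3] promotion[--] demotion[C2] latency[020]'

if printf '%s\n' "$sample" | has_c3; then
    echo "C3 is listed, so processor.max_cstate=2 is worth testing"
else
    echo "no C3 listed, so processor.max_cstate=2 changes nothing"
fi
```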
Comment 4 Evgeny Byrganov 2008-06-12 13:52:59 UTC
Yes, it works better with idle=poll.

% cat /proc/acpi/processor/P001/power
active state:            C1
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 8000 usec
states:
   *C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005] usage[00000000] duration[00000000000000000000]
    C3:                  type[C3] promotion[--] demotion[C2] latency[020] usage[00000000] duration[00000000000000000000]


What is "processor.max_cstate"? sysctl doesn't know about it:
sysctl processor.max_cstate
error: "processor.max_cstate" is an unknown key


Comment 5 Evgeny Byrganov 2008-06-12 14:02:45 UTC
I also tried:
1. kernel-default-2.6.22.18-197.1.i586.rpm from SL103_BRANCH_KMP
- does not work either (without idle=poll).

2. kernel-default-2.6.25.5-23.1.i586.rpm, the latest kernel I found for 10.3. It works, but from time to time the system dies when I enable WLAN.
Comment 6 Thomas Renninger 2008-06-12 14:21:12 UTC
> What is "processor.max_cstate"?
a boot parameter:
processor.max_cstate=1 will probably work as it restricts the processor module to only use C1.
processor.max_cstate=2 will probably work as it restricts the processor module to only use C1 and C2.
idle=poll does not use any C-states or the halt instruction.
I expect that the timer is switched off in deeper sleep states (C3).
What is strange is that the latest kernel sometimes works...
Can you also try the processor.max_cstate=X parameter, pls? It saves a lot more power than idle=poll. The question then is whether Sempron Mobile 3500/3600 should be blacklisted so that C3 is not used, or whether the APIC timer should not be used in this case.
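For reference, a sketch of how such a boot parameter could be appended to the kernel line of a GRUB legacy menu.lst. The helper add_boot_param and the sample entry (title, kernel path, root device) are hypothetical. It only filters text from stdin to stdout and does not touch the real file, which is assumed here to be /boot/grub/menu.lst:

```shell
#!/bin/sh
# add_boot_param (hypothetical helper): copy a menu.lst from stdin to stdout
# and append PARAM to every "kernel ..." line that does not already have it.
add_boot_param() {
    awk -v p="$1" '
        /^[ \t]*kernel[ \t]/ {
            present = 0
            n = split($0, words, /[ \t]+/)
            for (i = 1; i <= n; i++)
                if (words[i] == p) present = 1
            if (!present) $0 = $0 " " p
        }
        { print }
    '
}

# Made-up sample entry; a real menu.lst will have different paths and devices.
printf '%s\n' \
    'title openSUSE' \
    '    root (hd0,1)' \
    '    kernel /boot/vmlinuz root=/dev/sda2 splash=silent' \
    '    initrd /boot/initrd' |
    add_boot_param processor.max_cstate=1
```

Running the filter twice gives the same result as running it once, so it is safe to re-apply. For a one-off test it is usually simpler to type the parameter into the boot options at the boot menu instead of editing the file.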
Comment 7 Andreas Herrmann 2008-06-12 14:51:14 UTC
It would be great to have the dmesg output of your system, ideally with
the option "apic=debug" added to your command line.

So far I see two possible root causes:

(1) Either the APIC timer is used which will stop working if the machine goes
into C3. If this causes your problems, then booting with "noapictimer"
should help. 

(2) Wrong setup for timer interrupt routing. (That is why I requested more debug
information). In such cases using one of
- acpi_use_timer_override or
- acpi_skip_timer_override
might help (especially for the nVidia MCP51 chipset).

I guess it's (1) that causes your system hang -- this is also Thomas'
assumption -- but who knows.

If all this does not help, chances are high that the problem is caused
by some other misconfiguration of your hardware and a BIOS update is
required to solve the problem. If a new BIOS is available at all ;-(
Comment 8 Evgeny Byrganov 2008-06-12 15:05:52 UTC
processor.max_cstate=2 - system doesn't work. 

Last messages (boot process):
---------
JBD: barrier-based sync failed on dm-1
Configured serial ports  done 
--------


I will try apic=debug now.
Comment 9 Evgeny Byrganov 2008-06-12 15:30:22 UTC
Created attachment 221818 [details]
processor.max_cstate=2 - single mode

system doesn't boot
Comment 10 Evgeny Byrganov 2008-06-12 15:31:37 UTC
Created attachment 221819 [details]
processor.max_cstate=1

system started
Comment 11 Evgeny Byrganov 2008-06-12 15:34:14 UTC
with processor.max_cstate=1

% cat /proc/acpi/processor/P001/power
active state:            C1
max_cstate:              C1
bus master activity:     00000000
maximum allowed latency: 8000 usec
states:
   *C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00261441] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005] usage[00000000] duration[00000000000000000000]
    C3:                  type[C3] promotion[--] demotion[C2] latency[020] usage[00000000] duration[00000000000000000000]
Comment 12 Evgeny Byrganov 2008-06-12 16:51:48 UTC
Created attachment 221847 [details]
boot.msg,  acpi=debug, 2.6.25.5-23-default

I added the boot.msg from booting the 2.6.25.5-23-default kernel (for comparison).

This kernel works well as long as I don't enable WiFi :( When I do, it dies (hangs).
But maybe this is another bug.
Comment 13 Evgeny Byrganov 2008-06-12 16:56:45 UTC
"noapictimer" produces other problems: disk (DMA errors), keyboard, mouse, etc.

I am staying with processor.max_cstate=1 for now.

Comment 14 uli geins 2008-06-12 19:48:32 UTC
Hi everyone!

* "noapictimer": causes DMA errors for me too, the same as for Evgeny.

* 2.6.25.5-23-default: with ACPI the system works for 10 minutes to half an hour, or it hangs during boot at "enable cpufreq".
My idea is to swap in the k8 driver from AMD that I have on my openSUSE 10.2. Is the fault there?
WiFi works fine when the system works.

The question is: why does it work up to kernel 2.6.21.7 with openSUSE 10.2 and with the current Kubuntu 8.04? Where is the difference? Is it the blacklist?

Tonight I am leaving for holiday until Friday next week. When I am back I will post all the messages.
Should I set up the final openSUSE 11, or do you want the messages from the openSUSE 11 beta?
Comment 15 Evgeny Byrganov 2008-06-13 01:19:29 UTC
I downloaded and installed the latest RC version of openSUSE 11 and found this issue there too.

Without processor.max_cstate=1 it doesn't work.



Comment 16 Thomas Renninger 2008-06-13 08:34:31 UTC
Thanks so far for all your investigations.
Simply limiting C-states to C1 might not be the ideal solution, as you lose the "Mobile" part of your Mobile Sempron and it consumes more power.

We should still investigate a bit further, especially if it (possibly) worked already:
> The question is: why does it work up to kernel 2.6.21.7 with openSUSE 10.2 and
> with the current Kubuntu 8.04? Where is the difference? Is it the blacklist?
This is a very good hint! I will try to find out what could have caused it. Hmm, this should be about the time when the huge timer changes came in..., this might get difficult. Does anyone still have a working distribution (10.2 or Ubuntu 8.04) installed? If yes, can you check whether the C-states are really used there:
cat /proc/acpi/processor/*/power
Also a dmesg of a working kernel (10.2 is preferred) to compare could help.
Comment 17 Thomas Renninger 2008-06-13 09:38:55 UTC
It is not in drivers/acpi/processor_idle.c
Therefore I rather expect the culprit comes from a different apic/timer setup.

If you are sure 2.6.21.7 does work and 2.6.22.X does not (both plain vanilla kernels), you may want to try a git bisect.
The first time this can be a bit cumbersome, but this often is the easiest way to find difficult kernel problems.

You should use the latest git tools from 11.0. To download the repository, do:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 <dir>

man git-bisect gives some hints. If you are sure that plain 2.6.22 is broken, you probably want to start with:
git bisect start
git bisect bad  v2.6.22
git bisect good v2.6.21
git bisect good         # if the next tested kernel worked
...

git bisect always picks the middle of the applied patches between good and bad until the offending patch is found.
That means that after each try you have to recompile the kernel, boot it, and tell git about the outcome:
git bisect good|bad
then recompile and try again...
As this algorithm is logarithmic (it halves the number of remaining patches after each try), this works out relatively quickly...

If you have questions about how to compile and or install the kernel, just ask.
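To get a feeling for what "logarithmic" means here, a small sketch that estimates how many kernels have to be built and booted. The helper bisect_steps is hypothetical, and the figure of 8000 commits between v2.6.21 and v2.6.22 is only an assumed round number for illustration. On git versions that support it, git rev-list --count v2.6.21..v2.6.22 in the cloned tree gives the real figure, and git bisect itself may need one step more or less than this estimate:

```shell
#!/bin/sh
# bisect_steps (hypothetical helper): estimate how many test builds a
# bisection over COMMITS candidate commits needs, i.e. how often the
# candidate range must be halved until a single commit is left.
bisect_steps() {
    commits=$1
    steps=0
    span=1
    # Each test doubles the number of commits that can be told apart.
    while [ "$span" -lt "$commits" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# 8000 is an assumed, illustrative commit count, not a measured one.
echo "about $(bisect_steps 8000) kernel builds for 8000 commits"
```

Under that assumption the estimate comes out at 13 builds, which is why bisecting is feasible even across a whole kernel release.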
Comment 18 Evgeny Byrganov 2008-06-16 14:48:22 UTC
Now I tested another option: I replaced "processor.max_cstate=1" with "nohz=off highres=off" and got a working system!

I also see C3 becoming active:
cat /proc/acpi/processor/P001/power
active state:            C3
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 8000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000030] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005] usage[00002135] duration[00000000000011279621]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[020] usage[00034222] duration[00000000000228307174]


I tested on openSUSE 11 RC (2.6.25.4-8-pae) and 10.3 (2.6.22.17-0.1-default); it works fine.
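
The usage[] counters are the quickest thing to compare between boots. A small sketch that pulls them out of such a dump, assuming the line layout shown in the output above; cstate_usage is a hypothetical helper and the sample lines are copied from that output:

```shell
#!/bin/sh
# cstate_usage (hypothetical helper): read a power dump on stdin and print
# "<state> <usage count>" for every C-state line, without leading zeros.
cstate_usage() {
    sed -n 's/^[[:space:]]*\*\{0,1\}\(C[0-9][0-9]*\):.*usage\[0*\([0-9][0-9]*\)].*/\1 \2/p'
}

cstate_usage <<'EOF'
active state:            C3
max_cstate:              C8
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000030] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005] usage[00002135] duration[00000000000011279621]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[020] usage[00034222] duration[00000000000228307174]
EOF
# prints:
# C1 30
# C2 2135
# C3 34222
```

A non-zero count for C2 or C3 shows that the deeper states are really entered, which is the case that used to hang the machine.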

Comment 19 Thomas Renninger 2008-06-16 17:26:50 UTC
Great!
We should get Thomas Gleixner and/or Ingo Molnar on board (not sure whether this works out, I'll point them to the bug).

Summary:
  - apic timer breaks on C2 or deeper
  - the noapictimer workaround helps to get a bit further, but results in severe
    other errors (see comment #13)
  - "nohz=off highres=off" works (tested on 2.6.25.4-8-pae and
    2.6.22.17-0.1-default SUSE kernels)
  - processor.max_cstate=1 works (things break when entering C2 or deeper)
  - This affects AMD Mobile Semprons (at least 3500+ and 3600+)
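
A quick way to confirm which of these workarounds a running system was actually booted with is to look at the kernel command line. A sketch, with has_boot_param as a hypothetical helper and a made-up sample command line (on a live system, pass "$(cat /proc/cmdline)" instead):

```shell
#!/bin/sh
# has_boot_param (hypothetical helper): succeed if the kernel command line
# in $1 contains the parameter $2 as a whole, space-separated word.
has_boot_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Made-up sample; a real command line comes from /proc/cmdline.
cmdline='root=/dev/sda2 resume=/dev/sda1 nohz=off highres=off'

for p in nohz=off highres=off processor.max_cstate=1; do
    if has_boot_param "$cmdline" "$p"; then
        echo "$p: set"
    else
        echo "$p: not set"
    fi
done
```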
Comment 20 uli geins 2008-06-21 13:45:04 UTC
Created attachment 223560 [details]
cat /proc/acpi/processor/*/power Kernel 2.6.21.7 openSuse10.2

Functional ACPI, openSUSE 10.2, kernel 2.6.21.7
Comment 21 uli geins 2008-06-21 13:46:09 UTC
Created attachment 223561 [details]
dmesg kernel 2.6.21.7 openSuse10.2

dmesg from working acpi kernel 2.6.21.7 openSuse10.2
Comment 22 uli geins 2008-06-21 13:52:28 UTC
Hi, I'm back from holiday.

I have attached the /proc/acpi/processor/*/power output and the dmesg from my running kernel 2.6.21.7 on openSUSE 10.2.

Tonight I am going to install openSUSE 11, and after the installation I will post the same from the current kernel.

I have a second problem from kernel 2.6.22 up to the latest: my Fn keys for controlling the volume and suspend don't work. Is this a kernel option?
Comment 23 uli geins 2008-06-22 20:27:59 UTC
Hi everyone,

Good news. I installed openSUSE 11.0 with acpi=off, because otherwise the system crashes.
Then I put "nohz=off highres=off" into menu.lst and it is working fine :-)

Here are the C-states from openSUSE 11.0, kernel 2.6.25.5-1.1-pae:

MSI:/home/ugeins # cat /proc/acpi/processor/*/power
active state:            C0
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 16000 usec
states:
    C1:                  type[C1] promotion[--] demotion[--] latency[000] usage[05679079] duration[00000000000000000000]
    C2:                  type[C2] promotion[--] demotion[--] latency[005] usage[00000000] duration[00000000000000000000]
    C3:                  type[C3] promotion[--] demotion[--] latency[020] usage[00000000] duration[00000000000000000000]
MSI:/home/ugeins #                                            

The processor frequency is scaled dynamically and the system doesn't crash.

But openSUSE 11 is too buggy. I will now install openSUSE 10.3 with kernel 2.6.18.xxx, which crashed at boot, and try out "nohz=off highres=off".

Great :-)

As an attachment I have added the dmesg from kernel 2.6.25.5-1.1-pae.
I hope it is useful.


Comment 24 uli geins 2008-06-22 20:32:55 UTC
Created attachment 223625 [details]
dmesg kernel 2.6.25.5-1.1-pae opensuse 11

working system with "nohz=off highres=off"
dmesg kernel 2.6.25.5-1.1-pae opensuse 11
Comment 25 uli geins 2008-06-25 21:35:57 UTC
Hi,

With "nohz=off highres=off" as GRUB boot parameters, openSUSE 10.3 with kernel 2.6.22.18-0.2-default is working fine too. The installation of openSUSE 10.3 must be started with acpi=off, because otherwise the system crashes at the beginning.

Here are the C-states of openSUSE 10.3 with "nohz=off highres=off":

geins@MSI-Lan:~>  cat /proc/acpi/processor/*/power
active state:            C3
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 8000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000140] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005] usage[00015995] duration[00000000000069470897]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[020] usage[00149023] duration[00000000000775417219]
geins@MSI-Lan:~>

As an attachment I have added a dmesg from kernel 2.6.22.18-0.2-default.

Thank you for supporting Evgeny and me! Very good.
What is your next step? A kernel patch?

Regards,
uli
Comment 26 uli geins 2008-06-25 21:43:24 UTC
Created attachment 224426 [details]
dmesg, lsmod and lspci kernel 2.6.22.18-0.2-default opensuse10.3

working opensuse10.3
dmesg, lsmod and lspci kernel 2.6.22.18-0.2-default opensuse10.3
Comment 27 Thomas Renninger 2008-07-09 12:55:30 UTC
Can you check whether highres=off alone is enough to get the machine working (it should be)?
If yes, I will try to come up with a blacklist. But this is arch-independent code and it needs an x86info lookup..., let's see, I should be able to provide a patch...
Comment 28 uli geins 2008-07-11 13:54:06 UTC
Hi,

My machine also works with only nohz=off. Here are the C-states:

MSI-Lan:/home/geins # cat /proc/acpi/processor/*/power
active state:            C3
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 8000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000280] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[005] usage[00020546] duration[00000000000080768202]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[020] usage[00164909] duration[00000000000773400717]
MSI-Lan:/home/geins #

My processor steps between states fine and all other hardware is running too. Great.
As attachments I have added the same files as in comment #26.
Comment 29 uli geins 2008-07-11 13:59:37 UTC
Created attachment 227316 [details]
working opensuse10.3 only with nohz=off

Working openSUSE 10.3 with only nohz=off.
The attachment contains dmesg, lspci and lsmod.
Comment 30 Thomas Renninger 2008-07-12 13:40:20 UTC
Can you try one of these 11.0 kernels (32/64 bit):
ftp.suse.com/pub/people/trenn/amd_sempron_mobile_no_hz_fix

If it does work you should also see a message in dmesg:
XXX: ...
AMD Sempron found ...

If not, can you boot this kernel again with nohz=off
and send the whole dmesg output, or grep for:
XXX:
I need this line and the next one then.
Comment 31 uli geins 2008-07-13 18:36:13 UTC
Hi Thomas,
I updated the kernel to 2.6.25.10-0.2 and booted without any GRUB parameters. The system now works fine without nohz=off. As an attachment I have added the same files as above, this time from your patched kernel. There is one error in dmesg after the CPU is detected, but the system works. Here is the line I mean:

powernow-k8: Found 1 Mobile AMD Sempron(tm) Processor 3500+ processors (1 cpu cores) (version 2.20.00)
powernow-k8:    0 : fid 0xa (1800 MHz), vid 0xe
powernow-k8:    1 : fid 0x8 (1600 MHz), vid 0x10
powernow-k8:    2 : fid 0x0 (800 MHz), vid 0x18
Clocksource tsc unstable (delta = -277791000 ns) <----- i mean this line!!!!!

The CPU frequency is scaled dynamically. The system works fine. Here are the C-states as well:

linux-zw94:/home/geins # cat /proc/acpi/processor/*/power
active state:            C0
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 16000 usec
states:
    C1:                  type[C1] promotion[--] demotion[--] latency[000] usage[00000084] duration[00000000000000000000]
    C2:                  type[C2] promotion[--] demotion[--] latency[005] usage[00000505] duration[00000000000000154950]
    C3:                  type[C3] promotion[--] demotion[--] latency[020] usage[00372294] duration[00000000004035409863]
linux-zw94:/home/geins #

Can I use this kernel in openSUSE 10.3 too, given the new RPM compression?
Comment 32 uli geins 2008-07-13 18:49:04 UTC
Created attachment 227473 [details]
opensuse 11 kernel 2.6.25.10-0.2-default without nohz=off dmesg lspci lsmod (system crash)

After nearly 30 minutes the system crashed.
Comment 33 uli geins 2008-07-13 18:53:10 UTC
Hi Thomas,
I'm back again. I spoke too soon: the system crashed when booted with the patched kernel and without any GRUB parameters.

Here are the C-states when booting kernel 2.6.25.10-0.2 with nohz=off:

linux-zw94:/home/geins # cat /proc/acpi/processor/*/power
active state:            C0
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 2000000000 usec
states:
    C1:                  type[C1] promotion[--] demotion[--] latency[000] usage[00227328] duration[00000000000000000000]
    C2:                  type[C2] promotion[--] demotion[--] latency[005] usage[00000000] duration[00000000000000000000]
    C3:                  type[C3] promotion[--] demotion[--] latency[020] usage[00000000] duration[00000000000000000000]
linux-zw94:/home/geins #

Now the system is working. As an attachment I have added the same files as above, from booting with nohz=off.

Comment 34 uli geins 2008-07-13 18:57:08 UTC
Created attachment 227475 [details]
opensuse 11 kernel 2.6.25.10-0.2-default with nohz=off dmesg lsmod working system

system works
Comment 35 Thomas Renninger 2008-07-14 13:48:59 UTC
Can you also attach or paste /proc/cpuinfo, pls.
I hope the next try works out...
Comment 36 uli geins 2008-07-14 17:15:09 UTC
linux-zw94:/home/geins # cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 76
model name      : Mobile AMD Sempron(tm) Processor 3500+
stepping        : 2
cpu MHz         : 1800.000
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow up pni cx16 lahf_lm extapic cr8_legacy ts fid vid ttp tm stc
bogomips        : 3620.32
clflush size    : 64

linux-zw94:/home/geins #
Comment 37 Thomas Renninger 2008-07-14 21:15:34 UTC
Can you give this one another try, pls:
ftp.suse.com/pub/people/trenn/sempron_and_legacy_irq_fixed_kernel

You can check:
dmesg |grep -i sempron
You should then see a line saying that no_hz got disabled.
If not, I mixed up some strings again..., otherwise it should work now?
Comment 38 uli geins 2008-07-15 13:16:01 UTC
OK, done. Is this what you mean? --> disabling tickless idle

linux-zw94:/home/geins # dmesg |grep -i sempron
CPU0: AMD Mobile AMD Sempron(tm) Processor 3500+ stepping 02
Unpacking initramfs...<6>AMD Sempron Mobile found- disabling tickless idle
powernow-k8: Found 1 Mobile AMD Sempron(tm) Processor 3500+ processors (1 cpu cores) (version 2.20.00)



linux-zw94:/home/geins # cat /proc/acpi/processor/*/power
active state:            C0
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 2000000000 usec
states:
    C1:                  type[C1] promotion[--] demotion[--] latency[000] usage[00177281] duration[00000000000000000000]
    C2:                  type[C2] promotion[--] demotion[--] latency[005] usage[00000000] duration[00000000000000000000]
    C3:                  type[C3] promotion[--] demotion[--] latency[020] usage[00000000] duration[00000000000000000000]
linux-zw94:/home/geins #

I will give you a result later. At the moment the system is running without the nohz=off GRUB parameter.
Comment 39 Thomas Renninger 2008-07-15 13:53:25 UTC
This should work, thanks.
I will still wait a bit and then send it upstream for review and add it then.
I will wait with closing until I have really put it in.

If there are still problems, pls tell me.
If I have not closed the bug within the next days, pls ping me...

..and of course: Thanks for testing!
Comment 40 uli geins 2008-07-16 15:18:03 UTC
Hi Thomas,

OK, it works fine. When openSUSE 11.1 comes out, will it have the same problem, or will your patch be included in the future? Is this patch also available for openSUSE 10.3?
Comment 41 Thomas Renninger 2008-07-16 16:10:55 UTC
It's not worth it for 10.3, just use nohz=off.
I will submit the patch upstream and CC you, or better add a:
Tested-by:...
tag; this increases the chances that things go in.
Comment 42 Thomas Renninger 2008-07-17 10:35:10 UTC
I added the patch to 11.0, so the bug is fixed for 11.0.
I will still keep it open for a while as a reminder that something still must go mainline...
Comment 43 Forgotten User aFJloKvMbR 2009-02-08 17:53:20 UTC
Thomas, did your patch get into 11.1?
I'm asking because I could not start my openSUSE 11.1 Sempron laptop without the nohz=off parameter.
(And this bug is still open ..)
Comment 44 Thomas Renninger 2009-02-16 20:11:54 UTC
Sorry, I've lost track.
Ah yes, this got discussed/debugged mainline:
http://kerneltrap.org/mailarchive/linux-kernel/2008/6/16/2142494

IIRC my patch simply disables no_hz and/or highres for (some?) Semprons, which is not perfect...
Unfortunately debugging got stuck and there was no real outcome.

You could follow up on the thread and provide Thomas Gleixner with the relevant info he needs. This should be done by someone who is familiar with kernel building, and on the latest kernels, ideally the latest 2.6.29-rcX.

I could look at it, but without even having a machine to reproduce,
I am the wrong guy to try to find the root cause. Especially remotely, this would just be too much waste of time for all of us, sorry.

As a quick shot, you could try the latest 2.6.29-rcX kernel from here:
ftp://ftp.suse.com/pub/projects/kernel/kotd/HEAD/x86_64/
you need -default and -default-base packages. Maybe things got magically fixed up meanwhile.

I am going to close this one for now as WONTFIX. If things got fixed in mainline and you find out (or helped), please still post it in this bug; things could then still get backported.