Bug 103028

Summary: rcpowersave start gives strange messages
Product: [openSUSE] SUSE LINUX 10.0
Reporter: Andreas Jaeger <aj>
Component: Mobile Devices
Assignee: Thomas Renninger <trenn>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: mark.langsdorf
Version: Beta 1
Target Milestone: ---
Hardware: Other
OS: All
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Andreas Jaeger 2005-08-09 13:44:28 UTC
rcpowersave start shows the following in /var/log/messages:

cpu_init done, current fid 0xe, vid 0x8
limiting to cpu 4 failed
limiting to cpu 5 failed
limiting to cpu 6 failed
limiting to cpu 7 failed
limiting to cpu 8 failed
limiting to cpu 9 failed
limiting to cpu 10 failed
limiting to cpu 11 failed
limiting to cpu 12 failed
limiting to cpu 13 failed
limiting to cpu 14 failed
limiting to cpu 15 failed
limiting to cpu 16 failed
limiting to cpu 17 failed
limiting to cpu 18 failed
limiting to cpu 19 failed
limiting to cpu 20 failed
limiting to cpu 21 failed
limiting to cpu 22 failed
limiting to cpu 23 failed
limiting to cpu 24 failed
limiting to cpu 25 failed
limiting to cpu 26 failed
limiting to cpu 27 failed
limiting to cpu 28 failed
limiting to cpu 29 failed
limiting to cpu 30 failed
limiting to cpu 31 failed
limiting to cpu 32 failed
limiting to cpu 33 failed
limiting to cpu 34 failed
limiting to cpu 35 failed
limiting to cpu 36 failed
limiting to cpu 37 failed
limiting to cpu 38 failed
limiting to cpu 39 failed
limiting to cpu 40 failed
limiting to cpu 41 failed
limiting to cpu 42 failed
limiting to cpu 43 failed
limiting to cpu 44 failed
limiting to cpu 45 failed
limiting to cpu 46 failed
limiting to cpu 47 failed
limiting to cpu 48 failed
limiting to cpu 49 failed
limiting to cpu 50 failed
limiting to cpu 51 failed
limiting to cpu 52 failed
limiting to cpu 53 failed
limiting to cpu 54 failed
limiting to cpu 55 failed
limiting to cpu 56 failed
limiting to cpu 57 failed
limiting to cpu 58 failed
limiting to cpu 59 failed
limiting to cpu 60 failed
limiting to cpu 61 failed
limiting to cpu 62 failed
limiting to cpu 63 failed
limiting to cpu 64 failed
limiting to cpu 65 failed
limiting to cpu 66 failed
limiting to cpu 67 failed
limiting to cpu 68 failed
limiting to cpu 69 failed
limiting to cpu 70 failed
limiting to cpu 71 failed
limiting to cpu 72 failed
limiting to cpu 73 failed
limiting to cpu 74 failed
limiting to cpu 75 failed
limiting to cpu 76 failed
limiting to cpu 77 failed
limiting to cpu 78 failed
limiting to cpu 79 failed
limiting to cpu 80 failed
limiting to cpu 81 failed
limiting to cpu 82 failed
limiting to cpu 83 failed
limiting to cpu 84 failed
limiting to cpu 85 failed
limiting to cpu 86 failed
limiting to cpu 87 failed
limiting to cpu 88 failed
limiting to cpu 89 failed
limiting to cpu 90 failed
limiting to cpu 91 failed
limiting to cpu 92 failed
limiting to cpu 93 failed
limiting to cpu 94 failed
limiting to cpu 95 failed
limiting to cpu 96 failed
limiting to cpu 97 failed
limiting to cpu 98 failed
limiting to cpu 99 failed
limiting to cpu 100 failed
limiting to cpu 101 failed
limiting to cpu 102 failed
limiting to cpu 103 failed
limiting to cpu 104 failed
limiting to cpu 105 failed
limiting to cpu 106 failed
limiting to cpu 107 failed
limiting to cpu 108 failed
limiting to cpu 109 failed
limiting to cpu 110 failed
limiting to cpu 111 failed
limiting to cpu 112 failed
limiting to cpu 113 failed
limiting to cpu 114 failed
limiting to cpu 115 failed
limiting to cpu 116 failed
limiting to cpu 117 failed
limiting to cpu 118 failed
limiting to cpu 119 failed
limiting to cpu 120 failed
limiting to cpu 121 failed
limiting to cpu 122 failed
limiting to cpu 123 failed
limiting to cpu 124 failed
limiting to cpu 125 failed
limiting to cpu 126 failed
limiting to cpu 127 failed
Comment 1 Forgotten User ZhJd0F0L3x 2005-08-09 14:15:06 UTC
Looks like a kernel bug.
Which processor?
Which chipset?
Which cpufreq driver? (Install cpufrequtils and run cpufreq-info.)
Comment 2 Andreas Jaeger 2005-08-09 14:38:30 UTC
2 dual-core Opteron processors
Tyan S2885

cpufreq-info reports:
cpufrequtils 0.3: cpufreq-info (C) Dominik Brodowski 2004
Report errors and bugs to linux@brodo.de, please.
analyzing CPU 0:
  driver: powernow-k8
  CPUs which need to switch frequency at the same time: 0 1
  hardware limits: 1.80 GHz - 2.20 GHz
  available frequency steps: 2.20 GHz, 2.00 GHz, 1.80 GHz
  available cpufreq governors: ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.80 GHz and 2.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.80 GHz (asserted by call to hardware).
analyzing CPU 1:
  driver: powernow-k8
  CPUs which need to switch frequency at the same time: 0 1
  hardware limits: 1.80 GHz - 2.20 GHz
  available frequency steps: 2.20 GHz, 2.00 GHz, 1.80 GHz
  available cpufreq governors: ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.80 GHz and 2.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.80 GHz (asserted by call to hardware).
analyzing CPU 2:
  driver: powernow-k8
  CPUs which need to switch frequency at the same time: 2 3
  hardware limits: 1.80 GHz - 2.20 GHz
  available frequency steps: 2.20 GHz, 2.00 GHz, 1.80 GHz
  available cpufreq governors: ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.80 GHz and 2.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.80 GHz (asserted by call to hardware).
analyzing CPU 3:
  driver: powernow-k8
  CPUs which need to switch frequency at the same time: 2 3
  hardware limits: 1.80 GHz - 2.20 GHz
  available frequency steps: 2.20 GHz, 2.00 GHz, 1.80 GHz
  available cpufreq governors: ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.80 GHz and 2.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.80 GHz (asserted by call to hardware).
Comment 3 Thomas Renninger 2005-08-09 14:42:01 UTC
This is a kernel bug in powernow-k8 (or maybe in a layer above it?).
It seems the code iterates over all theoretically possible CPUs instead of
only the CPUs actually found.
Mark, do you know about this one?
Comment 4 Andreas Jaeger 2005-08-11 09:00:43 UTC
Btw.

$ ls /sys/devices/system/cpu/

cpu0    cpu104  cpu110  cpu117  cpu123  cpu15  cpu21  cpu28  cpu34  cpu40  cpu47
 cpu53  cpu6   cpu66  cpu72  cpu79  cpu85  cpu91  cpu98
cpu1    cpu105  cpu111  cpu118  cpu124  cpu16  cpu22  cpu29  cpu35  cpu41  cpu48
 cpu54  cpu60  cpu67  cpu73  cpu8   cpu86  cpu92  cpu99
cpu10   cpu106  cpu112  cpu119  cpu125  cpu17  cpu23  cpu3   cpu36  cpu42  cpu49
 cpu55  cpu61  cpu68  cpu74  cpu80  cpu87  cpu93
cpu100  cpu107  cpu113  cpu12   cpu126  cpu18  cpu24  cpu30  cpu37  cpu43  cpu5
  cpu56  cpu62  cpu69  cpu75  cpu81  cpu88  cpu94
cpu101  cpu108  cpu114  cpu120  cpu127  cpu19  cpu25  cpu31  cpu38  cpu44  cpu50
 cpu57  cpu63  cpu7   cpu76  cpu82  cpu89  cpu95
cpu102  cpu109  cpu115  cpu121  cpu13   cpu2   cpu26  cpu32  cpu39  cpu45  cpu51
 cpu58  cpu64  cpu70  cpu77  cpu83  cpu9   cpu96
cpu103  cpu11   cpu116  cpu122  cpu14   cpu20  cpu27  cpu33  cpu4   cpu46  cpu52
 cpu59  cpu65  cpu71  cpu78  cpu84  cpu90  cpu97

Why do I have 128 cpus there?
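
For anyone wanting to check the same thing on another machine, here is a minimal sketch that counts the cpuN entries in a directory such as /sys/devices/system/cpu. The helper names is_cpu_entry() and count_cpu_entries() are made up for this illustration and are not part of any existing tool.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: return 1 if name looks like "cpu<digits>"
 * (e.g. "cpu0", "cpu127"), else 0. Entries such as "cpufreq" are rejected. */
static int is_cpu_entry(const char *name)
{
    const char *p;

    if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0')
        return 0;
    for (p = name + 3; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;
    return 1;
}

/* Hypothetical helper: count cpuN entries under dir.
 * Returns -1 if the directory cannot be opened. */
static int count_cpu_entries(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int n = 0;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL)
        if (is_cpu_entry(e->d_name))
            n++;
    closedir(d);
    return n;
}
```

Comparing count_cpu_entries("/sys/devices/system/cpu") with sysconf(_SC_NPROCESSORS_ONLN) from <unistd.h> shows whether sysfs lists more CPUs than are online. Going by the listing above and the cpufreq-info output in comment 2, that would be 128 entries against 4 CPUs here.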
Comment 5 Thomas Renninger 2005-08-15 16:03:27 UTC
I expect that either the powernow-k8 driver uses current->cpus_allowed
incorrectly, or this is broken somewhere higher up.

Could one of the kernel hackers have a look at this? I do not have much time at
the moment because I need to set up a lot of new hardware.

You can use *willimas* to test.
The serial console is on the machine *sconsole1* -> type in: cscreen
The remote powersave switch is rpower3 -> telnet.

Since this seems to involve a scheduler variable, I am assigning it to Nick.
Ask me if you have problems with the machine.
Comment 6 Andreas Kleen 2005-08-15 19:39:18 UTC
I noticed it myself on willimas before, but never acted on it ;-)

NR_CPUS is 128, so it looks like it is trying all possible CPUs.

It could be this loop:

        for (i=0; i<NR_CPUS; i++) {
                if (!cpu_online(i))
                        continue;
                if (check_supported_cpu(i))
                        supported_cpus++;
        }

However, this would imply that cpu_online() is broken, which would
be worrying. There were some recent changes in this area
for CPU hotplug support.
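
To illustrate the reasoning, here is a small user-space sketch of the same loop shape. It is not the kernel code: struct cpu_mask, mask_set(), mask_test(), mask_first_n() and count_visited() are hypothetical stand-ins, and check_supported_cpu() is left out so that every CPU passing the online test is simply counted. With a correct online map the loop touches only the CPUs that are present. If the map wrongly reports every possible CPU as online, the loop touches all NR_CPUS of them, which would match the messages for CPUs 4 to 127.

```c
#include <stdint.h>

#define NR_CPUS 128   /* compile-time maximum, as in the report */

/* Hypothetical stand-in for the kernel's online map: bit i set = CPU i online. */
struct cpu_mask {
    uint64_t bits[NR_CPUS / 64];
};

static void mask_set(struct cpu_mask *m, int cpu)
{
    m->bits[cpu / 64] |= (uint64_t)1 << (cpu % 64);
}

static int mask_test(const struct cpu_mask *m, int cpu)
{
    return (int)((m->bits[cpu / 64] >> (cpu % 64)) & 1);
}

/* Same shape as the quoted loop: skip offline CPUs, count the rest. */
static int count_visited(const struct cpu_mask *online)
{
    int i, visited = 0;

    for (i = 0; i < NR_CPUS; i++) {
        if (!mask_test(online, i))
            continue;
        visited++;
    }
    return visited;
}

/* Build a mask with CPUs 0..n-1 marked online. */
static struct cpu_mask mask_first_n(int n)
{
    struct cpu_mask m = { {0, 0} };
    int i;

    for (i = 0; i < n && i < NR_CPUS; i++)
        mask_set(&m, i);
    return m;
}
```

Passing mask_first_n(4) to count_visited() models the expected state on this four-CPU machine, and mask_first_n(NR_CPUS) models the suspected broken state.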
Comment 7 Mark Langsdorf 2005-08-18 19:49:45 UTC
That loop has been in the driver for two years without changes and without 
error messages from CPU 127. 

I suspect cpu_online() needs to be investigated.
Comment 8 Andreas Kleen 2005-11-21 15:12:28 UTC
It's related to CPU hotplug (128 CPUs) and sysfs.

I just disabled the printk for now. That's a kludge; in mainline we can
fix it properly.