Bug 1220290

Summary: Powersave governor puts CPU near max mhz
Product: [openSUSE] openSUSE Distribution Reporter: Rory A <rory.ashton>
Component: Kernel Assignee: Giovanni Gherdovich <giovanni.gherdovich>
Status: CONFIRMED --- QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: rory.ashton, tiwai
Version: Leap 15.5 Flags: giovanni.gherdovich: needinfo? (rory.ashton)
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE Leap 15.5   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: 15.4 Leap rescue prompt showing CPU Mhz and CPU Gov = powersave
script to collect 20 seconds of "mpstat" and "turbostat" monitoring
cpu test output
full install test results - openSUSE-Leap-15.5-DVD-x86_64-Build491.1
intel_pstate=passive lets me user 'powersave' to low MHz , but it does not scale up correctly
monitoring script: irq time, turbostat, mpstat
screenshot, annotated turbostat output from comment 6
Landscape of cpufreq on Intel x86_64, three different "powersave" governors
test results from Leap 15.5 XFCE live system
test 2 from diff laptop , XFCE Live Leap 15.5
Update monitoring script. Reads /proc/interrupts and /proc/softirqs
Script results from Live ISO and Leap using 2 different kernels
test results after options i2c-i801 disable_features=0x10

Description Rory A 2024-02-24 20:05:22 UTC
Downloaded and installed this onto an Acer with Intel CPU.

openSUSE-Leap-15.5-DVD-x86_64-Build491.1

With no apps running and top showing little activity, the CPU frequencies are very near their maximum.  In other distros on the same laptop, the powersave governor usually idles around 800 MHz or so.  openSUSE at idle shows over 2000 MHz.

This laptop often used unplugged so 'powersave' is essential.
Comment 1 hui 2024-02-24 20:52:33 UTC
Seems like a duplicate of bug 1201013
Comment 2 Rory A 2024-02-27 02:30:22 UTC
tl;dr - The bug also happens in Tumbleweed with slightly different MHz values.  Tumbleweed does not have a powersave governor like Leap does.

The laptop in question now has a different distro installed since bug report, but I wanted to double check something so this is what I did.

Download : openSUSE-Leap-15.5-NET-x86_64-Build491.1-Media
Boot into terminal off USB
Min scaling freq was set correctly (800000) 800mhz

Then I double checked in Tumbleweed.

Download : openSUSE-Tumbleweed-NET-x86_64-Snapshot20240225-Media
Boot into terminal off USB
Noted on here the frequencies were a little more 'up and down' (looking at /proc/cpuinfo)
Noted that Tumbleweed has different scaling options : 
ondemand, performance, schedutil

schedutil was the active governor, and the cpu0 / cpu1 frequencies were going up and down regularly, with cpu0 sometimes at idle and sometimes at ~1775 MHz. In the original report, by contrast, the frequency was pegged almost always above 2000 MHz.  In Tumbleweed with schedutil, cpu1 was never at idle but often bouncing just above idle.
Comment 3 Rory A 2024-02-27 03:42:49 UTC
Created attachment 873017 [details]
15.4 Leap rescue prompt showing CPU Mhz and CPU Gov = powersave
Comment 4 Rory A 2024-02-27 03:44:36 UTC
Downloaded the previous version of Leap, 15.4.  Same issue there, but cpu1 is the runaway core instead of cpu0.  The CPU governor was set to powersave.  Checked from a rescue prompt.

openSUSE-Leap-15.4-NET-x86_64-Build243.2-Media
Comment 5 Giovanni Gherdovich 2024-02-27 08:30:14 UTC
Created attachment 873021 [details]
script to collect 20 seconds of "mpstat" and "turbostat" monitoring

Hello Rory,

can you run the attached script on Leap 15.5, then add the resulting tarball to the bug?
The script runs "mpstat" and "turbostat", for which you'll need to install the packages "sysstat" and "cpupower" respectively. "Turbostat" needs to run as root, so you'll need to invoke with:

  $ chmod +x bsc-12-20-290-monitor.sh
  $ sudo ./bsc-12-20-290-monitor.sh

"mpstat" gathers CPU utilization, "turbostat" is for the clock frequency statistics. They'll run for 20 seconds, then pack the output in a tarball. The script will also include the kernel command line and the "lscpu" output.

The reason I'm asking for this data collection is "turbostat" is more reliable than /proc/cpuinfo, and it'll help us establish that the problem is real and not an artifact of monitoring.
Comment 6 Rory A 2024-02-28 03:45:23 UTC
Created attachment 873065 [details]
cpu test output

Hi and thanks for the info.  Here is the output.  This is from 

openSUSE-Leap-15.5-Rescue-CD-x86_64-Media.iso

I notice that here both cores seem to be maxed out; on the other ISOs I tried, one core tended to max out while the other stayed idle.  Will re-run on another ISO.
Comment 7 Rory A 2024-02-28 05:20:57 UTC
Created attachment 873067 [details]
full install test results - openSUSE-Leap-15.5-DVD-x86_64-Build491.1

Attachment is test results from full install on : 

openSUSE-Leap-15.5-DVD-x86_64-Build491.1-Media.iso

prev test results were from Leap-15.5-Rescue-CD

wanted to do a 'real world' test off an install
Comment 8 Rory A 2024-02-28 07:42:19 UTC
Created attachment 873073 [details]
intel_pstate=passive lets me user 'powersave' to low MHz , but it does not scale up correctly

With kernel boot option intel_pstate=passive (or disable) I am able to set gov to powersave and the cpu no longer runs near max mhz.  But also does not scale up with load.  When I try other gov such as ondemand, conservative, etc, the CPU stays high and never scales down.

But at least the system is somewhat usable and I can manually control the CPU gov as needed, not ideal, but workable.

When I first boot with intel_pstate=passive 'ondemand' is active and the CPU is high.  Powersave is not listed as available.  But when I set the gov to powersave the CPU MHz return to normal and then powersave shows up under 

scaling_governor 
scaling_available_governors
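
For anyone repeating this check, here is a minimal sketch that prints the driver, the active governor and the advertised governors for every CPU in one go. It only reads the standard cpufreq sysfs files named above; the function name show_cpufreq and its optional root-directory argument are illustrative additions, not part of any tool mentioned in this bug.

```shell
#!/bin/sh
# Print the cpufreq driver, active governor and available governors per CPU.
# The optional first argument overrides the sysfs root; it is a hypothetical
# convenience for trying the sketch against a copy of the files.
show_cpufreq() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        printf '%s driver=%s governor=%s available=%s\n' \
            "$cpu" \
            "$(cat "$dir/scaling_driver")" \
            "$(cat "$dir/scaling_governor")" \
            "$(cat "$dir/scaling_available_governors")"
    done
}

show_cpufreq "$@"
```

Running it before and after switching the governor would show whether "powersave" only appears in the available list once it has been selected.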

Linux localhost 5.14.21-150500.53-default #1 SMP PREEMPT_DYNAMIC Wed May 10 07:56:26 UTC 2023 (b630043) x86_64 x86_64 x86_64 GNU/Linux

Thank you for your time and attention.

Kind regards
Comment 9 Rory A 2024-02-28 07:45:46 UTC
>"lets me user 'powersave' to low MHz , but it does not scale up correctly"

lets me use 'powersave' to lower MHz
Comment 10 Giovanni Gherdovich 2024-02-29 16:12:22 UTC
Created attachment 873124 [details]
monitoring script: irq time, turbostat, mpstat

Hello Rory,

your data captures from comment 6 and comment 7 confirm the problem is real. Given that they look the same, from now on you can use the Leap 15.5 live CD instead of a full installation, in case it's more practical.

1. SUMMARY: CPU-0 IS ONLY 25% IDLE, 35,000 IRQS/SEC
2. NEW REQUEST FOR DATA
3. ANALYSIS OF CURRENT DATA: FAST CLOCK CONFIRMED, HIGH BUSY%, MANY IRQS


1. SUMMARY: CPU-0 IS ONLY 25% IDLE, 35,000 IRQS/SEC
---------------------------------------------------
The reason CPU-0 is going at ~2GHz is that it is, actually, under load. It's active 75% of the time (should be close to zero instead). Turbostat shows 35K irqs/sec on that CPU which seems a little high.


2. NEW REQUEST FOR DATA
-----------------------
I'm attaching another script for data collection. It'll need the installation of package "bcc-tools", which is a collection of monitoring utilities based on the BPF in-kernel virtual machine. It's still collecting turbostat and mpstat data, so those tools are still needed. The bcc tools we'll be using are "hardirqs" and "softirqs", which measure time spent on "top halves" and "bottom halves" respectively. We're expecting some 75% of CPU time spent serving them, i.e. the unaccounted time discrepancy between turbostat (75% of time spent doing work) and mpstat (~0% of time spent between usermode and kernelmode). By the name of the IRQ handlers we'll hopefully understand which kind of IRQs we are talking about. Run it like:

  $ chmod +x bsc-12-20-290-monitor-2.sh
  $ sudo ./bsc-12-20-290-monitor-2.sh  


3. ANALYSIS OF CURRENT DATA: FAST CLOCK CONFIRMED, HIGH BUSY%, MANY IRQS
------------------------------------------------------------------------
What I'm seeing in the turbostat output from comment 6 is:

* [turbostat, Avg_MHz]: the "Avg_MHz" column gives ~2000 MHz on CPU 0, and ~10 MHz on CPU 1. This is the same as you observed from /proc/cpuinfo. Avg_MHz is the average frequency over the sampling period (1 second), including the idle time of the processor. Since your machine is idle, we expect the idle time to be almost 100%, and Avg_MHz to be just a few MHz as a result. But it isn't.

* [turbostat, Busy%]: the "Busy%" column gives ~75% on CPU 0, and about zero on CPU 1. "Busy%" indicates how much the CPU is active, meaning it's not turned off into a hardware idle state (also called C-State). This is why Avg_MHz is high on CPU 0: that CPU isn't idle at all, it's actually active about 75% of the time. What is it doing?

* [mpstat, idle]: the mpstat output shows it isn't running usermode code, and isn't running regular kernelmode code either. In fact, mpstat shows both CPUs are idle almost 100% of the time. Here "idle" means something different: the CPU time isn't accounted either as usermode or kernelmode, but isn't necessarily switched off (C-State discussed above for "Busy%"). Using the "jq" program (package "jq") to query the mpstat json output, I can do

  $ jq '.sysstat.hosts | .[] | .statistics | .[] | ."cpu-load" | .[] | select(.cpu=="0").idle | tonumber | round' mpstat.txt
  93 92 100 92 97 97 97 97 97 94 87 97 93 97 97 97 97 93 78 84 

  $ jq '.sysstat.hosts | .[] | .statistics | .[] | ."cpu-load" | .[] | select(.cpu=="1").idle | tonumber | round' mpstat.txt
  100 94 99 100 97 99 99 100 97 100 96 100 98 100 99 100 100 100 95 97 

* [turbostat, IRQ]: back to turbostat, the IRQ column shows CPU 0 is processing some 35K interrupts/second, which seems high. For example on my laptop I don't have that many (see below). I think this is what's keeping CPU 0 busy and driving up the clock.

    # turbostat --interval 1 --num_iterations 3 --quiet --show Core,CPU,IRQ
    Core    CPU     IRQ
    -       -       1717
    0       0       495
    0       2       389
    1       1       514
    1       3       319
    Core    CPU     IRQ
    -       -       1612
    0       0       481
    0       2       408
    1       1       464
    1       3       259
    Core    CPU     IRQ
    -       -       1636
    0       0       473
    0       2       415
    1       1       494
    1       3       254
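
To compare interrupt rates between machines without reading every sample by eye, the IRQ column can be averaged per CPU. The following is a minimal sketch that assumes the exact three-column layout produced by "--show Core,CPU,IRQ" above; the function name avg_irq_per_cpu is made up for illustration.

```shell
#!/bin/sh
# Average the IRQ column per CPU over all samples of turbostat output that
# was produced with "--show Core,CPU,IRQ". Reads stdin; header lines and
# the "-" summary rows are skipped because their CPU field is not numeric.
avg_irq_per_cpu() {
    awk '$2 ~ /^[0-9]+$/ { sum[$2] += $3; n[$2]++ }
         END { for (c in sum) printf "cpu%s %.0f\n", c, sum[c] / n[c] }' | sort
}

# Two of the three samples shown above, as example input.
avg_irq_per_cpu <<'EOF'
Core    CPU     IRQ
-       -       1717
0       0       495
0       2       389
1       1       514
1       3       319
Core    CPU     IRQ
-       -       1612
0       0       481
0       2       408
1       1       464
1       3       259
EOF
```

With a 1-second interval the averages are interrupts per second, so they can be set directly against the ~35K/sec seen on the affected CPU 0.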
Comment 11 Giovanni Gherdovich 2024-02-29 16:16:11 UTC
Created attachment 873126 [details]
screenshot, annotated turbostat output from comment 6

turbostat output from comment 6, annotated.

Avg_MHz is high on CPU 0 (~2GHz, idle machine). This is the reported problem.
Busy% is high on CPU 0 (CPU active ~75% of the time). This is why the clock is fast: the machine is under load, even if the user isn't doing anything.
IRQs are coming in at a rate of 35K/sec, which is high.

Current working hypothesis is the CPU time is spent serving interrupts.
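
The size of the gap between the two tools can be estimated roughly. The sketch below subtracts the non-idle share reported by mpstat from turbostat's Busy%; it assumes both tools sampled the same interval, and the function name unaccounted_busy is hypothetical. The sample figures are the CPU 0 idle values and the ~75% Busy% quoted in comment 10.

```shell
#!/bin/sh
# Rough estimate of busy time that mpstat does not attribute to any mode:
# turbostat Busy% minus the non-idle share reported by mpstat.
# Arguments: Busy%, followed by the per-second mpstat idle percentages.
unaccounted_busy() {
    busy="$1"; shift
    printf '%s\n' "$@" | awk -v busy="$busy" '
        { sum += $1; n++ }
        END {
            idle = sum / n
            printf "avg_idle=%.1f unaccounted=%.1f\n", idle, busy - (100 - idle)
        }'
}

unaccounted_busy 75 93 92 100 92 97 97 97 97 97 94 87 97 93 97 97 97 97 93 78 84
```

For these inputs mpstat averages roughly 94% idle, which leaves about 69 percentage points of activity that turbostat sees and mpstat does not attribute to anything.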
Comment 12 Giovanni Gherdovich 2024-02-29 16:37:56 UTC
Created attachment 873129 [details]
Landscape of cpufreq on Intel x86_64, three different "powersave" governors

(In reply to Rory A from comment #8)
> 
> intel_pstate=passive lets me user 'powersave' to low MHz , but it does not
> scale up correctly
> 
> With kernel boot option intel_pstate=passive (or disable) I am able to set
> gov to powersave and the cpu no longer runs near max mhz.  But also does not
> scale up with load.

Yeah I know it's confusing, but the "powersave" governor you get with intel_pstate=passive is not the same "powersave" you get under nominal conditions (default).

I'm attaching a diagram (dated 2020, but still informative) of the various cpufreq hardware/governor combinations you can get on x86. It says "server", but applies to laptop/desktop too.

* The "powersave" you get with intel_pstate=passive (labeled #1 in the diagram) always runs at the minimum clock (like you say, doesn't scale with load -- it always runs at the minimum clock).

* By default, Leap 15.5 on Intel gives you another "powersave": a governor made expressly for the intel_pstate driver. This can do one of two things, depending on the processor type:

  * on Atom processors such as yours, where the HWP feature isn't available, it scales the frequency according to the formula "Busy% * max_freq * 1.25". Labeled #3 in the diagram.

  * on server and client processors (non-Atom), if later than Skylake (2015), it configures the HWP feature (also called "Intel Speed Shift") setting Energy Performance Preference (EPP) to 128, which is mid-way between performance-oriented (EPP=0) and efficiency-oriented (EPP=255). That's label #2 in the diagram.
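
As a worked example of the "Busy% * max_freq * 1.25" formula quoted for the non-HWP case, here is a minimal sketch in integer arithmetic. The clamping to a minimum and maximum is an assumption of the sketch rather than something stated above, and the 800000 / 2700000 kHz limits are illustrative values that may not match the affected CPU.

```shell
#!/bin/sh
# Target frequency per the formula quoted above: Busy% * max_freq * 1.25.
# Clamping to [min, max] is an assumption of this sketch. Frequencies are
# in kHz, the unit used by the cpufreq sysfs files.
target_freq_khz() {
    busy_pct="$1"; min_khz="$2"; max_khz="$3"
    target=$(( busy_pct * max_khz * 125 / 10000 ))
    [ "$target" -lt "$min_khz" ] && target="$min_khz"
    [ "$target" -gt "$max_khz" ] && target="$max_khz"
    echo "$target"
}

target_freq_khz 75 800000 2700000   # CPU busy 75% of the time
target_freq_khz 5 800000 2700000    # nearly idle CPU
```

Under these assumptions a CPU that is 75% busy is asked to run close to its maximum, while a nearly idle one stays at the minimum, which fits the picture of a governor reacting to real load.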

The diagram I'm posting here works like this: follow the graph from "start" to a leaf, the path you get corresponds to a possible configuration.
Comment 13 Rory A 2024-03-01 02:21:54 UTC
Created attachment 873142 [details]
test results from Leap 15.5 XFCE live system

> 2. NEW REQUEST FOR DATA
> -----------------------
> I'm attaching another script for data collection. It'll need the
> installation of package "bcc-tools", which is a collection of monitoring
> utilities based on the BPF in-kernel virtual machine. It's still collecting
> turbostat and mpstat data, so those tools are still needed. The bcc tools
> we'll be using are "hardirqs" and "softirqs", which measure time spent on
> "top halves" and "bottom halves" respectively. We're expecting some 75% of
> CPU time spent serving them, i.e. the unaccounted time discrepancy between
> turbostat (75% of time spent doing work) and mpstat (~0% of time spent
> between usermode and kernelmode). By the name of the IRQ handlers we'll
> hopefully understand which kind of IRQs we are talking about. Run it like:

Hi Giovanni,

Thanks for the info.  

This is a testing laptop.  I have openSUSE Leap 15.5 installed but did some updates last night, so to continue testing under a default config I have downloaded the Leap 15.5 XFCE ISO and booted that.

openSUSE-Leap-15.5-XFCE-Live-x86_64-Media.iso

uname -a
5.14.21-150500.55.39-default #1 SMP PREEMPT_DYNAMIC

I installed bcc-tools and ran the command.  It did seem to complete but output this error.  I have attached its output.

Running turbostat, mpstat, hardirqs, softirqs for 20 seconds
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.14.21-150500.55.39-default
chdir(/lib/modules/5.14.21-150500.55.39-default/build): No such file or directory
Traceback (most recent call last):
  File "/usr/share/bcc/tools/softirqs", line 181, in <module>
    b = BPF(text=bpf_text)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 479, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.14.21-150500.55.39-default
chdir(/lib/modules/5.14.21-150500.55.39-default/build): No such file or directory
Traceback (most recent call last):
  File "/usr/share/bcc/tools/hardirqs", line 224, in <module>
    b = BPF(text=bpf_text)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 479, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>
Saved 2024-02-28.20-49-50.tgz

I will follow up on your other messages later.

Kind regards
Rory
Comment 14 Rory A 2024-03-01 02:40:56 UTC
Created attachment 873143 [details]
test 2 from diff laptop , XFCE Live Leap 15.5

Test results from a different laptop
Comment 15 Rory A 2024-03-01 03:22:58 UTC
> * The "powersave" you get with intel_pstate=passive (labeled #1 in the
> diagram) always runs at the minimum clock (like you say, doesn't scale with
> load -- it always runs at the minimum clock).

Thanks for the diagram and info, much appreciated.  It's interesting that driver intel_cpufreq has 2 different 'powersave' modes, the normal one and the intel_pstate=passive one.

When I use kernel option

intel_pstate=passive

I still seem to get the driver

intel_cpufreq

Commands :

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
intel_cpufreq

# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.14.21-150500.55.49-default root=UUID=9cd4eb94-ecc9-4e0e-bf60-e157df2ecb16 splash=silent resume=/dev/disk/by-uuid/bb0bb873-36e0-4cf5-ae7a-1c5fcc7d6514 preempt=full mitigations=auto intel_pstate=passive quiet security=apparmor

Whereas if I boot with intel_pstate=disable I do end up with acpi-cpufreq

# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.14.21-150500.55.49-default root=UUID=9cd4eb94-ecc9-4e0e-bf60-e157df2ecb16 splash=silent resume=/dev/disk/by-uuid/bb0bb873-36e0-4cf5-ae7a-1c5fcc7d6514 preempt=full mitigations=auto intel_pstate=disable quiet security=apparmor

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
acpi-cpufreq
Comment 16 Rory A 2024-03-01 03:32:15 UTC
As a side note I am able to make the system 'more usable' by booting into the working powersave mode then adjusting the min CPU MHz values thus speeding up the CPU.  

echo 1200000 | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq

Then when I'm done with whatever CPU intensive task I slow it back down.

echo 800000 | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq

Or I can change to the performance scaling_governor for a minute or two while I open firefox, or do an update, then change back to powersave.

For other users here is a bash script that makes 'manually governing' the CPU a bit easier.  But you'll want to adjust the min/max based on your CPU.  And presently 'E' for edit runs leafpad.  Replace that with whatever your editor is.

I hope posting this script here is not outside the rules!  Apologies if so.

This script refreshes every 2 seconds.

#!/bin/bash
# Manual cpufreq control menu; run as root. Adjust the frequencies to your CPU.
cpu0=/sys/devices/system/cpu/cpu0/cpufreq

# Write a value to the named cpufreq attribute of every CPU.
set_all() {
    echo "$2" | tee /sys/devices/system/cpu/cpu*/cpufreq/"$1"
}

while :
do
    clear
    echo "scaling_governor : $(cat "$cpu0/scaling_governor")"
    echo
    grep "MHz" /proc/cpuinfo
    echo
    echo "scaling_driver : $(cat "$cpu0/scaling_driver")"
    echo
    echo -e "1) Powersave \n2) Conservative \n3) Schedutil \n4) Performance \n5) Ondemand \n\n6) Max 1.4 GHz \n7) Max 2.5 GHz \n\n9) Min 900 MHz \n0) Min 1.2 GHz \n\nD) Defaults \nE) Edit \nQ) Quit"
    echo
    echo "scaling_min_freq : $(cat "$cpu0/scaling_min_freq")"
    echo "scaling_max_freq : $(cat "$cpu0/scaling_max_freq")"
    # Wait up to 1 second for a single key; letters work in either case.
    read -rsn1 -t 1 input
    case "$input" in
        1) set_all scaling_governor powersave ;;
        2) set_all scaling_governor conservative ;;
        3) set_all scaling_governor schedutil ;;
        4) set_all scaling_governor performance ;;
        5) set_all scaling_governor ondemand ;;
        6) set_all scaling_max_freq 1400000 ;;
        7) set_all scaling_max_freq 2500000 ;;
        9) set_all scaling_min_freq 900000 ;;
        0) set_all scaling_min_freq 1200000 ;;
        d|D) set_all scaling_min_freq 800000
             set_all scaling_max_freq 2700000 ;;
        q|Q) exit ;;
        e|E) leafpad mon.sh
             exit ;;
    esac
    # Keep any output on screen for 2 seconds before redrawing.
    sleep 2
done
Comment 17 Rory A 2024-03-01 04:00:18 UTC
(In reply to Rory A from comment #16)
> As a side note I am able to make the system 'more usable' by booting into
> the working powersave mode then adjusting the min CPU MHz values thus
> speeding up the CPU.  
> 
> echo 1200000 | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
> 
> Then when I'm done with whatever CPU intensive task I slow it back down.
> 
> echo 800000 | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq

Side note to others trying to make their system usable while troubleshooting this: booting with intel_pstate=disable gives driver acpi-cpufreq, and it does not respect any limits I manually set like this:

echo 1400000 | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq

But if I boot with intel_pstate=passive I still get the driver intel_cpufreq, which will respect these limits AND let me use 'powersave' to idle the cpu.

Sorry if this is too offtopic for the bug report.  Trying to help other people coming from Google.
Comment 18 Rory A 2024-03-01 06:54:19 UTC
Hi Giovanni,

I did a full install of openSUSE Tumbleweed and the CPU is scaling very well, but I am happy to keep testing on this issue off the Live USB.  Just wanted to note this is working well.

Sorry for so many comments!

Kind regards

Kernel : 6.7.6-1-default #1 SMP PREEMPT_DYNAMIC
Comment 19 Rory A 2024-03-01 07:31:16 UTC
> I did a full install of openSUSE Tumbleweed and the CPU is scaling very
> well

I don't want to get too off topic but what specifically works well is if I switch to the 'conservative' gov.  It scales up and down very well in Tumbleweed.
Comment 20 Rory A 2024-03-04 06:39:09 UTC
I do not think this is an openSUSE-specific issue.  I was testing Slackware on kernel 5.15 this weekend and saw the same issue on a different Intel-based laptop.  CPU cores near/at max MHz for long periods under minimal load.
Comment 21 Giovanni Gherdovich 2024-03-07 12:53:44 UTC
Hello Rory,

please let's keep the focus on one line of investigation at a time. I appreciate that you've reported this bug, and your inclination to gather additional data, but there are more productive ways to proceed, and less productive ways.
The more you write, the more I need to read before I can reply, which consequently takes more time before we reach the root cause of the problem. Also, my replies get longer as I need to address all your points, which in turn is more for you to read; more text makes misunderstandings more likely, and the whole story takes forever.

I suggest we focus on running the script I attached at comment 10, without error. The output you show at comment 13 indicates something went wrong: "modprobe: FATAL: Module kheaders not found".
There's a dependency I forgot to mention (sorry): the BPF-based tools need the kernel header files. You can install kernel headers with:

  zypper install kernel-default-devel

Then, run the comment 10 script again (using the setup from comment 0) and attach the output.
Comment 22 Giovanni Gherdovich 2024-03-07 12:54:21 UTC
Regarding comment 14:

This second laptop you're using (1) has a very different power management setup and (2) has a clock frequency that is totally fine (average freq very low). With respect to the diagram at comment 12, this second laptop follows the configuration path

  x86_64 → Intel → intel_pstate → active → HWP_on → powersave #2 (EPP=128)

while the other laptop, on which we're having the problem, is like

  x86_64 → Intel → intel_pstate → active → HWP_off → powersave #3

The clock freq of this second laptop is fine, as it's shown in the Avg_MHz column of the turbostat output. Here I've re-organized the data for clarity; the figures are in MHz (average freq including idle time):

         1s  2s  3s  4s  5s  6s  7s  8s  9s  10s  11s  12s  13s  14s  15s  16s  17s  18s  19s  20s
  cpu0   10   1   6   1   3   9   5   7  18   14    2    7    5    2    7    2    2   12    2    2
  cpu1   74   2  10   2   4   8   2   3   6    1    2    2    2    2    3    1    2    4    2    2
  cpu2  223   1   9   1   0   6   1   1  14    2    1   14    1    1    1    1    1    5    1    1
  cpu3   20   4  13   3   5   5   3  10  24   16    3   14    3    4   11    3    3   19   20    6
  cpu4   10   1  10   1   1   4   1   3   7    1    1    7    2    2    6    0    1    5   16    4
  cpu5   28   1   6   1   1   5   1  10  23   25    1    6    1    1    7    2    1   14   34    6
  cpu6  216   6  10  15   9   9  11   8  11   11    8    8   14    6   27   10    9   21   12    6
  cpu7   10   2   3   2   1  17   2   2   3    2    2    3    2    2    3    2    2    3    2    2

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Regarding comment #15

When running intel_pstate in passive mode, the intel_pstate driver renames itself to "intel_cpufreq".  Please consult the kernel documentation page at https://www.kernel.org/doc/html/latest/admin-guide/pm/intel_pstate.html , it explains this.

When you disable intel_pstate, an alternative driver is used, "acpi_cpufreq". Again from the diagram at comment 12, this is the path

  x86_64 → Intel → acpi_cpufreq → ...
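
The mapping between the boot option and the driver name reported in scaling_driver can be written down as a small lookup. This is a sketch of what the thread describes for an Intel machine, not an authoritative table: the function name expected_driver is invented here, and with no intel_pstate= option it assumes active mode, as described for Leap 15.5 in comment 12; other kernels or CPUs may default differently.

```shell
#!/bin/sh
# Map the intel_pstate= kernel option to the scaling_driver name reported
# in this thread. Takes a kernel command line as a single string.
expected_driver() {
    mode=active
    for word in $1; do
        case "$word" in
            intel_pstate=*) mode="${word#intel_pstate=}" ;;
        esac
    done
    case "$mode" in
        passive) echo intel_cpufreq ;;
        disable) echo acpi-cpufreq ;;
        *)       echo intel_pstate ;;
    esac
}

expected_driver "$(cat /proc/cmdline 2>/dev/null)"
```

Comparing its output with /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver is a quick way to confirm that a boot option actually took effect.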


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Regarding comment #16

There's a helper tool to set the cpufreq governor, "cpupower" from the "cpupower" package. In any event, if you feel you need a different governor just to open the browser, that is a bug and it's what we're trying to address here. I saw a large number of interrupts on the 2-core machine, and would like to know what they are (script at comment 10). If we manage to get to the bottom of this, you shouldn't need to change the governor every 2 seconds like you're suggesting at comment 16.

There is openSUSE documentation on the "cpupower" tool: https://doc.opensuse.org/documentation/leap/tuning/html/book-tuning/cha-tuning-power.html

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Regarding comment #17

I'm not sure why acpi_cpufreq isn't respecting limits, but if that's the case, that's a different problem from what you reported at comment 0, and needs a different bugzilla ticket, for it to be addressed separately.
If we could limit the scope of the current investigation to the one problem you originally reported, we may have a chance of actually finding the cause and thinking of a solution.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Regarding comment #18 and #19

If the original 2-core laptop is working well under Tumbleweed and not under Leap, this is extremely valuable information. Tumbleweed runs straight unmodified upstream kernels, while Leap runs a kernel with our SUSE patches.
Your experience suggests there's a problem with our SUSE patches, and I'm very interested in your data, because said problem needs to be resolved.

But you also say you're running "conservative", which suggests this configuration path:

 x86_64 → Intel → intel_pstate → passive → HWP_off → conservative

That is (1) suboptimal, as Intel machines are supposed to run intel_pstate active and (2) you've never reported about "conservative" on Leap, so we can't make a comparison with the data we saw earlier. I would like to see

  x86_64 → Intel → intel_pstate → active → HWP_off → powersave #3

which is the setup you had at comment 0 on the 2-core machine.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Regarding comment #20

Ok so it could be related to the kernel version. Leap 15.5 and your Slackware are 5.14 and 5.15 respectively, Tumbleweed is 6.something (6.6 I believe?).


The order of business now is: install kernel-default-devel, run the script from comment 10 on the setup you had at comment 0.
Comment 23 Rory A 2024-03-13 06:18:13 UTC
Unfortunately installing kernel-default-devel did not resolve the error.  Here is copy/paste from the terminal confirming I did install it, and the script errors.

linux@localhost:~> su -
localhost:/home/linux # zypper install kernel-default-devel
Loading repository data...
Reading installed packages...
'kernel-default-devel' is already installed.
No update candidate for 'kernel-default-devel-5.14.21-150500.55.52.1.x86_64'. The highest available version is already installed.
Resolving package dependencies...
Nothing to do.
localhost:/home/linux # ./bsc-12-20-290-monitor-2.sh 
Running turbostat, mpstat, hardirqs, softirqs for 20 seconds
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.14.21-150500.55.39-default
chdir(/lib/modules/5.14.21-150500.55.39-default/build): No such file or directory
Traceback (most recent call last):
  File "/usr/share/bcc/tools/softirqs", line 181, in <module>
    b = BPF(text=bpf_text)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 479, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.14.21-150500.55.39-default
chdir(/lib/modules/5.14.21-150500.55.39-default/build): No such file or directory
Traceback (most recent call last):
  File "/usr/share/bcc/tools/hardirqs", line 224, in <module>
    b = BPF(text=bpf_text)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 479, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>
Saved 2024-03-13.01-10-01.tgz
Comment 24 Rory A 2024-03-15 06:59:04 UTC
Giovanni,

I see that Leap 15.6 Beta is out and that new kernel likely resolves this issue.  Given that, if you would like to close this ticket I understand.

Kind regards
Comment 25 Giovanni Gherdovich 2024-03-15 18:00:03 UTC
Created attachment 873558 [details]
Update monitoring script. Reads /proc/interrupts and /proc/softirqs

Hello Rory,

I'm happy that 15.6 beta works for you, but since 15.5 is going to be supported until December 2024, I'd like to at least check if the interrupt hypothesis holds water.
That is, if it's actually the high number of interrupts that keep the cpu active and prevent it from going idle, resulting in high average clock frequency.
Of course that is contingent on you having time and opportunity to gather data. No obligation.

If you agree to proceed, first we need to establish the kernel versions we're going to test.

There are two Leap 15.5 kernels I'm interested in:

- kernel-default-5.14.21-150500.53.2.x86_64.rpm
  This is the stock kernel that comes with a fresh Leap 15.5 installation, and is the one you used when you sent data at comment 6 and comment 7.
  It is found in the repository http://download.opensuse.org/distribution/leap/15.5/repo/oss

- kernel-default-5.14.21-150500.55.52.1.x86_64.rpm
  This is the latest kernel update for Leap 15.5, released last March 6.
  It is found in the repository http://download.opensuse.org/update/leap/15.5/sle

Then I'll ask you to install kernel headers matching these versions (I suspect the errors at comment 13 and comment 23 are due to the headers version not matching the kernel version), and run the attached script on both.
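
Whether headers exist for a particular kernel can be checked before running the script. The sketch below looks for the "build" directory named in the chdir() error from comment 13 and comment 23; the function name headers_present and its optional second argument (an alternative modules directory) are hypothetical additions for illustration.

```shell
#!/bin/sh
# Check whether kernel headers are present for a given kernel release by
# looking for the "build" directory that the bcc error message refers to.
# Arguments: kernel release (default: running kernel), modules root
# (default: /lib/modules).
headers_present() {
    release="${1:-$(uname -r)}"
    modules_root="${2:-/lib/modules}"
    if [ -e "$modules_root/$release/build" ]; then
        echo "headers found for $release"
    else
        echo "headers missing for $release"
        return 1
    fi
}

headers_present || echo "install the kernel-default-devel version matching $(uname -r)"
```

If the check fails for the running kernel while a newer kernel-default-devel is installed, that would be consistent with the suspected mismatch.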

Here are the steps:

1. [Check the zypper repositories]
   Make sure you have the "Leap 15.5 OSS" and "SLE 15.5 Update" repositories enabled. It is generally the case, but best be sure. You'll check that with the command:

   $ zypper repos --uri

   Among all repositories in the list, you should see two with the urls I mentioned above. If you don't, add them like so:

   $ zypper addrepo http://download.opensuse.org/distribution/leap/15.5/repo/oss openSUSE-Leap-15.5-Oss
   $ zypper addrepo http://download.opensuse.org/update/leap/15.5/sle repo-sle-update
   $ zypper refresh

2. [Search kernels using the "--details" option]
   Search for the two kernels we need, ie the initial Leap 15.5 kernel and the latest update:

   $ zypper search --type package --details /^kernel-default$/

   Since we're giving the "--details" option, we'll be seeing all versions, not only the latest one. The output should resemble the following:

   S  | Name           | Type    | Version                 | Arch   | Repository
   ---+----------------+---------+-------------------------+--------+-----------------------
   v  | kernel-default | package | 5.14.21-150500.55.52.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.49.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.44.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.39.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.36.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.31.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.28.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.22.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.19.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.12.1  | x86_64 | repo-sle-update
   v  | kernel-default | package | 5.14.21-150500.55.7.1   | x86_64 | repo-sle-update
   i  | kernel-default | package | 5.14.21-150500.53.2     | x86_64 | openSUSE-Leap-15.5-Oss

   In the first column, "i" means "installed", "v" means "a different version is installed". Unlike other packages, there can be multiple versions of the kernel installed at the same time.

3. [Install kernels]
   Install (if they aren't already) the first and last of that list, meaning the initial 15.5 kernel and the latest update. This part is tricky! You'll need to specify both the package name (kernel-default, second column) and the version (fourth column). The way you do that is joining the name and the version with a "dash" sign (-), like so:

   $ zypper install kernel-default-5.14.21-150500.53.2
   $ zypper install kernel-default-5.14.21-150500.55.52.1

4. [Search with "--details" and install kernel headers with matching versions]
   Search for, and install, the kernel-default-devel packages (which contain the kernel headers needed for that script to collect interrupt data) with versions matching those of the kernels we just installed. Again, I think a version mismatch between kernel and headers is the reason for the errors at comment 13 and comment 23. The process is very similar to before:

   $ zypper search --type package --details /^kernel-default-devel$/
   $ zypper install kernel-default-devel-VERSION-GOES-HERE

5. [Collect data for kernel 5.14.21-150500.53.2]
   Reboot, and use the grub menu to select the 5.14.21-150500.53.2 kernel. Run the attached script, which should hopefully work. Attach the .tgz file it produces.

6. [Collect data for kernel 5.14.21-150500.55.52.1]
   Same story for the other kernel (latest update). Attach the result.

7. [Collect system information with "supportconfig"]
   There is a script that generates a summary of the machine hardware and configuration. It's from the "supportutils" package. The command is "supportconfig". Please run it and attach the result.
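
Optional sanity check for steps 3 and 4, before rebooting: the sketch below lists each installed kernel-default version and says whether a kernel-default-devel of the same version is present. Treat it as a rough helper under assumptions, not as part of the attached script: compare_versions is a name made up for illustration, and I'm assuming the rpm query format '%{VERSION}-%{RELEASE}' prints the same version strings zypper shows in its "Version" column.

```shell
#!/bin/sh
# Hypothetical helper, not part of the attached monitoring script.
# compare_versions KERNELS HEADERS
#   KERNELS: newline-separated kernel-default versions
#   HEADERS: newline-separated kernel-default-devel versions
# Prints one line per kernel version; returns 1 if any kernel has no
# headers package with exactly the same version string.
compare_versions() {
    rc=0
    for v in $1; do
        # -x: whole-line match, -F: fixed string (dots are not wildcards)
        if printf '%s\n' "$2" | grep -qxF -- "$v"; then
            echo "OK       $v"
        else
            echo "MISSING  $v"
            rc=1
        fi
    done
    return $rc
}

# Usage on the installed system:
#   kernels=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-default)
#   headers=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-default-devel)
#   compare_versions "$kernels" "$headers"
```

If a line says MISSING, repeat step 4 for that version before collecting data.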
Comment 26 Rory A 2024-03-17 22:09:38 UTC
Created attachment 873583 [details]
Script results from Live ISO and Leap using 2 different kernels

The attached zip contains a number of files.

Initially I was testing on Leap 15.5 XFCE 'live' ISO.  The rescue system mentioned previously is no longer available, apologies.

* BSC script results for this 'live' ISO on kernel 55.39 : 

2024-03-15.20-47-47 -- Live ISO -- inside XFCE -- 55.39-default.tgz
2024-03-15.20-53-37 -- Live ISO -- kill XFCE and X -- 55.39-default.tgz

I attempted to upgrade this kernel but ran into issues because it is a 'live ISO' system.
Given this I did a fresh install of Leap 15.5 onto a flash drive (this is very slow and I do not recommend it) because I did not want to mess with my hard drive.

I then booted up, ran the BSC script, upgraded the kernel, rebooted, ran the BSC script again.  No errors were encountered.

* Leap full install BSC results for 53-default : 

2024-03-16.20-42-20 -- Leap 15.5 -- inside XFCE -- 150500.53-default.tgz
2024-03-16.20-46-38 -- Leap 15.5 -- kill XFCE and X -- 150500.53-default.tgz

* Leap BSC results with upgraded kernel 55.52-default :

2024-03-17.15-56-10 -- Leap 15.5 -- inside XFCE -- 150500.55.52-default.tgz
2024-03-17.16-36-14 -- Leap 15.5 -- kill XFCE and X -- 150500.55.52-default.tgz

* Results of command supportconfig

scc_localhost.localdomain_240316_2121.txz
Comment 27 Rory A 2024-03-18 02:11:22 UTC
Created attachment 873585 [details]
test results after options i2c-i801 disable_features=0x10

I apologize in advance if this is 'information overload'.

Going off your insights and comment about IRQ I did a search and found a few other people / threads on a similar issue [links at bottom].  

I use this command for a quick look at IRQs and see 'i801_smbus' is off the charts (image : before.png).

watch -n1 -d cat /proc/interrupts

I know this is a very simple version of your detailed scripts and troubleshooting!
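
For a rough number instead of eyeballing the highlighted deltas in watch, two snapshots of /proc/interrupts can be diffed. The following is only a sketch under assumptions: irq_delta is a made-up helper name, the snapshot paths are arbitrary, and it assumes the usual x86 layout of /proc/interrupts (a header line with one column per CPU, then per-CPU counters followed by a description).

```shell
#!/bin/sh
# Hypothetical helper: irq_delta BEFORE AFTER
# Sums the per-CPU counters of every interrupt line in two snapshot files
# and prints "<increase> <irq and description>", largest increase first.
irq_delta() {
    awk 'FNR == 1 { ncpu = NF; next }   # header line: CPU0 CPU1 ...
         {
             total = 0
             # fields 2 .. ncpu+1 are the per-CPU counters
             for (i = 2; i <= ncpu + 1 && $i ~ /^[0-9]+$/; i++)
                 total += $i
             name = $1
             for (; i <= NF; i++)
                 name = name " " $i
             if (NR == FNR) {
                 before[name] = total    # first file: remember the count
             } else if (total > before[name]) {
                 print (total - before[name]), name
             }
         }' "$1" "$2" | sort -rn
}

# Usage (10 second window; divide by 10 for interrupts per second):
#   cat /proc/interrupts > /tmp/irq.before
#   sleep 10
#   cat /proc/interrupts > /tmp/irq.after
#   irq_delta /tmp/irq.before /tmp/irq.after | head
```

On an affected machine the i801_smbus line would be expected near the top of that list.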

I then see someone saying they made this change as a temporary work-around [links at bottom]

Create this file : /etc/modprobe.d/i801-fix.conf

Add this line to it, then reboot : 

options i2c-i801 disable_features=0x10

After this the cpu mhz seem to be more dynamic and I see them both sometimes falling under 2GHz, which I did not see before.  
CPU is still not reaching idle but doing somewhat better than before.
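
For anyone repeating this, one way to confirm that the option was actually picked up after the reboot is to read the module parameter back from sysfs. This is a sketch under assumptions: i801_irq_disabled is just a made-up helper name, and it assumes the i2c_i801 module exposes disable_features under /sys/module/i2c_i801/parameters/ and reports it as a decimal number. As far as I understand the driver documentation, bit 0x10 (16) is the "don't use interrupts" flag.

```shell
#!/bin/sh
# Hypothetical helper: i801_irq_disabled [PARAM_FILE]
# Reads the disable_features value and checks whether bit 16 (0x10) is set.
# The default path is an assumption about where the parameter is exposed.
i801_irq_disabled() {
    param=${1:-/sys/module/i2c_i801/parameters/disable_features}
    if [ ! -r "$param" ]; then
        echo "cannot read $param" >&2
        return 2
    fi
    value=$(cat "$param")
    if [ $(( value & 16 )) -ne 0 ]; then
        echo "i2c-i801 interrupts disabled (disable_features=$value)"
    else
        echo "i2c-i801 interrupts still enabled (disable_features=$value)"
        return 1
    fi
}

# Usage after the reboot:
#   i801_irq_disabled
```

If it reports "still enabled", the .conf file was probably not read, for example because of a typo in the file name or the option line.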

Attached 'after.png' of the same IRQ output seen in before.png.

Also re-ran your bsc script.

irq-info.zip contents : 

* show IRQ monitor output from the command : watch -n1 -d cat /proc/interrupts

before.png
after.png

* Your bsc script run on kernel 55.52 inside XFCE

2024-03-17.20-17-27 -- Leap 15.5 -- i801modprobe -- inside XFCE -- 150500.55.52-default.tgz

* Your bsc script run on kernel 55.52 at tty with XFCE and X killed

2024-03-17.20-30-46 -- Leap 15.5 -- i801modprobe -- kill XFCE and X -- 150500.55.52-default.tgz.tgz

* Your bsc script run on kernel 53-default inside XFCE

2024-03-17.20-40-55 -- Leap 15.5 -- i801modprobe -- inside XFCE -- 150500.53-default.tgz

* Your bsc script run on kernel 53-default at tty with XFCE and X killed

2024-03-17.20-45-24 -- Leap 15.5 -- i801modprobe -- kill XFCE and X -- 150500.53-default.tgz

Links

https://www.reddit.com/r/archlinux/comments/omn5yf/high_interrupt_rate_for_i801_smbus_on_atom_c2000/

https://bugzilla.kernel.org/show_bug.cgi?id=177311