Bug 1218455 - kernel: NOHZ tick-stop error: local softirq work is pending
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
Priority: P5 - None  Severity: Major
Target Milestone: ---
Assignee: Frederic Weisbecker
QA Contact: E-mail List
Reported: 2023-12-29 16:50 UTC by Michael Hirmke
Modified: 2024-07-01 18:32 UTC
CC List: 2 users

Description Michael Hirmke 2023-12-29 16:50:57 UTC
Starting with kernel 6.6.6 (and continuing with 6.6.7) I get

kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!

a few times within one or two minutes, and during this time performance is heavily degraded.
This happens about every two days, so it is not that much of a problem, but it is a bit annoying.

system is:

Operating System: openSUSE Tumbleweed 20231226
KDE Plasma Version: 5.27.10
KDE Frameworks Version: 5.113.0
Qt Version: 5.15.11
Kernel Version: 6.6.7-1-default (64-bit)
Graphics Platform: X11
Processors: 8 × 11th Gen Intel® Core™ i7-1165G7 @ 2.80GHz
Memory: 15,0 GiB of RAM
Graphics Processor: Mesa Intel® Xe Graphics
Manufacturer: Dell Inc.
Product Name: XPS 13 9310 2-in-1
Comment 1 Takashi Iwai 2024-01-08 15:24:39 UTC
Frederic, could you take a look?
Comment 2 Michael Hirmke 2024-02-14 17:00:23 UTC
No one? The problem still exists. It now happens only about once a week, but then it happens about 10 times within 15 minutes.
Comment 3 Frederic Weisbecker 2024-02-18 23:21:29 UTC
Handler 8 is NET_RX_SOFTIRQ. This means that the CPU is going idle with a NET_RX_SOFTIRQ softirq pending. This is not supposed to happen on an online CPU because at this stage, if a softirq is pending, ksoftirqd should be pending, thus need_resched() should be set, and the tick shouldn't be stoppable.

One possibility though is that the CPU is going offline after the smpboot kthreads (including ksoftirqd) have been parked (past CPUHP_AP_SMPBOOT_THREADS). In that case need_resched() may not be set even if softirqs are pending. But isn't networking supposed to deal with that somehow?
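
For readers decoding the message: as far as I can tell from kernel/time/tick-sched.c, the number after "handler #" is the pending-softirq bitmask printed in hex rather than a vector index, so #08 is bit 3, which is NET_RX_SOFTIRQ. A minimal Python sketch of that decoding follows. The name list mirrors the softirq enum in include/linux/interrupt.h, and decode_pending is a hypothetical helper written for this illustration, not a kernel or distribution tool.

```python
# Softirq vector names in index order, following the enum in the Linux
# kernel's include/linux/interrupt.h (same order as /proc/softirqs).
SOFTIRQ_NAMES = [
    "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
    "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
]


def decode_pending(mask_hex: str) -> list[str]:
    """Return the softirq names whose bits are set in a hex bitmask.

    Hypothetical helper: assumes the value after "handler #" in the
    NOHZ warning is the pending bitmask in hexadecimal.
    """
    mask = int(mask_hex, 16)
    return [
        f"{name}_SOFTIRQ"
        for bit, name in enumerate(SOFTIRQ_NAMES)
        if mask & (1 << bit)
    ]


print(decode_pending("08"))  # prints ['NET_RX_SOFTIRQ']
```

A mask with several bits set would list several vectors; only 08 appears in this report.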
Comment 4 Michael Hirmke 2024-02-19 10:44:06 UTC
Sorry, I don't have enough knowledge to answer this.
Comment 5 Michael Hirmke 2024-06-02 12:20:15 UTC
Problem still exists with kernel 6.9.1-1-default.
Comment 6 Michael Hirmke 2024-07-01 14:48:40 UTC
The problem is getting worse: I now get this about 50 times a day and my network connections are dying.
Comment 7 Michael Hirmke 2024-07-01 14:49:09 UTC
Operating System: openSUSE Tumbleweed 20240629
KDE Plasma Version: 6.1.1
KDE Frameworks Version: 6.3.0
Qt Version: 6.7.2
Kernel Version: 6.9.7-1-default (64-bit)
Graphics Platform: X11
Processors: 8 × 11th Gen Intel® Core™ i7-1165G7 @ 2.80GHz
Memory: 15,0 GiB of RAM
Graphics Processor: Mesa Intel® Xe Graphics
Manufacturer: Dell Inc.
Product Name: XPS 13 9310 2-in-1
Comment 8 Michael Hirmke 2024-07-01 18:32:29 UTC
Jul 01 11:11:44 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:13:03 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:13:17 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:29:35 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:31:15 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:45:25 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:45:38 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:49:30 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:49:33 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 11:49:58 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:19:57 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:20:19 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:22:01 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:22:17 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:22:36 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:25:22 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:28:06 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:28:43 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:31:22 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:42:22 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:42:36 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:43:24 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:43:41 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:44:59 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:45:19 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:46:07 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:47:39 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:48:21 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 16:48:38 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:21:03 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:22:08 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:23:26 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:23:36 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:25:47 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:26:07 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:28:27 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!
Jul 01 20:28:39 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!

Every time this happens, the system freezes and the network is not reachable.
It is becoming impossible to work 8-<
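
To put numbers on how the warnings cluster, a small script can bucket the log lines by hour. The sketch below is only an illustration and assumes the default short journal format shown above ("Mon DD HH:MM:SS host kernel: ..."); count_per_hour and the sample list are hypothetical names, and the sample reuses three lines from this comment.

```python
import re
from collections import Counter

# Matches the short journal format used in the log excerpt above.
LINE_RE = re.compile(
    r"^(\w{3} \d{2}) (\d{2}):\d{2}:\d{2} \S+ kernel: "
    r"NOHZ tick-stop error: local softirq work is pending"
)


def count_per_hour(lines):
    """Count NOHZ tick-stop warnings per (date, hour) bucket."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            day, hour = match.groups()
            counts[f"{day} {hour}:00"] += 1
    return counts


sample = [
    "Jul 01 11:11:44 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!",
    "Jul 01 11:13:03 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!",
    "Jul 01 16:19:57 client kernel: NOHZ tick-stop error: local softirq work is pending, handler #08!!!",
]

for bucket, n in sorted(count_per_hour(sample).items()):
    print(bucket, n)
```

Run over the full kernel log, this would show whether the warnings arrive in bursts (as on Jul 01 around 11:00, 16:00 and 20:00) or spread evenly, which may help correlate them with network activity.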