Bugzilla – Full Text Bug Listing

| Summary: | System hangs on stop job for irqbalance after Upgrade from 15.4 to 15.5 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Rainer Kaluscha <rainer.kaluscha> |
| Component: | Upgrade Problems | Assignee: | Thomas Renninger <trenn> |
| Status: | RESOLVED INVALID | QA Contact: | Jiri Srain <jsrain> |
| Severity: | Minor | ||
| Priority: | P5 - None | CC: | jcheung, rainer.kaluscha |
| Version: | Leap 15.5 | Flags: | aschnell: needinfo? (rainer.kaluscha) |
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE Leap 15.5 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | strace-log for irqbalance 1.9.2 after systemctl stop - takes 30 sec; strace-log for irqbalance 1.9.3 after systemctl stop - no delay | ||
Description
Rainer Kaluscha
2023-08-16 06:56:05 UTC
Can you try a build from here please: https://build.opensuse.org/package/show/home:trenn/irqbalance.SUSE_SLE-15-SP5_Update

FYI: I'll be on vacation from tomorrow on.

Have a nice vacation. Unfortunately, irqbalance-1.9.2-150500.1.1.x86_64.rpm from your repo also hangs on systemctl stop irqbalance.service :-( Going back to 1.91 ...
So long, Rainer

And irqbalance-1.9.2-150500.1.2 hangs, too ...

If you want you can try the latest version from https://build.opensuse.org/package/show/home:aschnell:branches:Base:System/irqbalance. And please if you give version numbers include the complete version, e.g. on 15.4 we have 1.8.0.18.git+2435e8d.

Can you try to attach strace to the irqbalance process and then use 'systemctl stop'? E.g. 'strace -o irqbalance.log -tt -p <pid> /usr/sbin/irqbalance' where you get the pid from 'systemctl status irqbalance.service'?

irqbalance-1.9.3.10.git+1a7d461-150500.260.1.x86_64.rpm works on my primary Linux box - no hang / delay when stopping the service. Great, thnx !
P.S. I will test also on my laptop in the next days ...

OK, that is good to hear. Unfortunately we still do not know what actually caused the hang. I cannot say whether we can make an update for 15.5 based on that.

Fortunately, the error is reproducible. If it helps, I can still do an strace on a buggy version, e.g. the current version from OSS 15.5 main repo (1.9.2-150500.1.3).

Created attachment 871348 [details]
strace-log for irqbalance 1.9.2 after systemctl stop - takes 30 sec
Created attachment 871349 [details]
strace-log for irqbalance 1.9.3 after systemctl stop - no delay
irqbalance-1.9.3.10.git+1a7d461-150500.260.1.x86_64.rpm
Comment on attachment 871348 [details]
strace-log for irqbalance 1.9.2 after systemctl stop - takes 30 sec
irqbalance-1.9.2-150500.1.3.x86_64.rpm
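To see where a stop request stalls in a log like the ones attached, it can help to look for the longest pause between two consecutive trace lines. The following is a minimal sketch, not something used in this report: `largest_gap` is a hypothetical helper name, and it assumes the log was captured with `strace -tt` as requested above (each line starting with `HH:MM:SS.uuuuuu`) and that the trace does not cross midnight.

```shell
#!/bin/sh
# Hypothetical helper (not part of irqbalance or strace): print the largest
# pause between consecutive timestamped lines of an `strace -tt` log.
# Assumption: lines start with HH:MM:SS.uuuuuu and the trace stays within one day.
largest_gap() {
    awk '
        # Only consider lines whose first field looks like a -tt timestamp.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+$/ {
            split($1, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
            if (seen && now - prev > max) {
                max = now - prev
                line = prevline                      # line printed before the pause
            }
            prev = now
            prevline = $0
            seen = 1
        }
        END {
            # Report the pause and the trace line that preceded it, if any.
            if (line != "") printf "%.3f s after: %s\n", max, line
        }
    ' "$1"
}
```

Calling `largest_gap irqbalance.log` prints the length of the longest pause together with the trace line that preceded it, which is typically the system call the process was blocked in.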
Thanks for the logs. AFAIS irqbalance is stuck in recvmsg() which is used when communicating with irqbalance-ui. Are you using irqbalance-ui or any other UI connected to irqbalance?

"systemctl stop irqbalance.service" hangs with irqbalance-1.9.2-150500.1.3 even if irqbalance-ui isn't installed. The service comes up without errors, it just warns about "thermal: received a netlink error (Interrupted system call)".
P.S. Did I mention that I disabled IPv6 on my box (using sysctl: net.ipv6.conf.all.disable_ipv6 = 1)?

Looks to be related to https://github.com/Irqbalance/irqbalance/issues/259. Unfortunately the fix is spread over several commits. Still I wonder why I cannot reproduce it. Do you have the standard SUSE kernel (with CONFIG_THERMAL_NETLINK=y)?

Bingo :-) Enabling CONFIG_THERMAL_NETLINK=y resolved the issue ... The startup message changed from "thermal: received a netlink error" to "thermal: received group id (3)", and the service stops without delay when asked to do so.

So in the initial report you wrote that you tested with different kernels including the "distro" kernel. But apparently that did not mean the 15.5 kernel RPM which has CONFIG_THERMAL_NETLINK=y.

Looks invalid to me.
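For anyone hitting the same hang with a self-built kernel, the option discussed above can be checked before digging further. This is a minimal sketch under stated assumptions: `check_thermal_netlink` is a hypothetical helper name, and the default path `/boot/config-$(uname -r)` is an assumption about where the running kernel's config is installed; pass a different file if your kernel keeps it elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel config file enables
# CONFIG_THERMAL_NETLINK. The default path is an assumption; pass the
# config file of your kernel as the first argument if it lives elsewhere.
check_thermal_netlink() {
    config="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 2
    fi
    # A built-in option appears as an exact "NAME=y" line in the config.
    if grep -q '^CONFIG_THERMAL_NETLINK=y$' "$config"; then
        echo "CONFIG_THERMAL_NETLINK=y"
    else
        echo "CONFIG_THERMAL_NETLINK is not enabled"
        return 1
    fi
}
```

Running `check_thermal_netlink` without arguments inspects the config of the running kernel; `check_thermal_netlink /path/to/.config` checks a build tree before compiling.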