Bug 1219427

Summary: Wifi card gone after resume from suspend (Lenovo P14s, ath11k_pci)
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Robert Munteanu <rombert>
Component: Kernel:Networking
Assignee: Kernel Bugs <kernel-bugs>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: rombert, tiwai
Version: Current
Flags: tiwai: needinfo? (rombert)
Target Milestone: ---
Hardware: x86-64
OS: openSUSE Tumbleweed
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: Complete kernel log

Description Robert Munteanu 2024-02-01 08:45:16 UTC
Created attachment 872364
Complete kernel log

Recently (week of Jan 29th) I lost the ability to use WiFi after resuming from suspend (closing the laptop lid). Rebooting fixes the problem.

I tried booting from a previous snapper snapshot with the latest 6.6.x kernel, but that did not solve the problem. A reboot usually does, although once I had to explicitly shut down and then power on the laptop.

The specific problem reported in the kernel log seems to be:

[30081.371316]  drm_suballoc_helper sha1_ssse3 xhci_pci xhci_pci_renesas drm_buddy nvme drm_display_helper xhci_hcd nvme_core hid_multitouch aesni_intel ucsi_acpi cec video hid_generic nvme_auth typec_ucsi crypto_simd cryptd usbcore roles ccp rc_core sp5100_tco typec t10_pi battery wmi i2c_hid_acpi i2c_hid serio_raw btrfs blake2b_generic libcrc32c crc32c_intel xor raid6_pq dm_mirror dm_region_hash dm_log dm_mod v4l2loopback(O) videodev mc br_netfilter bridge stp llc msr efivarfs
[30081.371410] CPU: 0 PID: 16782 Comm: kworker/u32:17 Tainted: G           O       6.7.1-2-default #1 openSUSE Tumbleweed d50116cfdb1b14a701e904c894d8f1c040bf1146
[30081.371417] Hardware name: LENOVO 21J6S0H405/21J6S0H405, BIOS R23ET65W (1.35 ) 03/21/2023
[30081.371420] Workqueue: events_unbound async_run_entry_fn
[30081.371427] RIP: 0010:ieee80211_reconfig+0x9f/0x14d0 [mac80211]
[30081.371502] Code: 02 00 00 41 c6 86 85 05 00 00 00 4c 89 f7 e8 68 9d fb ff 41 89 c4 85 c0 0f 84 0d 03 00 00 48 c7 c7 38 d1 36 c2 e8 a1 18 62 e8 <0f> 0b eb 2d 84 c0 0f 85 9d 01 00 00 c6 87 85 05 00 00 00 e8 39 9d
[30081.371506] RSP: 0018:ffffa8a7898e7ca0 EFLAGS: 00010286
[30081.371510] RAX: 0000000000000000 RBX: ffff971297f98538 RCX: 0000000000000027
[30081.371514] RDX: ffff9718a1a27508 RSI: 0000000000000001 RDI: ffff9718a1a27500
[30081.371516] RBP: ffff971297f983c0 R08: 0000000000000000 R09: ffffa8a7898e7c28
[30081.371519] R10: 3fffffffffffffff R11: 00000000000000ac R12: 00000000fffffff5
[30081.371521] R13: 0000000000000000 R14: ffff971297f98900 R15: ffff9712400516b0
[30081.371525] FS:  0000000000000000(0000) GS:ffff9718a1a00000(0000) knlGS:0000000000000000
[30081.371528] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30081.371531] CR2: 00007ffe5f938536 CR3: 00000005a1236000 CR4: 0000000000750ef0
[30081.371535] PKRU: 55555554
[30081.371537] Call Trace:
[30081.371542]  <TASK>
[30081.371545]  ? ieee80211_reconfig+0x9f/0x14d0 [mac80211 d99114b14b645d6c371b11222312468fe3705e2f]
[30081.371616]  ? __warn+0x81/0x130
[30081.371625]  ? ieee80211_reconfig+0x9f/0x14d0 [mac80211 d99114b14b645d6c371b11222312468fe3705e2f]
[30081.371698]  ? report_bug+0x171/0x1a0
[30081.371704]  ? srso_alias_return_thunk+0x5/0xfbef5
[30081.371710]  ? up+0x16/0x60
[30081.371718]  ? handle_bug+0x3c/0x80
[30081.371725]  ? exc_invalid_op+0x17/0x70
[30081.371730]  ? asm_exc_invalid_op+0x1a/0x20
[30081.371741]  ? ieee80211_reconfig+0x9f/0x14d0 [mac80211 d99114b14b645d6c371b11222312468fe3705e2f]
[30081.371813]  ? srso_alias_return_thunk+0x5/0xfbef5
[30081.371818]  ? schedule+0x32/0xd0
[30081.371826]  ? srso_alias_return_thunk+0x5/0xfbef5
[30081.371831]  ? srso_alias_return_thunk+0x5/0xfbef5
[30081.371836]  ? schedule_timeout+0x147/0x160
[30081.371841]  ? srso_alias_return_thunk+0x5/0xfbef5
[30081.371846]  ? select_task_rq_fair+0x588/0x17d0
[30081.371856]  ? srso_alias_return_thunk+0x5/0xfbef5
[30081.371861]  ? lock_timer_base+0x61/0x80
[30081.371873]  wiphy_resume+0x85/0x1b0 [cfg80211 32b196b4ffb4d979ae9ec0bbd80d413c169c6d66]
[30081.371955]  ? __pfx_wiphy_resume+0x10/0x10 [cfg80211 32b196b4ffb4d979ae9ec0bbd80d413c169c6d66]
[30081.372023]  dpm_run_callback+0x8c/0x1e0
[30081.372031]  device_resume+0x104/0x270
[30081.372039]  ? __pfx_dpm_watchdog_handler+0x10/0x10
[30081.372046]  async_resume+0x1e/0x60
[30081.372053]  async_run_entry_fn+0x32/0x120
[30081.372058]  process_one_work+0x168/0x330
[30081.372068]  worker_thread+0x2f5/0x410
[30081.372074]  ? __pfx_worker_thread+0x10/0x10
[30081.372078]  kthread+0xe8/0x120
[30081.372084]  ? __pfx_kthread+0x10/0x10
[30081.372091]  ret_from_fork+0x34/0x50
[30081.372097]  ? __pfx_kthread+0x10/0x10
[30081.372102]  ret_from_fork_asm+0x1b/0x30
[30081.372114]  </TASK>
[30081.372117] ---[ end trace 0000000000000000 ]---

Laptop model: ThinkPad P14s Gen 3
Wifi card: Qualcomm QCNFA765 Wireless Network Adapter / ath11k_pci
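
One detail in the register dump above can be decoded mechanically. The instruction bytes just before the trap (`41 89 c4 85 c0`, which disassemble to `mov %eax,%r12d; test %eax,%eax`) suggest that R12 holds the return value that led to the warning, and R12 is 00000000fffffff5. Read as a signed 32-bit integer, that is -11, which corresponds to -EAGAIN on Linux x86-64. The sketch below only does the conversion; the helper name `to_signed32` is hypothetical, and reading R12 as the failing return code is an assumption based on the code bytes, not something the log states.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of a register value as a signed integer."""
    low = value & 0xFFFFFFFF
    # Values with the top bit set are negative in two's complement.
    return low - (1 << 32) if low & 0x80000000 else low


# R12 as printed in the register dump of the trace.
r12 = 0x00000000FFFFFFF5

# Kernel functions conventionally return negative errno values on failure;
# errno 11 is EAGAIN on Linux x86-64.
print(to_signed32(r12))  # -11
```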
Comment 1 Takashi Iwai 2024-02-02 10:27:36 UTC
Could you check the behavior with the recent upstream kernel in OBS Kernel:stable repo?
  http://download.opensuse.org/repositories/Kernel:/stable/standard/
Comment 2 Robert Munteanu 2024-02-07 15:51:54 UTC
Initial tests look good; I will keep running this kernel (6.7.3-3.g8578156-default) for one more day. I don't run it all the time since I use Secure Boot, and a KMP built on OBS does not load with it:

Feb 05 23:13:18 rombert2306.corp.adobe.com systemd-modules-load[981]: Failed to insert module 'v4l2loopback': Key was rejected by service

It is not very clear to me why this would work, as rolling back to a previous snapshot using the 6.6.x kernel did not fix my problem. Perhaps the firmware package had a regression?

At any rate, I will test a bit more.
Comment 3 Takashi Iwai 2024-02-07 15:57:46 UTC
FWIW, there has been no change to the ath11k firmware for over a year.
It must be something else.
Comment 4 Robert Munteanu 2024-02-08 08:38:38 UTC
I was too eager; I saw the same problem after another suspend with 6.7.3-3.g8578156-default.

I am not sure whether this can happen on every suspend/resume operation or whether the fact that I suspended the laptop for a longer time (~1h 35 minutes) was the deciding factor.
Comment 5 Takashi Iwai 2024-02-12 12:02:14 UTC
OK, then could you try to report it to the upstream devs?  Ideally on the linux-wireless ML.  Feel free to put me (tiwai@suse.de) on Cc.
Comment 6 Takashi Iwai 2024-03-11 16:07:17 UTC
TW is moving the kernel from 6.7.x to 6.8.  It'd be worth checking the behavior with the 6.8.0 kernel in OBS Kernel:stable, too.
Comment 7 Robert Munteanu 2024-03-14 09:52:35 UTC
I think this might be fixed already in the 6.7 versions, but I would like to test a bit more.

My latest test was to suspend the laptop for about one hour, and then resume. I think suspend completed successfully:

Mar 14 08:09:58 rombert2306.corp.adobe.com systemd-sleep[1804]: Entering sleep state 'suspend'...
Mar 14 08:09:58 rombert2306.corp.adobe.com kernel: PM: suspend entry (s2idle)
Mar 14 08:09:58 rombert2306.corp.adobe.com dns-dnsmasq.sh[1850]: <debug> NETWORKMANAGER_DNS_FORWARDER is not set to "dnsmasq" in /etc/sysconfig/network/config -> exit

And then the wireless card worked as expected after resume.
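
To repeat this check over several suspend cycles, the journal can be filtered for the kernel's suspend-entry message. A minimal Python sketch, assuming the journal has been saved as plain text in the same format as the lines above; the function name `find_suspend_entries` and the regular expression are illustrative, not part of any tool:

```python
import re

# Matches lines such as:
#   Mar 14 08:09:58 <host> kernel: PM: suspend entry (s2idle)
SUSPEND_ENTRY = re.compile(
    r"^(\w{3} +\d+ [\d:]{8}) \S+ kernel: PM: suspend entry \((\w+)\)"
)


def find_suspend_entries(lines):
    """Return (timestamp, sleep_mode) for each suspend-entry line."""
    entries = []
    for line in lines:
        match = SUSPEND_ENTRY.match(line)
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries


sample = [
    "Mar 14 08:09:58 rombert2306.corp.adobe.com systemd-sleep[1804]: Entering sleep state 'suspend'...",
    "Mar 14 08:09:58 rombert2306.corp.adobe.com kernel: PM: suspend entry (s2idle)",
]
print(find_suspend_entries(sample))  # [('Mar 14 08:09:58', 's2idle')]
```

Each reported entry can then be compared against whether the wireless card was still present after the matching resume.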
Comment 8 Robert Munteanu 2024-03-25 13:04:29 UTC
I haven't seen the problem for some time; it was very likely fixed by one of the later 6.7.x kernel releases.

Closing as fixed.