Bug 1179869 - kernel oops on Intel USB bluetooth embedded on AC3165 (btusb), kernel 5.9.12. busy workqueues in 5.10.1
kernel oops on Intel USB bluetooth embedded on AC3165 (btusb), kernel 5.9.12....
Status: NEW
Classification: openSUSE
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Current
x86-64 openSUSE Tumbleweed
: P5 - None : Normal (vote)
: ---
Assigned To: openSUSE Kernel Bugs
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2020-12-10 00:31 UTC by fcicq
Modified: 2022-01-28 11:27 UTC (History)
3 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
tiwai: needinfo? (spam2)


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description fcicq 2020-12-10 00:31:34 UTC
the bluetooth mouse suddenly stopped working. then I suspend the machine and wake it up so it could restore.


env:
Linux version 5.9.12-1-default (geeko@buildhost) (gcc (SUSE Linux) 10.2.1 20201117 [revision 98ba03ffe0b9f37b4916ce6238fad754e00d720b], GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.35.1.20201112-1) #1 SMP Wed Dec 2 09:24:35 UTC 2020 (46922da)

Bluetooth is embedded with AC3165
[   10.291635] iwlwifi 0000:01:00.0: Detected Intel(R) Dual Band Wireless AC 3165, REV=0x210

lsusb:
Bus 001 Device 003: ID 8087:0a2a Intel Corp. Bluetooth wireless interface

related logs:

(before suspend)
[94758.717886] PM: suspend entry (deep)
[94758.945038] Filesystems sync: 0.227 seconds
[94761.034902] Bluetooth: hci0: Timed out waiting for suspend events
[94761.034916] Bluetooth: hci0: Suspend timeout bit: 6
[94761.034921] Bluetooth: hci0: Suspend notifier action (3) failed: -110
[94761.034929] Freezing user space processes ... (elapsed 0.002 seconds) done.
[94761.037660] OOM killer disabled.


(after wake up)
[94762.114789] Restarting tasks ... done.
[94762.129754] Bluetooth: hci0: read Intel version: 370810011003110e00
[94762.129764] Bluetooth: hci0: Intel Bluetooth firmware file: intel/ibt-hw-37.8.10-fw-1.10.3.11.e.bseq
[94762.183961] ------------[ cut here ]------------
[94762.183975] WARNING: CPU: 3 PID: 647 at kernel/workqueue.c:1413 __queue_work+0x2ec/0x370
[94762.183976] Modules linked in: sctp uhid rfcomm ccm ip6t_REJECT nf_reject_ipv6 ip6t_rpfilter xt_tcpudp ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat zram ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c af_packet iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter intel_pmc_bxt intel_telemetry_pltdrv intel_punit_ipc intel_telemetry_core cmac algif_hash x86_pkg_temp_thermal algif_skcipher intel_powerclamp af_alg coretemp bnep kvm_intel snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_sof_xtensa_dsp iwlmvm kvm snd_sof_intel_hda snd_sof spi_pxa2xx_platform dw_dmac snd_soc_skl mei_hdcp snd_soc_hdac_hda snd_hda_ext_core intel_rapl_msr snd_soc_sst_ipc snd_hda_codec_hdmi snd_soc_sst_dsp snd_soc_acpi_intel_match mac80211 snd_soc_acpi snd_soc_core snd_hda_codec_realtek
[94762.184018]  snd_compress irqbypass snd_pcm_dmaengine dmi_sysfs efi_pstore snd_hda_codec_generic pcspkr wdat_wdt libarc4 ledtrig_audio i2c_i801 msr i2c_smbus lpc_ich snd_hda_intel snd_intel_dspcfg snd_hda_codec iwlwifi btusb btrtl snd_hda_core btbcm btintel joydev snd_hwdep bluetooth cfg80211 mei_me snd_pcm intel_xhci_usb_role_switch ecdh_generic roles mei kxcjk_1013 snd_timer ecc industrialio_triggered_buffer kfifo_buf intel_lpss_pci rfkill intel_lpss idma64 industrialio goodix snd soundcore ac processor_thermal_device intel_hid int3400_thermal intel_rapl_common int3403_thermal intel_soc_dts_iosf int340x_thermal_zone acpi_thermal_rel thermal tiny_power_button int3406_thermal sparse_keymap fan button dptf_power nls_iso8859_1 nls_cp437 vfat fat configfs fuse hid_generic usbhid f2fs i915 mmc_block i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crct10dif_pclmul crc32_pclmul crc32c_intel rc_core ghash_clmulni_intel drm aesni_intel xhci_pci xhci_pci_renesas
[94762.184067]  glue_helper crypto_simd xhci_hcd cryptd sdhci_pci cqhci sdhci usbcore serio_raw mmc_core battery video pinctrl_broxton pinctrl_intel sg br_netfilter bridge stp llc dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs
[94762.184083] CPU: 3 PID: 647 Comm: bluetoothd Not tainted 5.9.12-1-default #1 openSUSE Tumbleweed
[94762.184084] Hardware name: Chuwi Co. Ltd. Hi13/AP1, BIOS 1.16 05/16/2017
[94762.184088] RIP: 0010:__queue_work+0x2ec/0x370
[94762.184091] Code: e9 f1 fd ff ff 49 89 c7 e9 a6 fe ff ff 65 8b 05 3a 55 d6 53 a9 00 01 ff 00 75 0f 65 48 8b 3c 25 c0 9b 01 00 f6 47 24 20 75 25 <0f> 0b 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 0b e9 dc fe
[94762.184093] RSP: 0018:ffffad5c004a3da8 EFLAGS: 00010046
[94762.184095] RAX: 0000000080000000 RBX: 0000000000000286 RCX: 0000000000000000
[94762.184096] RDX: ffff8f9e168e88f0 RSI: ffff8f9f79ed4c00 RDI: ffff8f9f7a0d5c40
[94762.184097] RBP: ffff8f9e168e88f0 R08: ffff8f9e168e88f8 R09: ffff8f9e168e8910
[94762.184098] R10: 0000000000000008 R11: ffff8f9e1921d910 R12: ffff8f9f79ed4c00
[94762.184099] R13: 0000000000000200 R14: 0000000000000000 R15: ffff8f9e44acb2f8
[94762.184101] FS:  00007f8e198227c0(0000) GS:ffff8f9f7c580000(0000) knlGS:0000000000000000
[94762.184103] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[94762.184104] CR2: 00003b633d3fe170 CR3: 000000016b0c2000 CR4: 00000000003506e0
[94762.184105] Call Trace:
[94762.184115]  ? try_to_grab_pending+0xcf/0x170
[94762.184117]  queue_delayed_work_on+0x36/0x40
[94762.184152]  l2cap_chan_del+0x221/0x2c0 [bluetooth]
[94762.184179]  l2cap_sock_shutdown+0x110/0x330 [bluetooth]
[94762.184205]  l2cap_sock_release+0x49/0xe0 [bluetooth]
[94762.184210]  __sock_release+0x3d/0xa0
[94762.184213]  sock_close+0x11/0x20
[94762.184216]  __fput+0x9d/0x240
[94762.184220]  task_work_run+0x65/0xa0
[94762.184224]  exit_to_user_mode_loop+0xf8/0x120
[94762.184227]  exit_to_user_mode_prepare+0x8f/0xb0
[94762.184231]  syscall_exit_to_user_mode+0x18/0x40
[94762.184234]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[94762.184237] RIP: 0033:0x7f8e19d76a67
[94762.184240] Code: ff e8 2d e8 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 f3 37 f9 ff
[94762.184241] RSP: 002b:00007ffe4383f708 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[94762.184243] RAX: 0000000000000000 RBX: 0000559e264d8620 RCX: 00007f8e19d76a67
[94762.184244] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000002a
[94762.184245] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000005
[94762.184246] R10: 0000000000000030 R11: 0000000000000246 R12: 0000000000000001
[94762.184247] R13: 0000000000000001 R14: 0000559e264b7ab0 R15: 0000559e264b7b30
[94762.184250] ---[ end trace 2991823e3ce9bec1 ]---
[94762.224382] mei_hdcp 0000:00:0f.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: cannot connect
[94762.224391] mei_hdcp 0000:00:0f.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: mei_cldev_enable Failed. -16
[94762.224403] mei_hdcp: probe of 0000:00:0f.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04 failed with error -16
[94762.268177] PM: suspend exit
[94762.436682] Bluetooth: hci0: unexpected event for opcode 0xfc2f
[94762.454732] Bluetooth: hci0: Intel BT fw patch 0x32 completed & activated

(and it returned normal)
Comment 1 Takashi Iwai 2020-12-10 13:28:07 UTC
Is this issue reproducible?  Also, is it a regression from the earlier releases?

Note that the kernel WARNING appears likely only once per boot, so you'd need to reboot to check whether it can appear again.
Comment 2 fcicq 2020-12-10 14:30:26 UTC
reproducible: USB device descriptor related errors are more frequent (need hours?) with older kernel but I'm happy with 5.9 (at the moment none?), but that is uninteresting and no way to fix (I dont think I can easily power cycle the internal usb port). so I learned the suspend-wake cycle could help to restore the functionality and not forced to reboot. so this question is once.

regression: not sure.

---

maybe it is an uninteresting bug. oops can be resolved but hw failure still needs some power cycle.

my guess:
some hw failure
-> bluetoothd stuck (but there is no related log!) and closing?
-> suspend/wake
-> oops on resumed cleanup? or any remaining work?
Comment 3 fcicq 2020-12-26 04:51:31 UTC
now upgraded to 5.10.1-1-default. no longer oops but it says busy workqueues.


[17265.144211] Bluetooth: hci0: command 0x0406 tx timeout
[17268.536609] usb 1-8: reset full-speed USB device number 3 using xhci_hcd
[17270.168081] Bluetooth: hci0: command 0x0c03 tx timeout
[17278.391900] Bluetooth: hci0: HCI reset during shutdown failed
[17284.151840] usb 1-8: device descriptor read/64, error -110
[17296.370979] wlp1s0: deauthenticating from MASKED by local choice (Reason: 3=DEAUTH_LEAVING)
[17297.094601] PM: suspend entry (deep)
[17297.200514] Filesystems sync: 0.105 seconds
[17297.239967] Freezing user space processes ... (elapsed 0.002 seconds) done.
[17297.242420] OOM killer disabled.
[17297.242421] Freezing remaining freezable tasks ... 
[17299.767783] usb 1-8: device descriptor read/64, error -110
[17300.003569] usb 1-8: reset full-speed USB device number 3 using xhci_hcd
[17315.383370] usb 1-8: device descriptor read/64, error -110

[17317.248611] Freezing of tasks failed after 20.006 seconds (0 tasks refusing to freeze, wq_busy=1):
[17317.248621] Showing busy workqueues and worker pools:
[17317.248625] workqueue events: flags=0x0
[17317.248628]   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=1/256 refcnt=3
[17317.248647]     in-flight: 4410:btusb_waker [btusb] BAR(3433)
[17317.248675] workqueue pm: flags=0x4
[17317.248678]   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=0/0 refcnt=2
[17317.248690]     delayed: pm_runtime_work
[17317.248703]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=0/0 refcnt=3
[17317.248715]     delayed: pm_runtime_work, pm_runtime_work
[17317.248728]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=0/0 refcnt=2
[17317.248740]     delayed: pm_runtime_work
[17317.248758] workqueue usb_hub_wq: flags=0x4
[17317.248761]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/0 refcnt=2
[17317.248773]     in-flight: 103:hub_event [usbcore]
[17317.248826] workqueue hci0: flags=0xa0012
[17317.248830]   pwq 9: cpus=0-3 flags=0x4 nice=-20 active=1/1 refcnt=4
[17317.248841]     in-flight: 3433:hci_power_off [bluetooth]
[17317.248902]     delayed: hci_power_on [bluetooth]
[17317.248959] pool 4: cpus=2 node=0 flags=0x0 nice=0 hung=3s workers=3 idle: 4497 3809
[17317.248978] pool 6: cpus=3 node=0 flags=0x0 nice=0 hung=0s workers=5 idle: 4537 4500 4439 4451
[17317.249002] pool 9: cpus=0-3 flags=0x4 nice=-20 hung=20s workers=3 idle: 4498 124

[17317.249186] Restarting kernel threads ... done.
[17317.249737] OOM killer enabled.
[17317.249739] Restarting tasks ... done.
[17317.431005] PM: suspend exit
[17317.431141] PM: suspend entry (s2idle)
[17317.989563] Filesystems sync: 0.558 seconds
[17317.990021] Freezing user space processes ... (elapsed 0.002 seconds) done.
[17317.992300] OOM killer disabled.
[17317.992302] Freezing remaining freezable tasks ...
Comment 4 Miroslav Beneš 2022-01-28 11:27:02 UTC
It has been a while, sorry about that. fcicq, is the issue still present with the latest TW kernel (5.16 as of now)?