Bugzilla – Full Text Bug Listing
| Summary: | [SD-149531] After updating openSUSE to 15.5 Thunderbolt Dock 4 isn't working anymore | ||
|---|---|---|---|
| Product: | [openSUSE] PUBLIC SUSE Linux Enterprise Desktop 15 SP5 | Reporter: | ralph roth <ralph.roth> |
| Component: | Kernel | Assignee: | Kernel Bugs <kernel-bugs> |
| Status: | NEW --- | QA Contact: | |
| Severity: | Major | ||
| Priority: | P3 - Medium | CC: | oneukum, ralph.roth, shawn.lee, shung-hsi.yu, tiwai |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE Leap 15.5 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Bug Depends on: | |||
| Bug Blocks: | 1222236 | ||
|
Description
ralph roth
2024-03-27 15:49:40 UTC
Just for the record: I meanwhile cold installed openSUSE 15.5 on this machine, problems are exactly the same.
dmesg is from this new installation.
Old 15.4 Kernel with openSUSE 15.5 worked, but I meanwhile lost this kernel.

Please verify whether it's a regression in the recent SP5 kernel updates. That is, try to downgrade to the older SP5 kernels, and confirm that the problem persists. You can start from SP5 GA kernel, for example.

Judging from the attached log, the igc device got probed, but it was detached later, spewing kernel warnings:

[ 2526.650192] igc 0000:6d:00.0 eth0: PCIe link lost, device now detached
[ 2526.650215] ------------[ cut here ]------------
[ 2526.650216] igc: Failed to read reg 0xc030!
[ 2526.650225] WARNING: CPU: 2 PID: 2974 at ../drivers/net/ethernet/intel/igc/igc_main.c:6470 igc_rd32+0x94/0xa0 [igc]

So it's likely an issue in PCIe core in Thunderbolt, I suppose.

(In reply to ralph roth from comment #1)
> See also https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1942999

Do you see this problem? That is, an invalid MAC address?
The patch from Ubuntu wasn't taken upstream in the end, AFAIK.

(In reply to ralph roth from comment #3)
> Just for the record: I meanwhile cold installed openSUSE 15.5 on this
> machine, problems are exactly the same.

Leap 15.5 and SLE15-SP5 use the very same binaries, so no wonder :)

> Old 15.4 Kernel with openSUSE 15.5 worked, but I meanwhile lost this kernel

You can get the one in OBS, e.g.
http://download.opensuse.org/update/leap/15.4/sle/x86_64/

(In reply to Takashi Iwai from comment #4)
> Please verify whether it's a regression in the recent SP5 kernel updates.
> That is, try to downgrade to the older SP5 kernels, and confirm that the
> problem persists. You can start from SP5 GA kernel, for example.
zypper install --oldpackage kernel-default-5.14.21-150500.53.2.x86_64

The following 2 NEW packages are going to be installed:
dracut-mkinitrd-deprecated kernel-default-5.14.21-150500.53.2

(In reply to ralph roth from comment #6)
> (In reply to Takashi Iwai from comment #4)
> > Please verify whether it's a regression in the recent SP5 kernel updates.
> > That is, try to downgrade to the older SP5 kernels, and confirm that the
> > problem persists. You can start from SP5 GA kernel, for example.
>
> zypper install --oldpackage kernel-default-5.14.21-150500.53.2.x86_64
>
> The following 2 NEW packages are going to be installed:
> dracut-mkinitrd-deprecated kernel-default-5.14.21-150500.53.2

NO, same errors with:
Linux p16s23 5.14.21-150500.53-default #1 SMP PREEMPT_DYNAMIC Wed May 10 07:56:26 UTC 2023 (b630043) x86_64 x86_64 x86_64 GNU/Linux

(In reply to Takashi Iwai from comment #4)
> (In reply to ralph roth from comment #1)
> > See also https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1942999
>
> Do you see this problem? That is, an invalid MAC address?

I tried to help Ralph a bit (off Bugzilla), and the "invalid MAC address" is seen when the machine is booted with the kernel parameter `iommu=off` as a suggested blind try (after trying `bolt authorize ...` to authorize the dock, which also didn't help).

OK, then it's a regression that has been present since the SLE15-SP5 GA kernel.

The next question would be to check with newer releases, e.g. the SLE15-SP6 kernel. Could you check the one in the OBS Kernel:SLE15-SP6 repo?
http://download.opensuse.org/repositories/Kernel:/SLE15-SP6/pool/

Also, the recent 6.8.x kernel from OBS Kernel:stable:Backport, too:
http://download.opensuse.org/repositories/Kernel:/stable:/Backport/standard/

(In reply to Shung-Hsi Yu from comment #8)
> (In reply to Takashi Iwai from comment #4)
> > (In reply to ralph roth from comment #1)
> > > See also https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1942999
> >
> > Do you see this problem?
> > That is, an invalid MAC address?
>
> I tried to help Ralph a bit (off Bugzilla), and the "invalid MAC address" is
> seen when the machine is booted with the kernel parameter `iommu=off` as a
> suggested blind try (after trying `bolt authorize ...` to authorize the
> dock, which also didn't help).

Hm, in such an unusual situation, some workaround might still be needed.

FWIW, the patch submitted upstream was
https://patchwork.kernel.org/project/netdevbpf/patch/20210702045120.22855-2-aaron.ma@canonical.com/#24308349
that just adds a delay of 600ms. We can give it a try, too.

Good to know that the SP6 kernel works. So it's a regression specific to SP5.

The problem with the second monitor may rather be an issue of the amdgpu driver. OTOH, the detached Ethernet device is likely a PCIe or Thunderbolt problem.

Adding Oliver to Cc.

Skimming over the net hits the following:
https://github.com/fwupd/firmware-lenovo/issues/191
mentioning that this problem can be worked around by a BIOS setup change.

Go to the BIOS setup menu, and change
Bios -> Config -> Thunderbolt 4 -> PCIe Tunneling
to OFF, reboot and retest.

For the monitor problem, please open another bugzilla entry. It's different from the Ethernet stuff.

The workaround via BIOS setup might be effective for the monitor, too. Please check it.

Workaround fixes the Ethernet NIC problem with 15.5 and 15.6 Kernel.
Monitor still *not* working at all (DP and/or HDMI cable).
As this is only a workaround, eliminating the root cause would be nice....

Please open another bug report for the graphics problem.

Also test with 6.8.x kernel from OBS Kernel:stable:Backport repo. If the problem persists, test with 6.9-rc kernel from OBS Kernel:HEAD:Backport repo, too.

Please provide the URLs for zypper ar -f

Backport kernel Mid-March 24 didn't work

(In reply to Takashi Iwai from comment #18)
> Please open another bug report for the graphics problem.
Bug 1222236 - Monitor BSC - [SD-149531] After updating openSUSE to 15.5 Thunderbolt Dock 4 isn't working anymore

(In reply to ralph roth from comment #20)
> Please provide the URLs for zypper ar -f

It can be derived as http://download.opensuse.org/repositories/...., with a slash inserted after each colon of the project name.

OBS Kernel:stable:Backport
http://download.opensuse.org/repositories/Kernel:/stable:/Backport/standard/

OBS Kernel:HEAD:Backport
http://download.opensuse.org/repositories/Kernel:/HEAD:/Backport/standard/

And you don't need to add the repo each time. You can also just download kernel-default.rpm from the URL and install it directly.

> Backport kernel Mid-March 24 didn't work

Which one...? Please elaborate.

(In reply to Takashi Iwai from comment #19)
> Also test with 6.8.x kernel from OBS Kernel:stable:Backport repo.

$ uname -a ; cat /proc/cmdline
Linux p16s23 6.8.2-lp155.3.g2daf2c2-default #1 SMP PREEMPT_DYNAMIC Thu Mar 28 07:04:20 UTC 2024 (2daf2c2) x86_64 x86_64 x86_64 GNU/Linux
BOOT_IMAGE=/boot/vmlinuz-6.8.2-lp155.3.g2daf2c2-default root=/dev/mapper/system-root preempt=full quiet security=apparmor drm.debug=0x1e log_buf_len=16M mitigations=auto

eth: OK
Monitor: Failed

For this bug report, drop the drm.debug option. It produces a lot of noise that is irrelevant to the Ethernet driver problem.

For the other bug report (bsc#1222236), please upload the dmesg output with the drm.debug=0x1e boot option instead.

... and please check 6.9-rc kernel, too. If the problem persists there, it's basically an upstream problem and the upstream devs should be involved.

(In reply to Takashi Iwai from comment #26)
> ... and please check 6.9-rc kernel, too. If the problem persists there,
> it's basically an upstream problem and the upstream devs should be involved.
$ uname -a; cat /proc/cmdline
Linux p16s23 6.9.0-rc2-lp155.2.g0788112-default #1 SMP PREEMPT_DYNAMIC Sun Mar 31 23:08:51 UTC 2024 (0788112) x86_64 x86_64 x86_64 GNU/Linux
BOOT_IMAGE=/boot/vmlinuz-6.9.0-rc2-lp155.2.g0788112-default root=/dev/mapper/system-root preempt=full quiet security=apparmor drm.debug=0x1e log_buf_len=16M mitigations=auto

eth: OK
2nd Monitor: Failed

Just to be sure: is the eth0 test with the BIOS setup workaround? Or do 6.8.x / 6.9-rc kernels pass even after reverting the BIOS setup?

(In reply to Takashi Iwai from comment #28)
> Just to be sure: is the eth0 test with the BIOS setup workaround? Or do 6.8.x
> / 6.9-rc kernels pass even after reverting the BIOS setup?

ethX: Currently with the BIOS workaround. If needed I can check that tomorrow.

Yes, please test without the BIOS workaround. If the problem persists there, it might be worth reporting upstream, too. OTOH, if the problem is fixed upstream, we can look for patches to backport to the SLE15-SP6 kernel.

(In reply to Takashi Iwai from comment #30)
> Yes, please test without the BIOS workaround. If the problem persists there, it
> might be worth reporting upstream, too. OTOH, if the problem is
> fixed upstream, we can look for patches to backport to the
> SLE15-SP6 kernel.

Without the BIOS workaround the NIC won't work. Tested with 15.4 L&G Kernel and 6.9.0rc2 Kernel.

Also: 15.4 with L&G Kernel didn't work anymore after the cold install of openSUSE Leap 15.5 :-(

Linux p16s23 5.14.21-150400.22-default #1 SMP PREEMPT_DYNAMIC Wed May 11 06:57:18 UTC 2022 (49db222) x86_64 x86_64 x86_64 GNU/Linux

Have you upgraded the BIOS on the Dock, or applied anything since Leap 15.4? Just wondering what triggers the breakage.

(In reply to Takashi Iwai from comment #32)
> Have you upgraded the BIOS on the Dock, or applied anything since Leap 15.4? Just
> wondering what triggers the breakage.

Nothing that I am aware of. But I made a mistake and downloaded the 15.4 GA Kernel.
I will give the L&G 15.4 a try later.

Tried all available Kernels so far. Any idea how to proceed?

If the issue is still seen with the latest upstream kernel, you should report it to the upstream devs and let them fix the bugs. Care to report from your side? It's a hardware-specific issue, so it's better to involve you directly.

For the graphics issue of AMDGPU, it'd be gitlab.freedesktop.org Issues. For the network, you can try to report to bugzilla.kernel.org, too.

I don't know how to do that. Also, according to co-workers, the Ubuntu Kernel (6.5) works fine with that hardware constellation.

You need to send a bug report to the upstream bug tracker, e.g. for the graphics issue, at best, gitlab.freedesktop.org; choose the right component (e.g. DRM/Intel or such). For other issues (igc and PCI core), maybe you can report to bugzilla.kernel.org.

A report from your side is best, as it's a pretty much device-specific problem and you are the one who owns the hardware and has tested / suffered from the issue. After reporting, just let us know the URL, and I can join the reports later to assist from the distro side.
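
For reference, the repository URL rule mentioned earlier in the thread (put a slash after each colon of the OBS project name) can be sketched as a small shell helper. This is only an illustration: `obs_repo_url` is a hypothetical name, not part of zypper or any OBS tool, and the default sub-directory `standard` is an assumption taken from the two Backport URLs quoted above (the SLE15-SP6 link uses `pool` instead).

```shell
#!/bin/sh
# Sketch only: obs_repo_url is a hypothetical helper, not an existing tool.
# $1: OBS project name, e.g. Kernel:stable:Backport
# $2: sub-directory below the project (assumed default: standard)
obs_repo_url() {
    project=$1
    target=${2:-standard}
    # Insert a slash after every colon:
    # Kernel:stable:Backport becomes Kernel:/stable:/Backport
    path=$(printf '%s' "$project" | sed 's|:|:/|g')
    printf 'http://download.opensuse.org/repositories/%s/%s/\n' "$path" "$target"
}

# Print the URLs for the projects named in this report.
obs_repo_url Kernel:stable:Backport
obs_repo_url Kernel:HEAD:Backport
obs_repo_url Kernel:SLE15-SP6 pool
```

The printed URL can then be passed to `zypper ar -f <URL> <alias>`, with an alias of your own choosing, or opened in a browser to download kernel-default.rpm directly.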