Bug 1180985 - Kernel 5.10 breaks ethernet connection with lenovo x390
Status: NEW
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assigned To: openSUSE Kernel Bugs
QA Contact: E-mail List
Depends on:
Blocks:
Reported: 2021-01-15 13:33 UTC by Flavio Castelli
Modified: 2022-02-14 10:39 UTC (History)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
mbenes: needinfo? (fcastelli)


Attachments
hwinfo (1.74 MB, text/x-log)
2021-01-15 16:35 UTC, Flavio Castelli
dmesg (89.91 KB, text/x-log)
2021-01-15 16:35 UTC, Flavio Castelli
hwinfo - 5.10.5 - took after the eth breaks (1.76 MB, text/x-log)
2021-01-18 08:24 UTC, Flavio Castelli
journalctl logs - 5.10.5 - took after the eth breaks (26.88 KB, text/x-log)
2021-01-18 08:24 UTC, Flavio Castelli
dmesg - 5.10.5 - took after the eth breaks (98.69 KB, text/x-log)
2021-01-18 08:25 UTC, Flavio Castelli

Description Flavio Castelli 2021-01-15 13:33:52 UTC
I have a Lenovo X390 laptop with an Intel CPU. I've been running Tumbleweed on it for a while because I needed a newer kernel than the one in Leap to get my network card to work.

Everything worked fine until I upgraded to 5.10.x. The upgrade caused the ethernet card to become unreliable (as happened with the old Leap 15.1 kernel).

The network card works fine for some time, then it suddenly loses the connection. I can see GNOME NetworkManager trying to set up the card, but the process never ends.

The X390 doesn't have a physical eth port; a proprietary adapter provides one to the laptop. This adapter is plagued by the issue.

The same issue happens when I connect the computer to its USB-C docking station (an official one provided by Lenovo). The docking station provides an eth port which is affected by this issue.

The 5.9.14-1.2 and the 5.9.12-1.1 kernels are the last kernels I tested on my computer that proved to be fine.


I'll try to collect more detailed logs from syslog, but right now it's a bit hard for me because I cannot reproduce the issue. I just have to wait for it to happen...
Comment 1 Takashi Iwai 2021-01-15 14:05:21 UTC
On which kernel exactly did it happen?  In any case, please provide the hwinfo and dmesg output from the latest running 5.10.x kernel.
(And of course any detailed logs showing the problem :)
Comment 2 Flavio Castelli 2021-01-15 16:34:48 UTC
Someone pointed me to https://bugzilla.suse.com/show_bug.cgi?id=1180344#c16 - I think I can spot a lot of similarities.

I've just rebooted with the workaround of comment #16, let's see how it goes.

In the meantime I'll attach the hwinfo and the dmesg outputs.

I've looked into journalctl but I couldn't find anything strange, I'll keep monitoring that.

One final note: the network stopped working also with the WIFI connection turned on. So maybe it's not just an ethernet issue?
Comment 3 Flavio Castelli 2021-01-15 16:35:16 UTC
Created attachment 845184 [details]
hwinfo
Comment 4 Flavio Castelli 2021-01-15 16:35:37 UTC
Created attachment 845185 [details]
dmesg
Comment 5 Takashi Iwai 2021-01-15 16:50:26 UTC
If the problem was seen on an early 5.10.x kernel, it's possibly the iwlwifi bug.  This was addressed in 5.10.4, so you don't need that workaround.

OTOH, the 5.10.4 kernel has another bug related to HDMI, and this might lead to another problem.  It was already fixed in a later kernel (5.10.5 should be fine).
But if you want the workaround to be sure, just pass
  snd_hda_codec_hdmi.enable_silent_stream=0
as a boot option.  The relevant bugs are bug 1180543 and bug 1180563.
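To confirm that a boot option such as the one above actually reached the running kernel, the command line can be inspected. The following is a minimal sketch, not part of the original report: the helper names `boot_param` and `read_cmdline` are hypothetical, and it assumes a Linux system where `/proc/cmdline` is readable.

```python
from pathlib import Path


def boot_param(cmdline, name):
    """Return the value of `name` on a kernel command line.

    Returns "" for a bare flag without "=", and None if the parameter is absent.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else ""
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    return Path(path).read_text()


# Example command line (made up for illustration, not taken from the reporter's machine).
example = "root=/dev/sda2 quiet snd_hda_codec_hdmi.enable_silent_stream=0"
print(boot_param(example, "snd_hda_codec_hdmi.enable_silent_stream"))
```

On a live system, `boot_param(read_cmdline(), "snd_hda_codec_hdmi.enable_silent_stream")` would return `None` if the option was not passed at boot.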
Comment 6 Flavio Castelli 2021-01-18 08:22:38 UTC
During the weekend I booted my laptop with kernel-default-5.10.5-1.1.x86_64 plus the workaround from https://bugzilla.suse.com/show_bug.cgi?id=1180985#c16.

I left the laptop turned on the whole night, and in the morning I found the eth connection stuck in an endless loop of acquiring an IP.
Enabling the WIFI card didn't help; for some reason it couldn't get an IP address.

Regardless of the network issue, the 5.10.5 kernel didn't allow the machine to enter sleep.


I'm going to attach the dmesg and hwinfo logs I took in the morning once I realized the computer was affected by the bug.

The journalctl logs were full of NetworkManager logs, plus gnome shell complaining about some JS recursion (I **suspect** that could be caused by the nextcloud system tray icon that reports when the client can't connect to the remote server).

I'll attach a small portion of the journalctl logs. Let me know if there's something specific I should look for. Note well: the log has been printed in reverse order (newest entries first).
Comment 7 Flavio Castelli 2021-01-18 08:23:35 UTC
I forgot to mention I've now upgraded to 5.10.7. The suspend issue seems to be gone, but I fear the eth one is still around.

Note well: I've now removed the workaround from https://bugzilla.suse.com/show_bug.cgi?id=1180985#c16
Comment 8 Flavio Castelli 2021-01-18 08:24:15 UTC
Created attachment 845217 [details]
hwinfo - 5.10.5 - took after the eth breaks
Comment 9 Flavio Castelli 2021-01-18 08:24:53 UTC
Created attachment 845218 [details]
journalctl logs - 5.10.5 - took after the eth breaks

Note well: this is printed in reverse mode
Comment 10 Flavio Castelli 2021-01-18 08:25:11 UTC
Created attachment 845219 [details]
dmesg - 5.10.5 - took after the eth breaks
Comment 11 Flavio Castelli 2021-01-18 08:30:54 UTC
Just for completeness: earlier this morning I "lost" the eth connection while running on 5.10.7

The laptop is currently connected to the docking station, which provides the eth connection.

I'm still using the system; I just disabled the eth connection via NetworkManager and I'm relying on the wifi.

These are some logs I've just seen on the journal:


Jan 18 09:28:03 ganymede kernel: restoring control 00000000-0000-0000-0000-000000000101/12/11
Jan 18 09:28:03 ganymede kernel: restoring control 00000000-0000-0000-0000-000000000101/10/5
Jan 18 09:28:02 ganymede kernel: usb 1-2.1.4: reset high-speed USB device number 16 using xhci_hcd
Jan 18 09:27:50 ganymede kernel: restoring control 00000000-0000-0000-0000-000000000101/12/11
Jan 18 09:27:50 ganymede kernel: restoring control 00000000-0000-0000-0000-000000000101/10/5
Jan 18 09:27:50 ganymede kernel: usb 1-2.1.4: reset high-speed USB device number 16 using xhci_hcd
Jan 18 09:27:43 ganymede firefox.desktop[4986]: [ERROR glean_core::upload] Unrecoverable upload failure while attempting to send ping 01c167a3-95b1-4e1f-a9da-e24598c2c538. Error was UnrecoverableFailure
Jan 18 09:27:38 ganymede kernel: restoring control 00000000-0000-0000-0000-000000000101/12/11
Jan 18 09:27:38 ganymede kernel: restoring control 00000000-0000-0000-0000-000000000101/10/5
Jan 18 09:27:37 ganymede kernel: usb 1-2.1.4: reset high-speed USB device number 16 using xhci_hcd
Jan 18 09:25:40 ganymede systemd[3933]: vte-spawn-1c66fa47-8eee-4d44-b0d1-9e0269b60abb.scope: Succeeded.
Jan 18 09:25:31 ganymede systemd[1]: systemd-hostnamed.service: Succeeded.
Jan 18 09:25:03 ganymede systemd[3933]: Started Tracker metadata database store and lookup manager.
Jan 18 09:25:03 ganymede dbus-daemon[3979]: [session uid=1000 pid=3979] Successfully activated service 'org.freedesktop.Tracker1'
Jan 18 09:25:03 ganymede systemd[3933]: Starting Tracker metadata database store and lookup manager...
Jan 18 09:25:03 ganymede dbus-daemon[3979]: [session uid=1000 pid=3979] Activating via systemd: service name='org.freedesktop.Tracker1' unit='tracker-store.service' requested by ':1.70' (uid=1000 pid=4986 comm="/usr/lib64/firefox/firefox>
Jan 18 09:25:01 ganymede systemd[1]: Started Hostname Service.
Jan 18 09:25:01 ganymede dbus-daemon[606]: [system] Successfully activated service 'org.freedesktop.hostname1'
Jan 18 09:25:01 ganymede systemd[1]: Starting Hostname Service...
Jan 18 09:25:01 ganymede dbus-daemon[606]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.186' (uid=1000 pid=4986 comm="/usr/lib64/firefox/firefox >
Jan 18 09:24:53 ganymede systemd[1]: systemd-hostnamed.service: Succeeded.
Jan 18 09:24:23 ganymede systemd[1]: Started Hostname Service.
Jan 18 09:24:23 ganymede dbus-daemon[606]: [system] Successfully activated service 'org.freedesktop.hostname1'
Jan 18 09:24:23 ganymede systemd[1]: Starting Hostname Service...
Jan 18 09:24:23 ganymede dbus-daemon[606]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.184' (uid=1000 pid=4986 comm="/usr/lib64/firefox/firefox >
Jan 18 09:24:13 ganymede systemd[1]: systemd-hostnamed.service: Succeeded.
Jan 18 09:23:43 ganymede systemd[1]: Started Hostname Service.
Jan 18 09:23:43 ganymede dbus-daemon[606]: [system] Successfully activated service 'org.freedesktop.hostname1'
Jan 18 09:23:43 ganymede systemd[1]: Starting Hostname Service...
Jan 18 09:23:43 ganymede dbus-daemon[606]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.182' (uid=1000 pid=4986 comm="/usr/lib64/firefox/firefox >
Jan 18 09:21:44 ganymede NetworkManager[808]: <info>  [1610958104.9902] device (enp0s20f0u2u1u2): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
Jan 18 09:21:44 ganymede kernel: r8152 2-2.1.2:1.0 enp0s20f0u2u1u2: carrier on
Jan 18 09:21:44 ganymede NetworkManager[808]: <info>  [1610958104.9873] device (enp0s20f0u2u1u2): carrier: link connected
Jan 18 09:21:44 ganymede kernel: IPv6: ADDRCONF(NETDEV_CHANGE): enp0s20f0u2u1u2: link becomes ready
Jan 18 09:21:44 ganymede ModemManager[723]: <info>  [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.1/2-2.1.2': not supported by any plugin
Jan 18 09:21:42 ganymede systemd-udevd[31627]: Using default interface naming scheme 'v245'.
Jan 18 09:21:42 ganymede NetworkManager[808]: <info>  [1610958102.2806] device (enp0s20f0u2u1u2): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Jan 18 09:21:42 ganymede NetworkManager[808]: <info>  [1610958102.2189] device (eth0): interface index 20 renamed iface from 'eth0' to 'enp0s20f0u2u1u2'
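When scanning longer journal excerpts like the one above, it can help to count USB reset messages per port. The sketch below is only an illustration: the regular expression is written against the message format seen in this report, and the function name `count_usb_resets` is hypothetical. Note that in this excerpt the resets are reported for port 1-2.1.4, while the r8152 adapter appears at 2-2.1.2, so the resets may belong to a different device.

```python
import re
from collections import Counter

# Matches kernel messages such as:
#   usb 1-2.1.4: reset high-speed USB device number 16 using xhci_hcd
RESET_RE = re.compile(
    r"kernel: usb (?P<port>[\d.-]+): reset (?P<speed>.+?) USB device "
    r"number (?P<devnum>\d+) using (?P<hcd>\S+)"
)


def count_usb_resets(lines):
    """Count USB reset messages per port path (for example '1-2.1.4')."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group("port")] += 1
    return counts


# Three lines copied from the journal excerpt above.
sample = [
    "Jan 18 09:28:02 ganymede kernel: usb 1-2.1.4: reset high-speed USB device number 16 using xhci_hcd",
    "Jan 18 09:27:50 ganymede kernel: usb 1-2.1.4: reset high-speed USB device number 16 using xhci_hcd",
    "Jan 18 09:21:44 ganymede kernel: r8152 2-2.1.2:1.0 enp0s20f0u2u1u2: carrier on",
]
print(count_usb_resets(sample))
```

The same function can be fed the output of `journalctl -k` split into lines.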
Comment 12 Roger Whittaker 2021-01-24 14:52:31 UTC
I am seeing something similar to this with a Dell Precision 5530 connecting via a WD15 docking station, since upgrading to 5.10.7-1-default.

I have a cron job set up to check connectivity with a ping, and in the logs I see a usb reset, and then the next time the ping test happens, we see the failure.

Jan 24 01:24:43 tweedledum kernel: usb 4-1.2: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
Jan 24 01:24:43 tweedledum kernel: r8152 4-1.2:1.0 eth0: Using pass-thru MAC addr e4:b9:7a:7c:cb:af

[...]

Jan 24 01:30:01 tweedledum CRON[22069]: (root) CMD (/root/bin/pingtest.sh)
Jan 24 01:30:01 tweedledum root[22073]: ping test
Jan 24 01:30:07 tweedledum root[22081]: pinging 192.168.1.1: test FAILED
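The contents of /root/bin/pingtest.sh are not shown in this report. As a rough sketch of what such a periodic connectivity check might look like, assuming the Linux iputils `ping` with its `-c` (count) and `-W` (timeout) options: the function names and the "OK" wording are hypothetical, and only the "FAILED" message format is taken from the log above.

```python
import subprocess


def ping_ok(host, count=3, timeout_s=2, run=subprocess.run):
    """Return True if `ping` exits with status 0 for `host`.

    `run` can be replaced for testing without touching the network.
    """
    result = run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def report(host, ok):
    """Format a log line in the style seen in the journal excerpt."""
    return f"pinging {host}: test {'OK' if ok else 'FAILED'}"


# Formatting example only; no ping is sent here.
print(report("192.168.1.1", False))
```

A cron job could call `report(host, ping_ok(host))` and pass the result to the system logger.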

With the system in this state, when I attempted to restart the network, the command `rcnetwork restart' hung indefinitely.

This has happened twice on consecutive days.

I'm now testing to see whether the kernel boot parameter usbcore.autosuspend=-1 helps stop this happening.
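After rebooting with usbcore.autosuspend=-1, the effective value can be read back from sysfs. This is a minimal sketch under the assumption that the usbcore module exposes its parameter at /sys/module/usbcore/parameters/autosuspend; the helper names are hypothetical. A negative delay means newly detected USB devices are not autosuspended.

```python
from pathlib import Path

AUTOSUSPEND_PARAM = Path("/sys/module/usbcore/parameters/autosuspend")


def autosuspend_disabled(raw_value):
    """Return True if the autosuspend delay (in seconds) is negative."""
    return int(raw_value.strip()) < 0


def read_autosuspend(path=AUTOSUSPEND_PARAM):
    """Read the current usbcore autosuspend delay as text (Linux only)."""
    return path.read_text()


# Examples with literal values, so no sysfs access is needed here.
print(autosuspend_disabled("-1\n"))  # True
print(autosuspend_disabled("2\n"))   # False
```

On the affected machine, `autosuspend_disabled(read_autosuspend())` should return True once the boot parameter is in effect.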
Comment 13 Takashi Iwai 2021-02-03 15:59:44 UTC
I'm building a test kernel with a couple of downstream patches that are floating around.  Please check whether it helps at all; it's a complete shot in the dark, and I'm not entirely sure it's relevant.

The kernel package is being built in OBS home:tiwai:bsc1180985 repo.
Comment 14 Miroslav Beneš 2022-02-11 12:40:21 UTC
Flavio, Roger, any news here? It has been a while and TW is now on v5.16.
Comment 15 Roger Whittaker 2022-02-11 16:14:07 UTC
I upgraded the firmware on the docking station (I had to connect it to a Windows system to do this) and then this stopped happening...

Sorry I forgot to report this here.
Comment 16 Miroslav Beneš 2022-02-14 10:39:08 UTC
Thanks for the feedback.

Flavio, could you confirm whether the firmware upgrade also helps in your case?