Bug 1178275 - usb3 device generates lots of command errors
usb3 device generates lots of command errors
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Current
x86-64 openSUSE Tumbleweed
: P5 - None : Normal (vote)
: ---
Assigned To: openSUSE Kernel Bugs
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2020-10-29 17:51 UTC by James Moe
Modified: 2022-01-24 08:50 UTC (History)
6 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description James Moe 2020-10-29 17:51:13 UTC
tumbleweed 20201022

I am using a 2TB Seagate external drive (2r1apl-500) as a backup device; backintime is the backup app. It is formatted with XFS.

The system journal contains lots of the errors shown below; there are 5 groups of such errors each time a backup starts. After it gets through the errors sets, the backup proceeds normally. Backup sets are normal.

I had an earlier Seagate disk drive (1.5 TB) for backup, formatted with BTRFS. It showed none of these difficulties.

Further I have included some errors from startup. The OS seems to have a hard time with the disk drive from the beginning. The difficulty continues after boot is complete; eventually the device is considered functional, mostly.

I cannot tell if this is a hardware problem (USB interface, disk drive), or an XFS filesystem issue.

What other information is needed?

----[ usb3 disk access errors ]----
2020-10-29T06:00:02-0700 sma-station14l python3[29320]: backintime (jmoe/1): INFO: Call rsync to take the snapshot

2020-10-29T06:03:35-0700 sma-station14l kernel: sd 7:0:0:0: [sde] tag#29 uas_eh_abort_handler 0 uas-tag 29 inflight: CMD OUT 
2020-10-29T06:03:35-0700 sma-station14l kernel: sd 7:0:0:0: [sde] tag#29 CDB: Write(10) 2a 00 04 2c d6 20 00 04 00 00
2020-10-29T06:03:36-0700 sma-station14l kernel: sd 7:0:0:0: [sde] tag#28 uas_eh_abort_handler 0 uas-tag 6 inflight: CMD OUT 
... blah blah blah ...
2020-10-29T06:03:38-0700 sma-station14l kernel: sd 7:0:0:0: [sde] tag#0 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD OUT 
2020-10-29T06:03:38-0700 sma-station14l kernel: sd 7:0:0:0: [sde] tag#0 CDB: Write(10) 2a 00 04 2c ca 28 00 03 fa 00
2020-10-29T06:03:38-0700 sma-station14l kernel: scsi host7: uas_eh_device_reset_handler start
2020-10-29T06:03:38-0700 sma-station14l kernel: usb 3-4: reset SuperSpeed Gen 1 USB device number 14 using xhci_hcd
2020-10-29T06:03:38-0700 sma-station14l kernel: scsi host7: uas_eh_device_reset_handler success

2020-10-29T06:04:00-0700 sma-station14l python3[29320]: backintime (jmoe/1): INFO: Save config file
2020-10-29T06:04:00-0700 sma-station14l python3[29320]: backintime (jmoe/1): INFO: Save permissions
2020-10-29T06:04:06-0700 sma-station14l python3[29320]: backintime (jmoe/1): INFO: Create info file
2020-10-29T06:04:07-0700 sma-station14l python3[29320]: backintime (jmoe/1): INFO: [smart remove] remove snapshots: [20201022-060002-585]
----[ end ]----

----[ from dmesg (system boot) ]----
[   19.788644] XFS (sde1): Mounting V5 Filesystem
[   19.840770] EXT4-fs (sdg1): mounted filesystem with ordered data mode. Opts: data=ordered
[   19.862152] EXT4-fs (sdf1): mounted filesystem with ordered data mode. Opts: data=ordered
[   20.493479] XFS (sde1): Ending clean mount

[  442.856937] usb usb3-port2: Cannot enable. Maybe the USB cable is bad?
[  448.672230] usb 3-2: Device not responding to setup address.
[  448.934389] usb 3-2: Device not responding to setup address.
[  449.140637] usb 3-2: device not accepting address 43, error -71
[  449.336706] usb 3-2: Device not responding to setup address.
[  449.544732] usb 3-2: Device not responding to setup address.
[  449.752681] usb 3-2: device not accepting address 44, error -71
[  449.752823] usb usb3-port2: attempt power cycle
[  450.264711] usb 3-2: Device not responding to setup address.
[  450.472717] usb 3-2: Device not responding to setup address.
[  450.680667] usb 3-2: device not accepting address 45, error -71
[  450.876704] usb 3-2: Device not responding to setup address.
[  451.084705] usb 3-2: Device not responding to setup address.
[  451.131079] XFS (sde1): Unmounting Filesystem
[  451.292664] usb 3-2: device not accepting address 46, error -71
[  451.292812] usb usb3-port2: unable to enumerate USB device
[  471.718753] usb 3-2: Device not responding to setup address.
[  471.926647] usb 3-2: Device not responding to setup address.
[  472.134552] usb 3-2: device not accepting address 47, error -71
[  472.334520] usb 3-2: Device not responding to setup address.
[  472.542435] usb 3-2: Device not responding to setup address.
[  472.750369] usb 3-2: device not accepting address 48, error -71
[  472.750521] usb usb3-port2: attempt power cycle
[  473.266237] usb 3-2: Device not responding to setup address.
[  473.474149] usb 3-2: Device not responding to setup address.
[  473.682014] usb 3-2: device not accepting address 49, error -71
[  474.709684] usb 3-2: Device not responding to setup address.
[  474.917636] usb 3-2: Device not responding to setup address.
[  475.125480] usb 3-2: device not accepting address 50, error -71
[  475.125551] usb usb3-port2: unable to enumerate USB device
[  475.581560] usb 3-2: new SuperSpeed Gen 1 USB device number 51 using xhci_hcd
----[ end ]----
Comment 1 Chenzi Cao 2020-11-04 11:51:28 UTC
Assign to kernel team to take a look at this, please feel free to reassign, thanks.
Comment 2 Takashi Iwai 2020-11-05 09:32:11 UTC
So it's a device-specific problem.  As mandatory questions:
- Does the problem happen with every USB port?
- Is the power sufficient for the device?

Adding USB guys to Cc in anyway.
Comment 3 James Moe 2020-11-05 18:29:37 UTC
> - Does the problem happen with every USB port?
> - Is the power sufficient for the device?

Yes.
Probably. I have moved the disk drive to different ports: two on an add-in card for (4) usb3 ports, one on a mainboard usb3 port. Same results.
Comment 4 James Moe 2020-11-09 20:30:21 UTC
The issue is apparently caused by bad hardware.

I replaced the existing add-in card with another purchased for different computer. None of the errors listed in the OP have not recurred since the replacement.
Comment 5 James Moe 2020-11-11 19:32:06 UTC
I was incorrect about there being no more spurious messages. The following was recorded in the journal last night, the first USB-related errors in several days.

2020-11-11T06:00:02-0700 sma-station14l python3[32059]: backintime (jmoe/1): INFO: Call rsync to take the snapshot
2020-11-11T06:01:02-0700 sma-station14l systemd[1]: snapperd.service: Succeeded.
2020-11-11T06:01:06-0700 sma-station14l kernel: sd 6:0:0:0: [sdd] tag#29 uas_eh_abort_handler 0 uas-tag 30 inflight: CMD OUT 
2020-11-11T06:01:06-0700 sma-station14l kernel: sd 6:0:0:0: [sdd] tag#29 CDB: Write(10) 2a 00 3b c2 25 38 00 04 00 00
...
2020-11-11T06:01:08-0700 sma-station14l kernel: sd 6:0:0:0: [sdd] tag#0 uas_eh_abort_handler 0 uas-tag 26 inflight: CMD OUT 
2020-11-11T06:01:08-0700 sma-station14l kernel: sd 6:0:0:0: [sdd] tag#0 CDB: Write(10) 2a 00 3b c2 15 38 00 04 00 00
2020-11-11T06:01:08-0700 sma-station14l kernel: scsi host6: uas_eh_device_reset_handler start
2020-11-11T06:01:08-0700 sma-station14l kernel: usb 3-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2020-11-11T06:01:08-0700 sma-station14l kernel: scsi host6: uas_eh_device_reset_handler success

It could still be a problem with the hardware, the disk drive in this case?
Comment 6 Takashi Iwai 2020-11-16 16:28:05 UTC
That's possible, but it's difficult to judge remotely...

Just as a blind shot: could you try the kernel at
  http://download.opensuse.org/repositories/home:/tiwai:/bsc1149871-tw/standard/
?

It's a test kernel containing a revert of a USB commit that apparently caused problems for some people.  I don't think this is the same problem as yours, but just to be sure.
Comment 7 Gus Fos 2020-11-25 08:25:59 UTC
I think this is the issue that is affecting me as well. I have a USB EV Cert token from Sectigo, that I have mounted to a Windows 10 VM running in KVM. It has worked flawlessly for a long time, but after I went from October 14 update to November 19 snapshot, it has stopped working. 

This is how it is mounted to KVM and is the same config that I have run for a year:

<hostdev mode="subsystem" type="usb" managed="yes">
  <source startupPolicy="optional">
    <vendor id="0x0529"/>
    <product id="0x0620"/>
  </source>
</hostdev>


This is what I get from dmesg when I plug in the USB and then start the VM:

[55212.993512] usb 1-2: new full-speed USB device number 15 using xhci_hcd
[55213.147159] usb 1-2: New USB device found, idVendor=0529, idProduct=0620, bcdDevice= 0.01
[55213.147164] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[55213.147166] usb 1-2: Product: Token JC
[55213.147169] usb 1-2: Manufacturer: SafeNet
[55289.253516] virbr0: port 1(vnet5) entered blocking state
[55289.253520] virbr0: port 1(vnet5) entered disabled state
[55289.253666] device vnet5 entered promiscuous mode
[55289.254320] virbr0: port 1(vnet5) entered blocking state
[55289.254324] virbr0: port 1(vnet5) entered listening state
[55291.269169] virbr0: port 1(vnet5) entered learning state
[55293.285164] virbr0: port 1(vnet5) entered forwarding state
[55293.285166] virbr0: topology change detected, propagating
[55294.497605] usb 1-2: reset full-speed USB device number 15 using xhci_hcd
[55295.509558] usb 1-2: reset full-speed USB device number 15 using xhci_hcd
[55296.681547] usb 1-2: reset full-speed USB device number 15 using xhci_hcd
[55297.961482] usb 1-2: reset full-speed USB device number 15 using xhci_hcd

And then Windows reports Device Manager reports it as "Unknown USB Deice (Invalid Configuration Descriptor)" Code 43.

I have tested it directly on a Windows 10 laptop and then it works fine. Also tested different ports and same issue on all ports, and other USB devices still work fine on all ports, so it is not hardware related. I have also tested to change USB controllers and different configs for the VM, but nothing helps. 

So I'm pretty sure this must be from an update during that time period that have broken this.
Comment 8 Gus Fos 2020-11-25 08:58:30 UTC
I just tested to update to snapshot 20201123, and the issue is still there.
Comment 9 Takashi Iwai 2020-11-25 09:09:59 UTC
(In reply to Gus Fos from comment #8)
> I just tested to update to snapshot 20201123, and the issue is still there.

Did you test the kernel in comment 6?
Comment 10 Gus Fos 2020-11-25 09:15:52 UTC
Unfortunately, I don't have the knowledge to do that and I only have one machine running Tumbleweed which is my work machine. So can't really experiment on it :(
Comment 11 James Moe 2020-11-25 17:47:59 UTC
> Just as a blind shot: could you try the kernel at
>  http://download.opensuse.org/repositories/home:/tiwai:/bsc1149871-tw/standard/
> ?
Well, no. This host is not for that level of experimentation (even though I have Tumbleweed). I have had bad luck with experimental installations: Tricky to install, and do not un-install cleanly.
Comment 12 Gus Fos 2020-12-10 18:03:48 UTC
Just updated to 20201207 and this issue is still present.
Comment 13 Miroslav Beneš 2022-01-21 12:51:23 UTC
Forgotten one, sorry about that. Is the issue still present with the latest TW snapshot?
Comment 14 James Moe 2022-01-21 18:36:11 UTC
tumbleweed 20220113

I do not know when the errors stopped.
It is no longer an issue here with any drive.
Comment 15 Miroslav Beneš 2022-01-24 08:50:06 UTC
Thanks for the feedback. Closing then.