Bug 1217492

Summary: KVM vm client can no longer use a USB pass through device
Product: [openSUSE] openSUSE Distribution
Reporter: P <opensuse>
Component: KVM
Assignee: Lin Ma <lma>
Status: NEW ---
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: dfaggioli, lma, meissner, opensuse, qe-virt, xlai
Version: Leap 15.5
Target Milestone: ---
Hardware: x86-64
OS: openSUSE Leap 15.5
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: VM configuration using Conbee II as a serial device.
VM configuration using Conbee II as a USB device.
/var/log/libvirt/qemu/USB_Test.log
/var/log/libvirt/qemu/USB_Test.log
lsusb commands output
A test program
test program v2
The systemtap script v1 for debug
script-v1.stp.log
test_libusb_get_active_config_desc-v2.log

Description P 2023-11-25 05:47:43 UTC
After upgrading a VM host from Leap 15.2 to Leap 15.5 a VM client can no longer use a pass-through USB device. The syslog of the VM host is flooded with:

> nov 08 19:59:46 vmhost kernel: usb 1-3: usbfs: process 1729 (qemu-system-x86) did not claim interface 1 before use

The message repeats at one-second intervals.

Because of the upgrade the vm host has gotten both a new kernel version and a new libvirt version. The vm client has not been touched.

The USB device in question is a Conbee II Zigbee stick (serial device).

I progressed through the normal upgrade path (all in one day):
- Leap 15.2: USB pass through: OK (this was my baseline)
- Leap 15.3: Not checked
- Leap 15.4: USB pass through: NOT OK
- Leap 15.5: USB pass through: NOT OK

I've checked with the libvirt developers (see https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/57HHC4PMSM7LPGCDGJTHJWA3FV4UDAMA/), and it was suggested the openSUSE kernel is at fault.
Comment 1 Dario Faggioli 2023-11-27 13:49:57 UTC
OK, can you show us the configuration of the VM?

Can you also try to create a new VM (assuming that the one that is having the problem has not been re-created, which is fine), assign a USB device to it, and see if it works?
Comment 2 P 2024-01-21 14:01:31 UTC
My apologies for the late reply. I switched ISP and had an email blackout. So I missed the email triggered from your update. I just noticed your question.
I need some time to provide the information you've asked for.
Comment 3 Lin Ma 2024-01-26 06:18:49 UTC
(In reply to P from comment #0)
> ......
> The USB device in question is a Conbee II Zigbee stick (serial device).
> 
> I progressed through the normal upgrade path (all in one day):
> - Leap 15.2: USB pass through: OK (this was my baseline)
> - Leap 15.3: Not checked
> - Leap 15.4: USB pass through: NOT OK
> - Leap 15.5: USB pass through: NOT OK

I set up a fresh Leap 15.5 VM with a passed-through USB mass storage device on a Leap 15.5 host. The USB mass storage device works well in the Leap 15.5 VM.
So the issue seems to be specific to certain USB devices (say, USB serial devices) rather than a generic issue.

We know that libusb uses usbfs to communicate with USB devices.
As a userspace application, qemu uses libusb to access USB devices.
IMO libusb is the most likely suspect, and qemu is the second.

The default libusb version in Leap 15.5 is v1.0.24; the default in Leap 15.2 is v1.0.21.
I built libusb v1.0.21 for Leap 15.5 in my home repo through OBS.
Could you add my repo https://download.opensuse.org/repositories/home:/lin_ma:/branches:/openSUSE:/Leap:/15.5/standard/ to your Leap 15.5 host, then downgrade your current libusb to v1.0.21 and test again?

The expected results are:
1. The Conbee II Zigbee stick can be passed-through successfully into your VM.
2. The Conbee II Zigbee stick is functional in your VM.


If the above test gives a negative result, could you please run the test below?
On the host:
IIRC, the cdc_acm module usually claims the USB serial interface(s) on the host.
Could you please try to unload the cdc_acm module, or blacklist it, so that it does not claim the Conbee II Zigbee stick on your Leap 15.5 host, then start your VM and test?
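
A minimal sketch of the host-side check, assuming the stick shows up as USB device 1-3 (the name in the kernel message quoted in the description) and that its interfaces follow the usual sysfs naming 1-3:1.0 and 1-3:1.1. The helper name bound_driver and the blacklist file name are illustrative and do not come from this report.

```shell
# Print the kernel driver bound to a USB interface, or "none" if unbound.
# $1: interface name, e.g. 1-3:1.0 (assumed from the "usb 1-3" kernel message)
# $2: sysfs root, defaults to /sys (overridable so the helper can be tried on a fake tree)
bound_driver() {
    link="${2:-/sys}/bus/usb/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Possible use on the host (as root), before starting the VM:
#   bound_driver 1-3:1.0      # shows which host driver currently holds the interface
#   modprobe -r cdc_acm       # unload the module for the current boot
#   echo "blacklist cdc_acm" > /etc/modprobe.d/50-blacklist-cdc_acm.conf   # hypothetical file name
#   bound_driver 1-3:1.0      # expected to report "none" once cdc_acm is gone
```

Checking the driver link before and after makes it easy to confirm that the module really released the interfaces before the VM is started.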
Comment 4 P 2024-01-26 10:26:22 UTC
(In reply to Lin Ma from comment #3)
> Could you add my repo
> https://download.opensuse.org/repositories/home:/lin_ma:/branches:/openSUSE:/
> Leap:/15.5/standard/ in your leap 15.5 host, then downgrade your current
> libusb to v1.0.21 and test again?

Unfortunately that didn't work. I couldn't get the vm to start with the downgraded 1.0.21:

> jan 26 11:23:48 vmhost libvirtd[1651]: Unable to read from monitor: Verbinding is weggevallen [Dutch: "connection lost"]
> jan 26 11:23:48 vmhost libvirtd[1651]: internal error: qemu unexpectedly closed the monitor: 2024-01-26T10:23:48.875637Z qemu-system-x86_64: -device {"driver":"usb-host","hostdevice":"/dev/bus/usb/001/003","id":"hostdev0","bus":"usb.0","port":"4"}: failed to open module: /usr/bin/../lib64/qemu/hw-usb-host.so: undefined symbol: libusb_set_option
>                                        2024-01-26T10:23:48.875721Z qemu-system-x86_64: -device {"driver":"usb-host","hostdevice":"/dev/bus/usb/001/003","id":"hostdev0","bus":"usb.0","port":"4"}: 'usb-host' is not a valid device model name
Comment 5 P 2024-01-26 13:33:29 UTC
(In reply to Dario Faggioli from comment #1)
> OK, can you show us the configuration of the VM?
> 
> Can you also try to create a new VM (assuming that the one that is having
> the problem has not been re-created, which is fine), assign a USB device to
> it, and see if it works?

I've attached the configuration as homeassistant.xml.
With help from the interwebs I was able to connect the Conbee II to the VM as a serial device (you'll see evidence of that in the configuration file). That did the trick as a workaround, so in the VM I can now use the Conbee II with a specific serial configuration setting in HomeAssistant.

I've created a new VM with the latest release of HomeAssistant. The outcome is roughly the same. Even though the log is no longer flooded, a nearly identical error message is generated multiple times*, and the Conbee II cannot be used from within the VM. I've attached this configuration file as USB_Test.xml.

*)
> jan 26 13:14:49 vmhost kernel: usb 1-3: reset full-speed USB device number 3 using xhci_hcd
> jan 26 13:14:49 vmhost kernel: cdc_acm 1-3:1.0: ttyACM0: USB ACM device
> jan 26 13:14:52 vmhost kernel: usb 1-3: usbfs: process 1688 (CPU 0/KVM) did not claim interface 0 before use
> jan 26 13:14:52 vmhost kernel: usb 1-3: usbfs: process 1688 (CPU 0/KVM) did not claim interface 0 before use
> jan 26 13:14:52 vmhost kernel: usb 1-3: usbfs: process 1688 (CPU 0/KVM) did not claim interface 0 before use
> jan 26 13:14:52 vmhost kernel: usb 1-3: usbfs: process 1688 (CPU 0/KVM) did not claim interface 0 before use
> jan 26 13:14:52 vmhost kernel: usb 1-3: usbfs: process 1688 (CPU 0/KVM) did not claim interface 0 before use
> jan 26 13:14:52 vmhost kernel: usb 1-3: usbfs: process 1688 (CPU 0/KVM) did not claim interface 0 before use
Comment 6 P 2024-01-26 13:35:28 UTC
Created attachment 872228 [details]
VM configuration using Conbee II as a serial device.
Comment 7 P 2024-01-26 13:36:01 UTC
Created attachment 872230 [details]
VM configuration using Conbee II as a USB device.
Comment 8 P 2024-01-27 10:31:43 UTC
There are some related posts which you might find interesting. The common factor is the Conbee II:

search://"conbee" libusb did not claim interface 0 before use
Comment 9 Lin Ma 2024-01-27 12:55:16 UTC
(In reply to P from comment #7)
> Created attachment 872230 [details]
> VM configuration using Conbee II as a USB device.

Could you please capture the trace data and upload it?

1. Make sure the VM USB_Test is off.
2. On the host, run the commands below as a privileged user:
lsusb -t
virsh start USB_Test --paused
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_claim_interface on"
virsh resume USB_Test
lsusb -t

Then upload the outputs of the two lsusb commands and the /var/log/libvirt/qemu/USB_Test.log

TIA
Comment 10 P 2024-01-27 21:42:03 UTC
(In reply to Lin Ma from comment #9)

> Could you please help to capture the trace data and upload it?


lsusb -t (pre-start):
> vmhost:~ # lsusb -t
> /:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/7p, 5000M
> /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/9p, 480M
>     |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
>     |__ Port 3: Dev 3, If 0, Class=Communications, Driver=cdc_acm, 12M
>     |__ Port 3: Dev 3, If 1, Class=CDC Data, Driver=cdc_acm, 12M
>     |__ Port 4: Dev 4, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M

Attached: /var/log/libvirt/qemu/USB_Test.log

lsusb -t (post-resume):
> vmhost:~ # lsusb -t
> /:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/7p, 5000M
> /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/9p, 480M
>     |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
>     |__ Port 3: Dev 3, If 0, Class=Communications, Driver=cdc_acm, 12M
>     |__ Port 3: Dev 3, If 1, Class=CDC Data, Driver=cdc_acm, 12M
>     |__ Port 4: Dev 4, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
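
The two listings can also be compared mechanically. The following is a small sketch, assuming the `lsusb -t` line format shown above (other usbutils versions may print it differently); the helper name drivers_for_dev is illustrative. It extracts the interface number and bound driver for one device number.

```shell
# Read `lsusb -t` output on stdin and print "<interface> <driver>" for device number $1.
# Root hub lines have no "If N" field and are skipped.
drivers_for_dev() {
    sed -n "s/.*Dev $1, If \([0-9][0-9]*\),.*Driver=\([^,]*\),.*/\1 \2/p"
}

# Sample input copied from the listing above (Conbee II is Dev 3).
drivers_for_dev 3 <<'EOF'
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/9p, 480M
    |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
    |__ Port 3: Dev 3, If 0, Class=Communications, Driver=cdc_acm, 12M
    |__ Port 3: Dev 3, If 1, Class=CDC Data, Driver=cdc_acm, 12M
EOF
# prints:
# 0 cdc_acm
# 1 cdc_acm
```

Both listings give the same result for Dev 3, i.e. both interfaces are still bound to cdc_acm after the VM has been resumed.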
Comment 11 P 2024-01-27 21:43:23 UTC
Created attachment 872240 [details]
/var/log/libvirt/qemu/USB_Test.log
Comment 12 Lin Ma 2024-01-29 08:42:08 UTC
Hmm... It looks like the relevant code paths have not been touched at all: the Conbee II stick hasn't been unbound from the cdc_acm driver in the vmhost kernel, and qemu hasn't invoked the function usb_host_claim_interface either.
Could you please run the test below to collect more trace data?

1. Make sure the vm USB_Test is off.
2. Create a usb hostdev xml like this:
e.g: cat hostdev_usb_ConbeeII.xml
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1cf1'/>
        <product id='0x0030'/>
      </source>
    </hostdev>

3. Remove the hostdev section from vm USB_Test configuration.

4. virsh start USB_Test

5. Enable the trace events below:
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_close on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_reset on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_auto_scan_enabled on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_open_* on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_parse_* on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_detach_kernel on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_attach_kernel on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_claim_interface on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_release_interface on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_set_* on"
virsh qemu-monitor-command USB_Test --hmp "trace-event usb_host_req_* on"

6. Run the commands below:
lsusb -d 1cf1:0030 -v
lsusb -t
virsh attach-device USB_Test hostdev_usb_ConbeeII.xml
lsusb -t

7. Attach the outputs of step#6 and the /var/log/libvirt/qemu/USB_Test.log
Comment 13 P 2024-01-30 17:44:52 UTC
Thanks for your effort!

Please find attached:
- /var/log/libvirt/qemu/USB_Test.log
- lsusb.log
Comment 14 P 2024-01-30 17:45:46 UTC
Created attachment 872322 [details]
/var/log/libvirt/qemu/USB_Test.log
Comment 15 P 2024-01-30 17:46:44 UTC
Created attachment 872323 [details]
lsusb commands output
Comment 16 Lin Ma 2024-02-01 09:37:12 UTC
Created attachment 872366 [details]
A test program
Comment 17 Lin Ma 2024-02-01 09:40:58 UTC
(In reply to P from comment #14)
> Created attachment 872322 [details]
> /var/log/libvirt/qemu/USB_Test.log

Thanks for the information.

The USB_Test.log seems to be missing some important and basic trace events, e.g. usb_host_detach_kernel, usb_host_claim_interface and usb_host_parse_config.

The likely reason is:
When qemu tried to get the configuration descriptor for this USB device (1cf1:0030) by calling libusb_get_active_config_descriptor, the function returned a non-zero value.
If this guess is correct, something goes wrong while libusb reads the basic information of this particular USB device, and we should look for a possible fix on the libusb side.

Could you please run this test and report the output?
1. Make sure that the packages gcc and libusb-1_0-devel are installed on your Leap 15.5 host, and that the libusb version is v1.0.24.
2. Compile the attached .c file and run the generated binary as a privileged user:
gcc test_libusb_get_active_config_desc.c -o test_libusb_get_active_config_desc -lusb-1.0 && \
./test_libusb_get_active_config_desc


The correct output looks like:
rc = 0
Active Configuration:
  Configuration Value: 1
  Number of Interfaces: 1

But for your USB device (1cf1:0030) with libusb v1.0.24, I expect the output to look like:
rc = a negative number
Error getting active configuration descriptor
Comment 18 P 2024-02-01 19:36:48 UTC
I'm afraid the challenges are greater than expected. The output is:

> vmhost:~/Bugzilla_1217492 # ./test_libusb_get_active_config_desc 
> LIBUSB_SUCCESS             =   0
> LIBUSB_ERROR_IO            =  -1
> LIBUSB_ERROR_INVALID_PARAM =  -2
> LIBUSB_ERROR_ACCESS        =  -3
> LIBUSB_ERROR_NO_DEVICE     =  -4
> LIBUSB_ERROR_NOT_FOUND     =  -5
> LIBUSB_ERROR_BUSY          =  -6
> LIBUSB_ERROR_TIMEOUT       =  -7
> LIBUSB_ERROR_OVERFLOW      =  -8
> LIBUSB_ERROR_PIPE          =  -9
> LIBUSB_ERROR_INTERRUPTED   = -10
> LIBUSB_ERROR_NO_MEM        = -11
> LIBUSB_ERROR_NOT_SUPPORTED = -12
> LIBUSB_ERROR_OTHER         = -99
> rc = 0
> Active Configuration:
>   Configuration Value: 1
>   Number of Interfaces: 2

Just to confirm:
- My vmhost is a bare minimum configuration. So I can't run development stuff on it. I set up a development vm specifically for this test. Installed from the Leap 15.5 DVD ISO, zypper update 1 hour ago.  I've copied the compiled binary to vmhost.

> dev@dev:~/Bugzilla_1217492> rpm -qa|grep libusb-1_0-devel
> libusb-1_0-devel-1.0.24-150400.3.3.1.x86_64

> vmhost:~/Bugzilla_1217492 # rpm -qa|grep libusb-1
> libusb-1_0-0-1.0.24-150400.3.3.1.x86_64


- The Conbee II is the only USB device attached to vmhost during the test. It was not attached to any VM and was in exclusive use by vmhost after a clean reboot.
Comment 19 Lin Ma 2024-02-06 13:24:05 UTC
(In reply to P from comment #18)

Hmm...
Well, hopefully the following test(s) will give us some insight.
Comment 20 Lin Ma 2024-02-06 13:24:54 UTC
Created attachment 872500 [details]
test program v2
Comment 21 Lin Ma 2024-02-06 13:27:48 UTC
1. Make sure the file /etc/qemu/bridge.conf contains below line:
vmhost:~ # cat /etc/qemu/bridge.conf
allow all

2. Bypass libvirt and launch the VM USB_Test with the Conbee II stick directly via qemu, as root:
#!/bin/bash
USB_VID="1cf1"
USB_PID="0030"
IMAGE="/var/lib/libvirt/images/USB_Test.qcow2"
IMAGE_FORMAT="qcow2"
MEM_SIZE="2048"
MAC="52:54:00:80:82:48"
USB_BUS=`lsusb -d $USB_VID:$USB_PID | awk -F '[ :]' '{print $2}'`
USB_DEV=`lsusb -d $USB_VID:$USB_PID | awk -F '[ :]' '{print $4}'`
USB_HOSTDEV="/dev/bus/usb/$USB_BUS/$USB_DEV"
/usr/bin/qemu-system-x86_64 \
-name guest=USB_Test \
-blockdev '{"driver":"file","filename":"/usr/share/qemu/ovmf-x86_64-4m-code.bin","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/USB_Test_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-i440fx-7.1,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \
-m $MEM_SIZE \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":2147483648}' \
-overcommit mem-lock=off \
-accel kvm \
-cpu host \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid fc739108-8877-417c-8f28-ca94bce999bb \
-no-user-config \
-nodefaults \
-boot strict=on \
-drive file=$IMAGE,if=none,id=drive-virtio-disk0,format=$IMAGE_FORMAT \
-device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-netdev bridge,id=hostnet0,br=br0 \
-device e1000,netdev=hostnet0,id=net0,mac=$MAC,bus=pci.0,addr=0x3 \
-vnc 127.0.0.1:1 \
-monitor stdio \
-device virtio-vga,id=video0 \
-device qemu-xhci,id=usb \
-device usb-host,hostdevice=$USB_HOSTDEV,id=hostdev0

3. On your vmhost, connect to vm USB_Test by "vncviewer :1" to observe

The expected results are:
1. The Conbee II stick can be passed-through successfully into vm USB_Test.
2. The Conbee II stick is functional in vm USB_Test.
----------------------------------------------------------------------------
If the above test passes, the test below is unnecessary.
If not, please compile the attached test_libusb_get_active_config_desc-v2.c and run the generated binary as root.
Unlike v1, the v2 test program opens the USB device in exactly the same way that qemu opens it.
I expect an error to occur while v2 tries to get the active config descriptor.

1. Look up the bus ID and device ID with 'lsusb -d 1cf1:0030', e.g.:
Bus 001 Device 003: ID 1cf1:0030 Dresden Elektronik ZigBee gateway [ConBee II]

2. ./test_libusb_get_active_config_desc-v2 001 003
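
Both the launch script and the v2 test program need the bus and device numbers from the `lsusb -d` line. As a sketch, the same awk field splitting used in the script can be wrapped in a helper (the name usb_dev_node is illustrative) that also prints the matching usbfs device node:

```shell
# Read one line of `lsusb -d VID:PID` output on stdin and print
# "<bus> <dev> /dev/bus/usb/<bus>/<dev>".
# Splitting on spaces and colons makes field 2 the bus and field 4 the device number.
usb_dev_node() {
    awk -F '[ :]' 'NR == 1 { printf "%s %s /dev/bus/usb/%s/%s\n", $2, $4, $2, $4 }'
}

# Sample line copied from step 1 above.
echo 'Bus 001 Device 003: ID 1cf1:0030 Dresden Elektronik ZigBee gateway [ConBee II]' | usb_dev_node
# prints: 001 003 /dev/bus/usb/001/003
```

On the host this would be fed from the real command, e.g. `lsusb -d 1cf1:0030 | usb_dev_node`, and the first two fields passed to the v2 test program.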
Comment 22 P 2024-02-09 08:29:58 UTC
As per your instructions:
1. Done
2. Done
3. Unfortunately no changes. The Conbee II cannot be used. During startup of the vm, vmhost shows the following journal entries:

> feb 09 08:38:49 vmhost (udev-worker)[1978]: Using default interface naming scheme 'sle15-sp4'.
> feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered blocking state
> feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered disabled state
> feb 09 08:38:49 vmhost kernel: device tap0 entered promiscuous mode
> feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered blocking state
> feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered forwarding state
> feb 09 08:38:55 vmhost kernel: usb 1-3: reset full-speed USB device number 3 using xhci_hcd
> feb 09 08:38:55 vmhost kernel: cdc_acm 1-3:1.0: ttyACM0: USB ACM device
> feb 09 08:38:58 vmhost kernel: usb 1-3: usbfs: process 1983 (qemu-system-x86) did not claim interface 0 before use

Compiling the test program doesn't work. I used your compile instructions from your earlier comment and replaced the filenames. The following error messages are displayed:

> dev@dev:~/src/Bugzilla_1217492> gcc test_libusb_get_active_config_desc-v2.c -o test_libusb_get_active_config_desc-v2
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: /tmp/ccmLNxK5.o: in function `main':
> test_libusb_get_active_config_desc-v2.c:(.text+0x233): undefined reference to `libusb_init'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x282): undefined reference to `libusb_set_option'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x2e5): undefined reference to `libusb_wrap_sys_device'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x31d): undefined reference to `libusb_get_device'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x32d): undefined reference to `libusb_get_bus_number'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x33f): undefined reference to `libusb_get_device_address'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x372): undefined reference to `libusb_get_active_config_descriptor'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x3fd): undefined reference to `libusb_free_config_descriptor'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x409): undefined reference to `libusb_reset_device'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x415): undefined reference to `libusb_close'
> /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: test_libusb_get_active_config_desc-v2.c:(.text+0x424): undefined reference to `libusb_exit'
> collect2: error: ld returned 1 exit status
Comment 23 Lin Ma 2024-02-10 15:45:58 UTC
(In reply to P from comment #22)
> As per your instructions:
> 1. Done
> 2. Done
> 3. Unfortunately no changes. The Conbee II cannot be used. During startup of
> the vm, vmhost shows the following journal entries:
> 
> > feb 09 08:38:49 vmhost (udev-worker)[1978]: Using default interface naming scheme 'sle15-sp4'.
> > feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered blocking state
> > feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered disabled state
> > feb 09 08:38:49 vmhost kernel: device tap0 entered promiscuous mode
> > feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered blocking state
> > feb 09 08:38:49 vmhost kernel: br0: port 2(tap0) entered forwarding state
> > feb 09 08:38:55 vmhost kernel: usb 1-3: reset full-speed USB device number 3 using xhci_hcd
> > feb 09 08:38:55 vmhost kernel: cdc_acm 1-3:1.0: ttyACM0: USB ACM device
> > feb 09 08:38:58 vmhost kernel: usb 1-3: usbfs: process 1983 (qemu-system-x86) did not claim interface 0 before use

ok, got it.

> 
> Compiling the test program doesn't work. I used your compile instructions
> from your earlier comment and replaced the filenames. The following error
> messages are displayed:
> 

It seems you forgot the '-lusb-1.0' in the gcc command line. It should be:
gcc test_libusb_get_active_config_desc-v2.c \
-o test_libusb_get_active_config_desc-v2 -lusb-1.0
Comment 24 P 2024-02-10 16:19:31 UTC
Thanks for the additional help. The output is as follows:

> vmhost:~/Bugzilla_1217492 # lsusb -d 1cf1:0030
> Bus 001 Device 003: ID 1cf1:0030 Dresden Elektronik ZigBee gateway [ConBee II]
> vmhost:~/Bugzilla_1217492 # ./test_libusb_get_active_config_desc-v2 001 003
> hostfd: 7, Bus_num: 1, Addr: 3
> libusb: error [op_get_active_config_descriptor] device unconfigured
> rc = -5
> Error getting active configuration descriptor
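
For reference, the numeric return code can be decoded with the table that the v1 test program printed in comment 18. A minimal sketch (the helper name libusb_rc_name is illustrative; the values are copied from that output):

```shell
# Map a libusb return code to its symbolic name, using the table from comment 18.
libusb_rc_name() {
    case "$1" in
        0)   echo LIBUSB_SUCCESS ;;
        -1)  echo LIBUSB_ERROR_IO ;;
        -2)  echo LIBUSB_ERROR_INVALID_PARAM ;;
        -3)  echo LIBUSB_ERROR_ACCESS ;;
        -4)  echo LIBUSB_ERROR_NO_DEVICE ;;
        -5)  echo LIBUSB_ERROR_NOT_FOUND ;;
        -6)  echo LIBUSB_ERROR_BUSY ;;
        -7)  echo LIBUSB_ERROR_TIMEOUT ;;
        -8)  echo LIBUSB_ERROR_OVERFLOW ;;
        -9)  echo LIBUSB_ERROR_PIPE ;;
        -10) echo LIBUSB_ERROR_INTERRUPTED ;;
        -11) echo LIBUSB_ERROR_NO_MEM ;;
        -12) echo LIBUSB_ERROR_NOT_SUPPORTED ;;
        -99) echo LIBUSB_ERROR_OTHER ;;
        *)   echo "unknown ($1)" ;;
    esac
}

libusb_rc_name -5
# prints: LIBUSB_ERROR_NOT_FOUND
```

So rc = -5 is LIBUSB_ERROR_NOT_FOUND, which matches the "device unconfigured" message logged by libusb just before it.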
Comment 25 Lin Ma 2024-02-18 13:46:24 UTC
(In reply to P from comment #24)
> Thanks for the additional help. The output is as follows:
> 
> > vmhost:~/Bugzilla_1217492 # lsusb -d 1cf1:0030
> > Bus 001 Device 003: ID 1cf1:0030 Dresden Elektronik ZigBee gateway [ConBee II]
> > vmhost:~/Bugzilla_1217492 # ./test_libusb_get_active_config_desc-v2 001 003
> > hostfd: 7, Bus_num: 1, Addr: 3
> > libusb: error [op_get_active_config_descriptor] device unconfigured
> > rc = -5
> > Error getting active configuration descriptor

Thanks!
This result is expected, and it's very helpful.

libusb introduced the libusb_wrap_sys_device() API in v1.0.23.
Since Leap 15.4, a qemu/kvm Linux guest that is managed by libvirt uses libusb_wrap_sys_device() to access a USB host device instead of the traditional way.
I'm guessing this new API doesn't work with your Conbee II stick.

To further confirm that libusb_wrap_sys_device() triggers this issue, could you please run the following test?
In this test, we fall back to the traditional way (the way used on a Leap 15.2 vmhost) of accessing your USB host device.

1. Make sure the file /etc/qemu/bridge.conf contains below line:
vmhost:~ # cat /etc/qemu/bridge.conf
allow all

2. Bypass libvirt and launch the VM USB_Test with the Conbee II stick directly via qemu, as root:
#!/bin/bash
USB_VID="1cf1"
USB_PID="0030"
IMAGE="/var/lib/libvirt/images/USB_Test.qcow2"
IMAGE_FORMAT="qcow2"
MEM_SIZE="2048"
MAC="52:54:00:80:82:48"
USB_BUS=`lsusb -d $USB_VID:$USB_PID | awk -F '[ :]' '{print $2}'`
USB_DEV=`lsusb -d $USB_VID:$USB_PID | awk -F '[ :]' '{print $4}'`
/usr/bin/qemu-system-x86_64 \
-name guest=USB_Test \
-blockdev '{"driver":"file","filename":"/usr/share/qemu/ovmf-x86_64-4m-code.bin","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/USB_Test_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-i440fx-7.1,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \
-m $MEM_SIZE \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":2147483648}' \
-overcommit mem-lock=off \
-accel kvm \
-cpu host \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid fc739108-8877-417c-8f28-ca94bce999bb \
-no-user-config \
-nodefaults \
-boot strict=on \
-drive file=$IMAGE,if=none,id=drive-virtio-disk0,format=$IMAGE_FORMAT \
-device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-netdev bridge,id=hostnet0,br=br0 \
-device e1000,netdev=hostnet0,id=net0,mac=$MAC,bus=pci.0,addr=0x3 \
-vnc 127.0.0.1:1 \
-monitor stdio \
-device virtio-vga,id=video0 \
-device qemu-xhci,id=usb \
-device usb-host,hostbus=$USB_BUS,hostaddr=$USB_DEV,id=hostdev0

3. On your vmhost, connect to vm USB_Test by "vncviewer :1" to observe

The expected results are:
1. The Conbee II stick can be passed-through successfully into vm USB_Test.
2. The Conbee II stick is functional in vm USB_Test.
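
The only difference between this script and the one in comment 21 is the last -device line. As a sketch, the two forms can be generated side by side (the helper name usb_host_arg is illustrative). Note one deliberate deviation: the script above passes the lsusb values unchanged, which is what was tested here, while this sketch strips leading zeros from hostbus/hostaddr as a precaution in case they are parsed as octal, an assumption that has not been verified in this report.

```shell
# Build the usb-host -device argument in one of the two forms used in this report.
# $1: "fd"     -> hostdevice=<usbfs node>   (form used in comment 21)
#     "legacy" -> hostbus=/hostaddr=        (form used in the script above)
# $2, $3: bus and device numbers as printed by lsusb, e.g. 001 003
usb_host_arg() {
    case "$1" in
        fd)
            echo "usb-host,hostdevice=/dev/bus/usb/$2/$3,id=hostdev0" ;;
        legacy)
            # Precaution (assumption): drop leading zeros so the numbers read as decimal.
            bus=$(echo "$2" | sed 's/^0*\([0-9]\)/\1/')
            dev=$(echo "$3" | sed 's/^0*\([0-9]\)/\1/')
            echo "usb-host,hostbus=$bus,hostaddr=$dev,id=hostdev0" ;;
        *)
            echo "usage: usb_host_arg fd|legacy BUS DEV" >&2
            return 1 ;;
    esac
}

usb_host_arg fd 001 003       # prints: usb-host,hostdevice=/dev/bus/usb/001/003,id=hostdev0
usb_host_arg legacy 001 003   # prints: usb-host,hostbus=1,hostaddr=3,id=hostdev0
```

Keeping both forms behind one helper makes it easy to switch between the two access paths when repeating the comparison.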
Comment 26 P 2024-02-21 18:44:59 UTC
I'm very happy to report that with this modified startup sequence the Conbee II works as expected. Additionally there are no log entries at all on vmhost, with the only exception of:

> feb 21 19:25:12 vmhost kernel: usb 1-3: reset full-speed USB device number 3 using xhci_hcd

You've worked some magic here, Lin Ma. Thank you so much for your efforts digging into this problem and finding the root cause. Very much appreciated.
Comment 27 Lin Ma 2024-02-29 03:19:22 UTC
:-) My pleasure.

The repo https://download.opensuse.org/repositories/hardware/15.5/ contains the latest experimental libusb packages.

Would you be willing to upgrade libusb (1.0.24 -> 1.0.27) on your vmhost to try and see if this compatibility issue is fixed?
Note: It may have unknown risks.
Comment 28 P 2024-03-02 19:05:13 UTC
Unfortunately no luck when upgrading to libusb-1_0-0-1.0.27-lp155.74.1.x86_64
The system is absolutely trashed with
> mrt 02 19:54:32 vmhost kernel: usb 1-3: usbfs: process 1664 (qemu-system-x86) did not claim interface 1 before use
 (40-50 entries per second).

In advance I did a full zypper update to get the latest, but I didn't check if you had kernel updates or libvirt/qemu updates in your repo.
Comment 29 Lin Ma 2024-03-08 15:39:55 UTC
I dug into libusb over the past few days.
If the target USB device is passed to libusb by vendor ID and product ID (the legacy way), libusb obtains the device information by reading sysfs.
If the target USB device is passed to libusb as a file descriptor (the modern way), libusb obtains the device information by sending a control message to the device.

I suspect the cause of the issue is:
On your vmhost, when libusb sends a control message to the Conbee II, the Conbee II doesn't respond properly for some reason (e.g. a timeout; the control transfer timeout in libusb is 1000 ms), causing libusb to fail to get the active config descriptor from the device.
To verify this guess, could you please run the following test?

Steps:
1. Make sure you have a clean host environment. In other words, make sure the issue can be reproduced and the libusb version is 1.0.24.

2. Install package systemtap on the vmhost.

3. Install package libusb-1_0-0-debuginfo from the debug repository on the vmhost.
E.g:
vmhost:~ # rpm -qa | grep libusb-1_0-0
libusb-1_0-0-debuginfo-1.0.24-150400.3.3.1.x86_64
libusb-1_0-0-1.0.24-150400.3.3.1.x86_64


4. Make sure the trace points (usbfs_get_active_config and op_get_active_config_descriptor) are available.
E.g:
vmhost:~ # stap -L 'process("/usr/lib64/libusb-1.0.so.0").function("usbfs_get_active_config"), process("/usr/lib64/libusb-1.0.so.0").function("op_get_active_config_descriptor")'
process("/usr/lib64/libusb-1.0.so.0.3.0").function("op_get_active_config_descriptor@os/linux_usbfs.c:783") $dev:struct libusb_device* $buffer:void* $len:size_t $active_config:uint8_t
process("/usr/lib64/libusb-1.0.so.0.3.0").function("usbfs_get_active_config@os/linux_usbfs.c:830") $dev:struct libusb_device* $fd:int $active_config:uint8_t $ctrl:struct usbfs_ctrltransfer $__func__:char const[] const


5. In terminal A (wait for the script to compile and start running): vmhost:~ # stap -v ./script-v1.stp

6. In terminal B: vmhost:~ # ./test_libusb_get_active_config_desc-v2 001 003

7. Report back the output from terminal A and terminal B.


Lin
Comment 30 Lin Ma 2024-03-09 03:03:53 UTC
Created attachment 873363 [details]
The systemtap script v1 for debug
Comment 31 P 2024-03-09 10:12:37 UTC
I've run the scripts and attached the output.
Comment 32 P 2024-03-09 10:13:44 UTC
Created attachment 873372 [details]
script-v1.stp.log
Comment 33 P 2024-03-09 10:14:13 UTC
Created attachment 873373 [details]
test_libusb_get_active_config_desc-v2.log
Comment 34 Lin Ma 2024-03-10 08:21:53 UTC
Thanks for testing!

According to the stap output, after usbfs_get_active_config is invoked, the value of active_config is 0 and the value of r is 1 (the ioctl itself succeeded).
This means the Conbee II reports 0 when its bConfigurationValue is queried with a control message sent through ioctl from Linux userspace.
The value 0 is incorrect, or at least strange, because the actual value is 1 according to comment#15.
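
For reference, the query in question is the standard USB GET_CONFIGURATION request. The sketch below builds its 8-byte setup packet and interprets the 1-byte reply. The field values follow the USB 2.0 specification (chapter 9); the struct, macro, and function names are invented for this example and are not libusb or kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_REQTYPE_DEVICE_IN      0x80 /* device-to-host, standard, device */
#define SKETCH_REQ_GET_CONFIGURATION  0x08 /* standard request code */

/* Illustrative layout of a USB setup packet. */
struct usb_setup_packet {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
};

/* Build the GET_CONFIGURATION request: no value, no index, 1 data byte. */
struct usb_setup_packet make_get_configuration_request(void)
{
    struct usb_setup_packet pkt;

    pkt.bmRequestType = SKETCH_REQTYPE_DEVICE_IN;
    pkt.bRequest = SKETCH_REQ_GET_CONFIGURATION;
    pkt.wValue = 0;
    pkt.wIndex = 0;
    pkt.wLength = 1;
    return pkt;
}

/* Serialize the packet as it appears on the wire (little-endian fields). */
void encode_setup_packet(const struct usb_setup_packet *pkt, uint8_t out[8])
{
    assert(pkt != 0 && out != 0);
    out[0] = pkt->bmRequestType;
    out[1] = pkt->bRequest;
    out[2] = (uint8_t)(pkt->wValue & 0xff);
    out[3] = (uint8_t)(pkt->wValue >> 8);
    out[4] = (uint8_t)(pkt->wIndex & 0xff);
    out[5] = (uint8_t)(pkt->wIndex >> 8);
    out[6] = (uint8_t)(pkt->wLength & 0xff);
    out[7] = (uint8_t)(pkt->wLength >> 8);
}

/* Per the USB spec, a reply of 0 means "not configured"; any other value
 * is the active bConfigurationValue. Returns -1 for "not configured". */
int interpret_get_configuration_reply(uint8_t reply)
{
    return reply == 0 ? -1 : (int)reply;
}
```

A reply of 0 from a device whose sysfs entry shows configuration 1 is therefore read as "unconfigured", which matches the stap result above.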

We know that ioctl calls are typically used for interaction between userspace and the kernel.
When querying the device configuration, libusb reads sysfs in the legacy case and sends an ioctl in the modern case.
Why is the value in sysfs (1) correct, while libusb got the wrong value (0) here?
Perhaps because the value 1 was obtained by the USB transport layer in kernel space rather than through an ioctl from userspace.

I don't know which USB controller the Conbee II uses, and I couldn't find a datasheet for the Conbee II either via Google or on the manufacturer's web site (dresden elektronik), so I can't confirm whether its USB controller has certain limitations or defects.

I can only make the following guesses:
Case A: the USB controller used by the Conbee II can't respond correctly to this control message when it comes from an ioctl in userspace.
Case B: the USB controller used by the Conbee II can't respond correctly to multiple kinds of requests that come from an ioctl in userspace.



I submitted a temporary patch that tries to work around this issue by hardcoding bConfigurationValue to 1 for the Conbee II Zigbee stick.
If you'd like, please force-install libusb from the repo below and then run two tests:
https://download.opensuse.org/repositories/home:/lin_ma:/branches:/openSUSE:/Leap:/15.5/pool-leap-15.5/

Test 1. vmhost:~ # env LIBUSB_DEBUG=3 ./test_libusb_get_active_config_desc-v2 001 003
Expected:
* No warning messages occur.
* The error message below no longer occurs:
    libusb: error [op_get_active_config_descriptor] device unconfigured
    rc = -5
  Instead, the output should look like:
    hostfd: 7, Bus_num: 1, Addr: 3
    rc = 0
    Active Configuration:
      Configuration Value: 1
      Number of Interfaces: 2


Test 2. Launch the VM USB_Test via libvirt (using the Conbee II as a USB hostdev rather than as a serial device).
If the Conbee II is passed through successfully and is functional in the VM USB_Test, we hit case A; otherwise we hit case B.
Even if the temporary patch works around this issue, I'm afraid we won't carry such a hardcoded patch for a specific device.
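
The idea behind such a workaround can be sketched as a per-device quirk table. This is only an illustration of the approach, not the patch in the repo above: the table, the function name apply_config_quirk, and the vendor/product IDs 0xAAAA/0xBBBB are placeholders (take the real IDs from lsusb).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry: force a configuration value for one device. */
struct config_quirk {
    uint16_t vendor_id;
    uint16_t product_id;
    uint8_t  forced_config;
};

/* Placeholder IDs, NOT the real Conbee II IDs. */
static const struct config_quirk config_quirks[] = {
    { 0xAAAA, 0xBBBB, 1 },
};

/* If the device reported configuration 0 and it is listed in the quirk
 * table, return the forced value; otherwise return what was reported. */
int apply_config_quirk(uint16_t vendor_id, uint16_t product_id, int reported)
{
    size_t i;

    if (reported != 0)
        return reported;    /* a non-zero answer is trusted as-is */

    for (i = 0; i < sizeof(config_quirks) / sizeof(config_quirks[0]); i++) {
        if (config_quirks[i].vendor_id == vendor_id &&
            config_quirks[i].product_id == product_id)
            return config_quirks[i].forced_config;
    }
    return reported;
}
```

Because every affected device needs its own table entry, this kind of override does not scale, which is why it is only suitable as a test aid.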

Conclusion:
The Conbee II is incompatible with the modern usage (ioctl) of libusb. Hopefully the Conbee III or newer doesn't have this issue.
Comment 35 P 2024-03-10 11:30:41 UTC
It must be case A. Both tests run fine:

> vmhost:~/Bugzilla_1217492 # env LIBUSB_DEBUG=3 ./test_libusb_get_active_config_desc-v2 001 003
> hostfd: 7, Bus_num: 1, Addr: 3
> rc = 0
> Active Configuration:
>   Configuration Value: 1
>   Number of Interfaces: 2

The vm also takes the Conbee II as a pass through device, without any error messages in the system log of vmhost. Your diagnosis seems to be spot on (sadly for me).

I'll file a report with Dresden Electronik, referencing this bug. Hopefully it leads to a resolution.
Comment 36 Lin Ma 2024-03-10 14:31:21 UTC
(In reply to P from comment #35)
> It must be case A. Both tests run fine:
> 
> > vmhost:~/Bugzilla_1217492 # env LIBUSB_DEBUG=3 ./test_libusb_get_active_config_desc-v2 001 003
> > hostfd: 7, Bus_num: 1, Addr: 3
> > rc = 0
> > Active Configuration:
> >   Configuration Value: 1
> >   Number of Interfaces: 2
> 
> The vm also takes the Conbee II as a pass through device, without any error
> messages in the system log of vmhost. Your diagnosis seems to be spot on
> (sadly for me).
> 
> I'll file a report with Dresden Electronik, referencing this bug. Hopefully
> it leads to a resolution.

OK.
For reference, the following is the relevant code snippet showing how libusb sends this control message.
The code comes from the git master of upstream libusb.

libusb/os/linux_usbfs.c:
/* send a control message to retrieve active configuration */
static int usbfs_get_active_config(struct libusb_device *dev, int fd)
{
        struct linux_device_priv *priv = usbi_get_device_priv(dev);
        uint8_t active_config = 0;
        int r;

        struct usbfs_ctrltransfer ctrl = {
                .bmRequestType = LIBUSB_ENDPOINT_IN,
                .bRequest = LIBUSB_REQUEST_GET_CONFIGURATION,
                .wValue = 0,
                .wIndex = 0,
                .wLength = 1,
                .timeout = 1000,
                .data = &active_config
        };

        r = ioctl(fd, IOCTL_USBFS_CONTROL, &ctrl);
        if (r < 0) {
                if (errno == ENODEV)
                        return LIBUSB_ERROR_NO_DEVICE;

                /* we hit this error path frequently with buggy devices :( */
                usbi_warn(DEVICE_CTX(dev), "get configuration failed, errno=%d", errno);

                /* assume the current configuration is the first one if we have
                 * the configuration descriptors, otherwise treat the device
                 * as unconfigured. */
                if (priv->config_descriptors)
                        priv->active_config = (int)priv->config_descriptors[0].desc->bConfigurationValue;
                else
                        priv->active_config = -1;
        } else if (active_config == 0) {
                if (dev_has_config0(dev)) {
                        /* some buggy devices have a configuration 0, but we're
                         * reaching into the corner of a corner case here. */
                        priv->active_config = 0;
                } else {
                        priv->active_config = -1;  /* <-- the Conbee II hits this path */
                }
        } else {
                priv->active_config = (int)active_config;
        }

        return LIBUSB_SUCCESS;
}
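
The branch logic above can be modeled without any hardware. The sketch below is a simplified stand-in, not libusb code: the function names are invented, the ENODEV path is left out, and the mapping of active_config == -1 to rc = -5 is inferred from the "device unconfigured" / "rc = -5" output quoted earlier in this bug.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SUCCESS           0
#define MODEL_ERROR_NOT_FOUND (-5) /* same numeric value as the rc seen in the log */

/* Model of how the cached active configuration is chosen.
 *   ioctl_result        return value of the control ioctl (< 0 on failure)
 *   reported            the byte the device sent back
 *   has_config0         true if the device really has a configuration 0
 *   first_config_value  bConfigurationValue of the first cached config
 *                       descriptor, or -1 if none are cached */
int model_active_config(int ioctl_result, uint8_t reported,
                        bool has_config0, int first_config_value)
{
    if (ioctl_result < 0)
        return first_config_value;       /* fall back to the first config */
    if (reported == 0)
        return has_config0 ? 0 : -1;     /* 0 normally means unconfigured */
    return (int)reported;
}

/* Assumed behaviour of the descriptor lookup: an active configuration
 * of -1 is reported as "device unconfigured". */
int model_get_active_config_descriptor(int active_config)
{
    return active_config == -1 ? MODEL_ERROR_NOT_FOUND : MODEL_SUCCESS;
}
```

With the values from the stap log (r = 1, active_config = 0, no configuration 0), the model yields -1 and then -5. Note that a failed ioctl would have fallen back to the first configuration, so a device that answers successfully with 0 ends up worse off than one that does not answer at all.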


Good luck!