Bug 1214703

Summary: NVME qemu commandline fails with PCI: slot 2 function 0 not available for pcie-root-port, in use by nvme,id=(null)
Product: [openSUSE] openSUSE Distribution
Reporter: Radovan Varga <radovan.varga>
Component: Virtualization:Tools
Assignee: E-mail List <kvm-bugs>
Status: NEW ---
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P3 - Medium
CC: claudio.fontana, jfehlig, radovan.varga
Version: Leap 15.5
Target Milestone: ---
Hardware: x86-64
OS: openSUSE Leap 15.5
Whiteboard:
Found By: Community User
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: debug logs from script with various libvirt versions, qemu/libvirt logs
Test VM deployed with addr=04.x for nvme drives

Description Radovan Varga 2023-08-28 14:48:21 UTC
Created attachment 869052 [details]
debug logs from script with various libvirt versions, qemu/libvirt logs

On openSUSE Leap 15.5, my script that deploys VMs with emulated nvme drives stopped working. On 15.4 it worked without any issues; now it fails with an error like this:

libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2023-08-28T14:26:07.997521Z qemu-system-x86_64: -device {"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}: PCI: slot 2 function 0 not available for pcie-root-port, in use by nvme,id=(null)


How to reproduce:

1. Create qcow2 image files for nvme disks:

qemu-img create -f qcow2 /var/lib/libvirt/images/nvme1.qcow2 1G
...

2. Change ownership of files to qemu:qemu

chown qemu:qemu /var/lib/libvirt/images/nvme1.qcow2
...

3. Start virt-install with the following settings:

virt-install --location \
  "http://download.opensuse.org/pub/opensuse/distribution/leap/15.0/repo/oss" \
  --name "testnvme" --memory 2048 --virt-type kvm \
  --connect qemu:///system \
  --disk size=10  --network default \
  '--qemu-commandline=-drive file=/var/lib/libvirt/images/nvme1.qcow2,format=qcow2,if=none,id=NVME1' \
  '--qemu-commandline=-drive file=/var/lib/libvirt/images/nvme2.qcow2,format=qcow2,if=none,id=NVME2' \
  '--qemu-commandline=-device nvme,drive=NVME1,serial=0001' \
  '--qemu-commandline=-device nvme,drive=NVME2,serial=0002' --debug

Attached is the debug log from attempts with libvirt 8.0 (downgraded on 15.5), which works, and with 9.0 and 9.6, which do not. The libvirt/qemu log files are also attached.
Comment 1 James Fehlig 2023-08-28 21:54:44 UTC
(In reply to Radovan Varga from comment #0)
> On OpenSUSE Leap 15.5 my script deploying VMs with nvme emulated drives
> stopped to work. Previously on 15.4 it worked without any issues, now it
> gives an error like this:
> 
> libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor:
> 2023-08-28T14:26:07.997521Z qemu-system-x86_64: -device
> {"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0",
> "multifunction":true,"addr":"0x2"}: PCI: slot 2 function 0 not available for
> pcie-root-port, in use by nvme,id=(null)

Using command line passthrough is unsupported, and there are no guarantees it will continue to work as is when updating libvirt, qemu, or both. See

https://libvirt.org/drvqemu.html#pass-through-of-arbitrary-qemu-commands

> virt-install --location \ 
>  
> "http://download.opensuse.org/pub/opensuse/distribution/leap/15.0/repo/oss" \
>   --name "testnvme" --memory 2048 --virt-type kvm \
>   --connect qemu:///system \
>   --disk size=10  --network default \
>   '--qemu-commandline=-drive
> file=/var/lib/libvirt/images/nvme1.qcow2,format=qcow2,if=none,id=NVME1' \
>   '--qemu-commandline=-drive
> file=/var/lib/libvirt/images/nvme2.qcow2,format=qcow2,if=none,id=NVME2' \
>   '--qemu-commandline=-device nvme,drive=NVME1,serial=0001' \
>   '--qemu-commandline=-device nvme,drive=NVME2,serial=0002' --debug

libvirt creates a pci-root controller with 31 slots. Those don't support later hotplugging of PCI devices, so libvirt also creates some pci-root-port controllers to support hotplugging. You'll need to find an "unused" slot and function on the pci-root controller for the nvme device. You can see which slots and functions libvirt uses by peeking at the "XML fetched from libvirt object" in /root/.cache/virt-manager/virt-install.log.
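A rough way to list the slot.function pairs already taken on the root bus is sketched below. It assumes the XML is in libvirt's own output format (single-quoted attributes in the order bus, slot, function, as printed by virsh dumpxml or in the "XML fetched from libvirt object" section of the log). The function name list_root_bus_functions is made up for illustration.

```shell
# Hypothetical helper: read libvirt domain XML on stdin and print every
# slot.function pair that is already assigned on PCI bus 0x00.
# Assumes libvirt-style single-quoted attributes, e.g.
#   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
list_root_bus_functions() {
  # Keep only PCI address elements that sit on bus 0x00 ...
  grep -o "<address type='pci'[^>]*bus='0x00'[^>]*" |
    # ... reduce each one to "slot.function" in hex ...
    sed -E "s/.*slot='0x([0-9a-f]+)'.*function='0x([0-9a-f]+)'.*/\1.\2/" |
    # ... and drop duplicates.
    sort -u
}
```

For a defined domain this could be run as `virsh dumpxml testnvme | list_root_bus_functions`. Any slot.function that is not listed is only a candidate, since libvirt may assign addresses differently after an update.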

I reproduced your issue on latest TW using only one nvme. In my case, I could see in /root/.cache/virt-manager/virt-install.log that the last pci-root-port controller was added to slot 3, function 5 of the pci-root controller. Adding the nvme device to slot 3, function 6 worked for me. E.g.

virt-install --location \
"http://download.opensuse.org/pub/opensuse/distribution/leap/15.0/repo/oss" \
--name "testnvme" --memory 2048 --virt-type kvm \
--connect qemu:///system \
--disk size=10  --network default \
'--qemu-commandline=-drive file=/vm_images/jim/images/test/disk.qcow2,format=qcow2,if=none,id=NVME1' \
'--qemu-commandline=-device nvme,drive=NVME1,serial=0001,addr=03.6'

Does specifying a free slot and function on the pci-root controller using the 'addr' option to the nvme device work for you?
Comment 2 James Fehlig 2023-08-28 22:24:03 UTC
FTR, some notes from a conversation on Slack:

Adding nvme emulation support to libvirt was discussed back in Nov 2020

https://listman.redhat.com/archives/libvir-list/2020-November/211354.html

It's a long thread, and in the end there was agreement on a design for specifying the nvme controller and namespaces. However, there were never any follow-up patches from Nutanix. There was another hacky attempt in May 2021

https://listman.redhat.com/archives/libvir-list/2021-May/219342.html

It was rejected in favor of the previously agreed design. But again, there was no follow-up work.
Comment 3 Radovan Varga 2023-08-29 08:17:22 UTC
Thank you, James, for the very fast response. Indeed, with the suggested notation I am able to continue, even with multiple nvme devices (the function value needs to be incremented for each one).

Is the slot number always 3, or may it change?

Also, I guess you mean "pcie-root-port" and not "pci-root-port".
Comment 4 Claudio Fontana 2023-08-29 08:37:40 UTC
(In reply to Radovan Varga from comment #3)
> Thank you James for a very fast response. Indeed with the suggested
> notation, I am able to continue even with multiple NVE devices (function
> value needs to increment). 
> 
> Is the slot number always 3 or it may eventually change? 

It may change.

As Jim mentioned before, relying on injection of qemu command lines gives you no guarantees for backward compatibility of your scripts.

The QEMU command line options can change in name, formatting etc, or they can disappear entirely. This is why it is so much better to use the features of virt-install / libvirt if at all possible, as the tools deal with these compatibility issues for you.

Is this script part of some product deployment step?

If specifically nvme emulation is necessary, it would be better to revive the discussion upstream for libvirt support, but is there a reason why emulating an nvme device in the guest is necessary, as opposed to using a virtio device like virtio-blk or virtio-scsi?


> 
> Also, I guess you mean "pcie-root-port" and not "pci-root-port".
Comment 6 Radovan Varga 2023-08-29 10:15:41 UTC
With the suggested command I managed to deploy the VM, but I do not see any nvme disks in it. Due to lack of time I cannot investigate further at the moment, but you may want to verify whether the command that convinces virt-install to continue actually ends up deploying a VM with an nvme disk.
Comment 7 Radovan Varga 2023-08-29 10:20:23 UTC
Created attachment 869082 [details]
Test VM deployed with addr=04.x for nvme drives

The deployment went through, but the VM doesn't show any nvme devices.
Comment 8 Radovan Varga 2023-08-30 07:16:29 UTC
I did a few tests today and when using addr=03.x, I see the nvme devices. With addr=04.1 and above I do not see them. What am I missing?

The goal is to be able to have more than two nvme devices, which is currently the limit with slot 03, as I can only use functions 6 and 7.
Comment 9 James Fehlig 2023-09-11 22:46:01 UTC
(In reply to Radovan Varga from comment #8)
> I did a few tests today and when using addr=03.x, I see the nvme devices.
> With addr=04.1 and above I do not see them. What am I missing?

AFAIK, you can only plug single-function devices into qemu's root bus (pcie.0). You would need to use addr=04.0, addr=05.0, addr=06.0, etc.

Alternatively, you can add another multi-function pci-root-port to slot 4 of the root bus, then add your nvme devices to addr=04.1, addr=04.2, addr=04.3, etc. See the qemu and libvirt PCI docs for more info:

https://github.com/qemu/qemu/blob/master/docs/pcie.txt
https://libvirt.org/pci-hotplug.html
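
The first option above (one nvme device per root-bus slot, always at function 0) could be scripted roughly as follows. This is only a sketch: nvme_args is a hypothetical helper, the starting slot must be one that libvirt has not already used in your domain XML, and the serial numbers simply follow the 0001, 0002 pattern from comment #0.

```shell
# Hypothetical helper: print one -drive/-device pair of
# --qemu-commandline arguments per image, one argument per line.
# Each nvme device gets its own root-bus slot at function 0.
#   $1    first slot to use (decimal); assumed to be free, verify it yourself
#   $2... paths to qcow2 images
nvme_args() {
  nvme_slot=$1
  shift
  nvme_n=1
  for nvme_img in "$@"; do
    # A PCI bus has slots 0x00 to 0x1f, so stop once they run out.
    if [ "$nvme_slot" -gt 31 ]; then
      echo "nvme_args: no root-bus slot left for $nvme_img" >&2
      return 1
    fi
    printf '%s=-drive file=%s,format=qcow2,if=none,id=NVME%d\n' \
      '--qemu-commandline' "$nvme_img" "$nvme_n"
    printf '%s=-device nvme,drive=NVME%d,serial=%04d,addr=%02x.0\n' \
      '--qemu-commandline' "$nvme_n" "$nvme_n" "$nvme_slot"
    nvme_slot=$((nvme_slot + 1))
    nvme_n=$((nvme_n + 1))
  done
}

# Possible usage in bash (paths must not contain newlines):
#   mapfile -t extra < <(nvme_args 4 /var/lib/libvirt/images/nvme1.qcow2 \
#                                    /var/lib/libvirt/images/nvme2.qcow2)
#   virt-install ... "${extra[@]}"
```

As noted earlier in this bug, the free slots can differ between libvirt versions, so the starting slot is best taken from the actual domain XML rather than hard-coded.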