Bugzilla – Bug 1219590
NVIDIA 545.29.06 driver Vulkan ICD loader regression
Last modified: 2024-03-04 13:50:54 UTC
Created attachment 872479
Some basic PC hardware info

The NVIDIA 545.29.06 driver from the NVIDIA repo (the installation method recommended for openSUSE) makes it impossible to use the discrete GPU in Wine OpenGL apps. Every attempt ends with an error. Possibly only Optimus laptops are affected.
-------------------
Steps to reproduce:
1) Use any Wine launcher that can select the discrete GPU or pass environment variables to the Wine app (Lutris, Bottles);
2) Enable the discrete GPU (PRIME Render Offload if possible) in the launcher GUI, or pass the PRIME Render Offload variables from the documentation (most Wine launchers store these variables, possibly in different combinations): "__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only";
3) Try to launch a Wine app that does heavy 3D rendering (OpenGL, for example).
-------------------
Result:
Wine crashes with an error similar to this:
X Error of failed request: BadMatch (invalid parameter attributes)
Major opcode of failed request: 156 (NV-GLX)
Minor opcode of failed request: 43 ()
Serial number of failed request: 414
Current serial number in output stream: 415
-------------------
Expected result:
The Wine app works as intended. 3D graphics are rendered without errors that cause a crash or prevent the app from launching.
-------------------
Workaround:
Disable the discrete GPU in the Wine launcher, or don't pass "__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only" to Wine. Instead, pass this variable to Wine: "VK_ICD_FILENAMES=/etc/vulkan/icd.d/nvidia_icd.json".
-------------------
The workaround isn't ideal. In an X11 Plasma session, apps launch and run fine for a short time; then the whole PC freezes, and the only way to shut the system down is to cut the power. The workaround works fine in a Wayland Plasma session.
-------------------
Native Linux apps seem unaffected by this bug.
-------------------
Links:
https://github.com/bottlesdevs/Bottles/issues/3078
https://forums.developer.nvidia.com/t/545-driver-doesnt-work-on-optimus/280333
Mostly experiments to find the source of the problem.
-------------------
Fuss/personal opinion:
I've tried different distributions, all with kernels no older than ~6.6.7, so all of them are fairly up to date. The problem persists across all of them, and each has different workarounds or none at all. I tried Flatpak too, but that's another story, because Flatpak apps need an additional NVIDIA runtime to work.
What I know so far, as a basis for any assumptions: Linux kernel maintainers have closed off NVIDIA's access to some kernel symbols intended for open source drivers. According to one of the most active members of the official NVIDIA forum, the developers there don't know the source of the problem and don't answer topics about this bug. I've also tried different driver versions and installation methods (.run), and the problem persists. I tried Leap too, with its older kernel (you should know better what you backport there); no success, so Leap is affected as well.
So in my opinion the problem mainly lies in the driver itself and in Vulkan ICD loader detection.
-------------------
So what information I expect to find here:
1) Is it possible to debug the driver and find the source of the problem?
2) Should it be discussed here, or is there no point since there is an official forum and the driver is proprietary? (I still think people should know where to find information about this bug, because I've been struggling with it for 2 months and have found almost no information.)
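For anyone reproducing this outside a launcher GUI, the two configurations described in the report (offload variables set, versus the VK_ICD_FILENAMES workaround) could be built roughly as follows. This is only a sketch: the variable names and the ICD path are taken from the report, while the helper names `offload_env` and `workaround_env` are hypothetical and not part of any launcher.

```python
import os

# Variables quoted in the report.
PRIME_OFFLOAD_VARS = {
    "__NV_PRIME_RENDER_OFFLOAD": "1",
    "__GLX_VENDOR_LIBRARY_NAME": "nvidia",
    "__VK_LAYER_NV_optimus": "NVIDIA_only",
}
NVIDIA_ICD = "/etc/vulkan/icd.d/nvidia_icd.json"


def offload_env(base=None):
    """Hypothetical helper: environment for the failing case (offload variables set)."""
    env = dict(os.environ if base is None else base)
    env.update(PRIME_OFFLOAD_VARS)
    return env


def workaround_env(base=None):
    """Hypothetical helper: environment for the workaround.

    Removes the offload variables and pins the NVIDIA ICD manifest instead.
    """
    env = dict(os.environ if base is None else base)
    for name in PRIME_OFFLOAD_VARS:
        env.pop(name, None)
    env["VK_ICD_FILENAMES"] = NVIDIA_ICD
    return env
```

Either dictionary can then be passed as the `env` argument of `subprocess.run` when starting Wine by hand, for example `subprocess.run(["wine", "app.exe"], env=workaround_env())`, where `app.exe` is a placeholder.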
Not sure what's wrong here ...

# rpm -qpl nvidia-gl-G06-545.29.06-19.3.x86_64.rpm | grep vulkan
[...]
/etc/vulkan
/etc/vulkan/icd.d
/etc/vulkan/icd.d/nvidia_icd.json
/etc/vulkan/implicit_layer.d
/etc/vulkan/implicit_layer.d/nvidia_layers.json

# cat NVIDIA-Linux-x86_64-545.29.06/README.txt
[...]
The Vulkan ICD configuration file is installed as '/etc/vulkan/icd.d/nvidia_icd.json'. An additional Vulkan layer configuration file is installed as '/etc/vulkan/implicit_layer.d/nvidia_layers.json'. These layers add functionality to the Vulkan loader.
[...]

Don't know why $VK_ICD_FILENAMES needs to be set ...
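To double-check that the manifest listed above is present and points at a driver library, something like the following could be run on an affected machine. It is a minimal sketch that assumes the usual Vulkan ICD manifest layout (an "ICD" object with "library_path" and "api_version" keys); the function name `read_icd_manifest` is hypothetical.

```python
import json
from pathlib import Path


def read_icd_manifest(path="/etc/vulkan/icd.d/nvidia_icd.json"):
    """Hypothetical helper: return (library_path, api_version) from an ICD manifest.

    Returns None if the manifest file does not exist.
    """
    manifest = Path(path)
    if not manifest.is_file():
        return None
    data = json.loads(manifest.read_text())
    icd = data.get("ICD", {})
    return icd.get("library_path"), icd.get("api_version")
```

A result of None would mean the package did not install the manifest at that path. A tuple with a library path shows only that the file is well-formed, not that the loader actually picks it up.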
Maybe this is somewhat related to Wine. Adding our Wine expert Marcus Meissner. But things don't run stable anyway. :-(
I did a few tests on openSUSE TW and Fedora 39, using KDE Plasma with the latest updates on each distribution. The results are mostly the same. The X11 session requires VK_ICD_FILENAMES to be set, and __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only must not be used. The Wayland session works fine in both configurations (with VK_ICD_FILENAMES or with the offload variables), meaning it works as intended. So the bug can be narrowed down to X11 sessions only and, from what I saw, to Optimus laptops only. X11 has more or less stopped freezing (I don't know why; the freezes were random).
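The per-session behaviour reported in this comment can be summarised as a small lookup keyed on the session type. This is a sketch of the observations only, not a recommended fix: `launch_plan` is a hypothetical name, XDG_SESSION_TYPE is assumed to be set by the login session, and treating any non-X11 value like Wayland is an assumption.

```python
import os

OFFLOAD_VARS = {
    "__NV_PRIME_RENDER_OFFLOAD": "1",
    "__GLX_VENDOR_LIBRARY_NAME": "nvidia",
    "__VK_LAYER_NV_optimus": "NVIDIA_only",
}
NVIDIA_ICD = "/etc/vulkan/icd.d/nvidia_icd.json"


def launch_plan(session_type):
    """Hypothetical helper: return (vars_to_set, vars_to_unset) for a session type.

    X11: set VK_ICD_FILENAMES and drop the offload variables, as observed above.
    Anything else is treated like Wayland (assumption), where offload worked.
    """
    if session_type == "x11":
        return {"VK_ICD_FILENAMES": NVIDIA_ICD}, tuple(OFFLOAD_VARS)
    return dict(OFFLOAD_VARS), ()
```

It would typically be called as `launch_plan(os.environ.get("XDG_SESSION_TYPE", ""))`.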
Meanwhile 550.54.14 is available. Maybe with that version things are changing.
For me the bug persists in 550.
Thanks for checking! I'm sorry to hear it doesn't help.
(No idea on the Wine side. I'm not familiar with the new Vulkan stuff ... I did some greps, but no hits.)
On second thought, I did a clean installation of openSUSE TW and checked again in X11 with NVIDIA 550.54.14.
---
Test case 1:
Environment: Steam as rpm
Game: https://www.protondb.com/app/39210
Wine: GE-Proton8-32 (which by this time might have some features backported from Wine 9+; I explain below why I mention this)
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only gamemoderun %command%
Result: Success. The NVIDIA dGPU was utilized as it should be.
---
Test case 2:
Environment: Bottles as rpm
Game: https://eu.shop.battle.net/ battle.net client (should be rendered with the dGPU as well), installed through the internal Bottles installation script
Wine: wine-ge-proton8-26
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only
Result: Failure. The Intel iGPU was used.
Then I guessed that the problem might be in Wine itself. For the second test case I tried other Wine runners. The last one, which was successful, was kron4ek-wine-proton-exp-9.0-amd64.
Result 2: Success. NVIDIA utilized.
---
So my guess is that the regression was in Wine 8 and lower (for some reason), and that it has hopefully been fixed in 9.0+ and in some custom layers with backported patches. Since it has to do with the compatibility layer, I believe it's out of scope for openSUSE. Feel free to share thoughts and/or close the report if you see fit.
Ok. So let's close as fixed.