Bugzilla – Bug 1212345
Sway fails to start after Mesa update to 23.1.x
Last modified: 2024-01-05 18:00:13 UTC
After yesterday's updates to Tumbleweed snapshot 20230612, Sway now fails to start with the following error:

```
[wlr] [EGL] command: eglCreateContext, error: EGL_BAD_ALLOC (0x3003), message: "dri2_create_context"
[wlr] [render/egl.c:409] Failed to create EGL context
[wlr] [render/egl.c:554] Failed to initialize EGL context
[wlr] [render/gles2/renderer.c:679] Could not initialize EGL
[wlr] [render/wlr_renderer.c:333] Could not initialize renderer
[sway/server.c:79] Failed to create renderer
```

One possible culprit from the updated packages is `Mesa-dri`.
It looks like it might have been caused by Mesa updates 23.1.x (https://build.opensuse.org/request/show/1092024)
Adding Joan. He may know more about wlroots...
So this is an Intel GPU.
Thanks, let me restate here what I wrote on Slack. This problem was introduced in snapshot 20230612 and did not exist in 20230610. I am going to attach the list of packages that were updated on my machine to narrow it down. I first observed the issue on my dev laptop (Intel + nouveau NVIDIA), but I am able to reproduce it in a VM with only the Intel card passed through for OpenGL.
Created attachment 867562
Package changes on my machine between snapshots 20230610 and 20230612
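A list like the attached one can be produced by comparing two package dumps taken before and after the update. The following is only a sketch, assuming each snapshot's `rpm -qa` output was saved to a file beforehand; the function name and the file names are hypothetical.

```shell
# changed_packages OLD NEW
# Print the entries that appear in only one of two package lists
# (one package per line, e.g. as written by `rpm -qa`).
# Lines prefixed with "-" exist only in OLD, "+" only in NEW.
changed_packages() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || { rm -f "$old_sorted"; return 1; }
    # comm needs both inputs sorted the same way.
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    comm -23 "$old_sorted" "$new_sorted" | sed 's/^/-/'
    comm -13 "$old_sorted" "$new_sorted" | sed 's/^/+/'
    rm -f "$old_sorted" "$new_sorted"
}

# Example (not run here):
#   rpm -qa > before.txt      # while booted into snapshot 20230610
#   rpm -qa > after.txt       # while booted into snapshot 20230612
#   changed_packages before.txt after.txt | grep -i mesa
```

Filtering the result for `mesa` is just one way to narrow it down; any suspected package name works the same way.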
Someone on IRC has mentioned that this has also been seen on an AMD Radeon RX 5700.
Today, I ran sway on an up-to-date TW with software rendering. That worked.

You may want to boot the kernel with the nomodeset parameter. That will disable any hardware graphics rendering.
Good idea, Thomas!
I can confirm that adding `nomodeset` to the kernel command line allows sway to start. I tried this on the VM with Intel mentioned above. A side effect is that the output display is now detected as 'Unknown-1' with the only supported resolution being 640x480.
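When testing this workaround, it can help to verify that the running kernel really was booted with `nomodeset`. A minimal sketch, assuming a Linux system where the kernel command line is readable from `/proc/cmdline`; the function name is hypothetical.

```shell
# has_nomodeset [CMDLINE_FILE]
# Succeed if `nomodeset` appears as a separate word on the kernel
# command line. Reads /proc/cmdline unless another file is given.
has_nomodeset() {
    cmdline_file=${1:-/proc/cmdline}
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'nomodeset'
}

# Example (not run here):
#   has_nomodeset && echo "booted with nomodeset, expect software rendering"
```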
(In reply to Thomas Zimmermann from comment #7)
> Today, I ran sway on an up-to-date TW with software rendering. That worked.
>
> You may want to boot the kernel with the nomodeset parameter. That will
> disable any hardware graphics rendering.

That makes sway use the pixman renderer.

The issue still is why it fails when using the EGL renderer.

The failure happens here:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/23.1/src/egl/drivers/dri2/egl_dri2.c#L1404

Might be a problem with glibc, the C compiler, or a wrong use of sizeof?
Can confirm I'm also experiencing this issue, except I'm using Hyprland and an AMD Radeon 680m iGPU on a Ryzen 7 Pro 6850U. I fixed it by rolling back to a snapshot of 20230610. I'm using the Packman versions of Mesa, but I experience the same issues when using the openSUSE versions.
I can confirm as well. I rolled back to snapshot "20230610". I use Hyprland, and when I distro-upgraded today, I noticed that it pulled in Mesa 23.1. Not sure if it's Mesa *entirely*, but I experimented a bit. The best way is to roll back for now, because I am experiencing graphical issues when locking the Mesa package to a version before 23.1. As for my setup, I am using a laptop with an NVIDIA 3060 Mobile GPU and an Intel i5-10300H. Mesa is from openSUSE, not from Packman.
This is an issue with OpenSUSE Mesa, wlroots-based renders start properly when Mesa is built locally and not installed from Factory.

You can try by building Mesa locally, then running `meson devenv` and running sway from that env.

I don't know what's wrong with our build but building Mesa-dri and Mesa together resolves the issue.

I had similar issues a few months back when trying out a Mesa repo on OBS that built from the master branch and reported it*, but didn't realize it would be an issue with how opensuse builds mesa.

* https://gitlab.freedesktop.org/mesa/mesa/-/issues/8394
I believe we should change the title of this bug report then since we confirmed it's Mesa?
> That makes sway use the pixman renderer.

Another workaround is to set the env var WLR_RENDERER=pixman before starting sway.
You should not use the pixman renderer, that's software rendering. The vulkan backend should still work just fine, just set WLR_BACKEND=vulkan for now
(In reply to llyyr from comment #16)
> You should not use the pixman renderer, that's software rendering. The
> vulkan backend should still work just fine, just set WLR_BACKEND=vulkan for
> now

sorry, WLR_RENDERER=vulkan...
(In reply to llyyr from comment #17)
> (In reply to llyyr from comment #16)
> > You should not use the pixman renderer, that's software rendering. The
> > vulkan backend should still work just fine, just set WLR_BACKEND=vulkan for
> > now
>
> sorry, WLR_RENDERER=vulkan...

That doesn't work for me, I get the same error as listed here: https://github.com/NixOS/nixpkgs/issues/229108

I'm also using amdgpu.
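For anyone trying the renderer workarounds discussed above, here is a small wrapper sketch. It assumes the renderer names accepted by wlroots' `WLR_RENDERER` variable are `gles2`, `pixman` and `vulkan`; the function name is hypothetical, and as reported above the vulkan renderer may not work on every setup.

```shell
# run_with_renderer RENDERER COMMAND [ARGS...]
# Start a wlroots compositor with WLR_RENDERER forced to RENDERER.
# Only the environment of the started command is changed.
run_with_renderer() {
    renderer=$1
    shift
    case "$renderer" in
        gles2|pixman|vulkan) ;;
        *)
            echo "unknown renderer: $renderer" >&2
            return 2
            ;;
    esac
    env WLR_RENDERER="$renderer" "$@"
}

# Example (not run here):
#   run_with_renderer vulkan sway || run_with_renderer pixman sway
```

Falling back from vulkan to pixman keeps hardware rendering where it works and only drops to software rendering as a last resort.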
For now, the solution is just to add package locks to all Mesa-related drivers. For me, on an AMD system, this was:

zypper addlock libvulkan_radeon-32bit Mesa Mesa-32bit Mesa-dri Mesa-dri-32bit Mesa-gallium Mesa-gallium-32bit Mesa-KHR-devel Mesa-libEGL1 Mesa-libEGL-devel Mesa-libGL1 Mesa-libGL1-32bit Mesa-libGL-devel Mesa-vulkan-device-select-32bit

Obviously, if you are on an NVIDIA or Intel system, the package list will differ. Otherwise, continue updating your system as normal. Hopefully this gets fixed soon.
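Since the exact package set differs per system, the lock list could also be derived from what is installed. This is a rough sketch only: the name pattern is a guess based on the package names listed above, not an authoritative list, and the function name is hypothetical.

```shell
# mesa_lock_candidates
# Read package names on stdin (one per line) and print those that look
# like Mesa packages or Mesa-built Vulkan drivers.
# The pattern is an assumption; review the output before locking anything.
mesa_lock_candidates() {
    grep -E '^(Mesa($|-)|libvulkan_)' | LC_ALL=C sort -u
}

# Example (not run here; needs root):
#   rpm -qa --qf '%{NAME}\n' | mesa_lock_candidates | xargs -r zypper addlock
# The locks can be dropped again later with `zypper removelock`.
```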
(In reply to Joan Torres from comment #10)
> The issue still is why it fails when using the EGL renderer.
>
> The failure happens here:
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/23.1/src/egl/drivers/dri2/
> egl_dri2.c#L1404
>
> Might be a problem with glibc, the C compiler or a wrong use of sizeof ?

This is really odd, using ltrace it seems malloc is called with no argument?

...
libEGL.so.1->malloc() = <void>
libEGL_mesa.so.0->malloc() = <void>
libgallium_dri.so->malloc() = <void>
libgallium_dri.so->malloc() = <void>
...
libgallium_dri.so->malloc() = <void>
libgallium_dri.so->malloc() = <void>
libgallium_dri.so->malloc() = <void>
00:00:00.854 [ERROR] [wlr] [EGL] command: eglCreateContext, error: EGL_BAD_ALLOC (0x3003), message: "dri2_create_context"
00:00:00.854 [ERROR] [wlr] [render/egl.c:409] Failed to create EGL context
00:00:00.854 [ERROR] [wlr] [render/egl.c:554] Failed to initialize EGL context
libEGL.so.1->malloc() = <void>
libgallium_dri.so->malloc() = <void>
libgallium_dri.so->malloc() = <void>
00:00:00.858 [ERROR] [wlr] [render/gles2/renderer.c:679] Could not initialize EGL
00:00:00.858 [DEBUG] [wlr] [render/wlr_renderer.c:271] Failed to create a GLES2 renderer. Skipping!
00:00:00.858 [ERROR] [wlr] [render/wlr_renderer.c:333] Could not initialize renderer
00:00:00.858 [ERROR] [sway/server.c:79] Failed to create renderer
(In reply to llyyr from comment #13)
> This is an issue with OpenSUSE Mesa, wlroots-based renders start properly
> when Mesa is built locally and not installed from Factory.

You mean when you're running

osc build openSUSE_Tumbleweed x86_64
osc build -M drivers openSUSE_Tumbleweed x86_64

on your test machine and install the generated packages it just works for you? Weird ...

> You can try by building Mesa locally, then running `meson devenv` and
> running sway from that env.

Not sure what 'meson devenv' does ...

> I don't know what's wrong with our build but building Mesa-dri and Mesa
> together resolves the issue.

See above.
*** Bug 1212433 has been marked as a duplicate of this bug. ***
The problem is with this new change:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/mesa/main/context.c#L1006

Packages from Mesa-dri are built with:

-Dgles1=disabled -Dgles2=disabled

I'm already changing the build args to fix it.
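To check whether a given spec file or build log still carries these options, something like the following could be used. It is a sketch under the assumption that the options are spelled exactly as quoted above; the function name and the `Mesa.spec` file name in the example are hypothetical.

```shell
# disabled_gles_flags [FILE...]
# Print every meson option of the form -Dgles1=disabled or
# -Dgles2=disabled found in the given files (or on stdin),
# one per line, without duplicates.
disabled_gles_flags() {
    grep -ohE -- '-Dgles[12]=disabled' "$@" | LC_ALL=C sort -u
}

# Example (not run here):
#   disabled_gles_flags Mesa.spec
```

No output means neither option was found in the input.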
Just a driver-by comment: Starting weston currently aborts with the error that the EGL_ANDROID_native_fence_sync extension is missing. For now, I assume that it is caused by the same problem.
s/driver-by/drive-by/
(In reply to Joan Torres from comment #23)
> The problem is with this new change:
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/mesa/main/context.
> c#L1006
>
> Packages from Mesa-dri are built with:
>
> -Dgles1=disabled -Dgles2=disabled
>
> I'm already changing the build args to fix it.

Thanks, Joan. With re-enabling this I guess we need to remove some libs after build of -drivers in specfile.
Please, can someone test the fix?

zypper addrepo https://download.opensuse.org/repositories/home:jtorres:branches:X11:XOrg/openSUSE_Tumbleweed/home:jtorres:branches:X11:XOrg.repo
zypper refresh
zypper install -f -r home_jtorres_branches_X11_XOrg Mesa-dri
Thanks Joan, this works for me on my VM. I don't see anything dodgy in journalctl or dmesg either.
Hi Joan

(In reply to Joan Torres from comment #27)
> Please, can someone test the fix?
>
>
> zypper addrepo
> https://download.opensuse.org/repositories/home:jtorres:branches:X11:XOrg/
> openSUSE_Tumbleweed/home:jtorres:branches:X11:XOrg.repo
> zypper refresh
> zypper install -f -r home_jtorres_branches_X11_XOrg Mesa-dri

I can confirm that this resolves the problem with weston. Thanks a lot!
Thank you. Sent a SR: https://build.opensuse.org/request/show/1093479. Closing this as FIXED.
(In reply to Joan Torres from comment #27)
> Please, can someone test the fix?
>
>
> zypper addrepo
> https://download.opensuse.org/repositories/home:jtorres:branches:X11:XOrg/
> openSUSE_Tumbleweed/home:jtorres:branches:X11:XOrg.repo
> zypper refresh
> zypper install -f -r home_jtorres_branches_X11_XOrg Mesa-dri

I confirm this fixes the problem seen on aarch64, originally reported as bug#1212433.
This is an autogenerated message for OBS integration: This bug (1212345) was mentioned in https://build.opensuse.org/request/show/1093496 Factory / Mesa
Will this fix automatically propagate to the Packman set of Mesa drivers? If not, how can we go about getting it fixed in Packman?
(In reply to Ed Jackson from comment #33)
> Will this fix automatically propagate to the Packman set of Mesa drivers? If
> not, how can we go about getting it fixed in Packman?

Honestly I don't know anything about the build of the Mesa Packman package ...
(In reply to Stefan Dirsch from comment #34)
> (In reply to Ed Jackson from comment #33)
> > Will this fix automatically propagate to the Packman set of Mesa drivers? If
> > not, how can we go about getting it fixed in Packman?
>
> Honestly I don't know anything about the build of the Mesa Packman package
> ...

The Packman Mesa package is here:
https://pmbs.links2linux.org/package/show/Essentials/A_tw-Mesa

And it seems there is no linkdiff compared to Factory Mesa, so it should propagate just fine.
*** Bug 1212478 has been marked as a duplicate of this bug. ***
*** Bug 1212481 has been marked as a duplicate of this bug. ***
*** Bug 1212324 has been marked as a duplicate of this bug. ***
I started an openQA test which should catch similar issues: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17285