Bugzilla – Bug 1226220
X session ends abruptly
Last modified: 2024-07-08 11:16:16 UTC
My X session has crashed abruptly three days in a row (Monday, Tuesday, Wednesday). During the 8 to 10 hours I work each day, it has happened once per day. I have the latest Nvidia drivers provided by the Tumbleweed repository. I use X11, not Wayland, in case that gives any further information. Tumbleweed is fairly up to date.

There seems to be a problem with the latest drivers, according to this source:
https://www.gamingonlinux.com/2024/06/you-may-want-to-avoid-nvidia-driver-550-if-youre-on-a-laptop/

Hardware: Lenovo ThinkPad P15 Gen 2i

I'm attaching a supportconfig, as I know the rest of this report doesn't give much information. The supportconfig is within the Engineering internal network, shared in my Export:
https://w3.suse.de/~rosuna/supportconfig/

Let me know if someone else needs to access it in a more public place. Happy to report it against Nvidia if you consider that appropriate (and if you tell me how).
If you're looking for timestamps in the logs, the last crash was not long before the supportconfig was taken.
Could also be related to latest 6.9 kernel.
grep nvidia messages.txt |grep -i error|cut -d " " -f 3-30
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 1
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 2
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 1
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 2
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 1
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 2
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to apply atomic modeset. Error code: -22
kernel: [drm:nv_drm_revoke_sub_ownership [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to revoke sub-ownership from NVKMS
kernel: [drm:nv_drm_master_drop [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] nv_drm_atomic_helper_disable_all failed with error code -22 !
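Since most of these lines repeat, a condensed view may be easier to read. A minimal sketch, assuming the same messages.txt from the supportconfig; the function name summarize_nvidia_errors is hypothetical and only used for illustration. It reuses the grep/cut filter above and counts each distinct message:

```shell
# Hypothetical helper: list each distinct nvidia error message once,
# prefixed by how often it occurs, most frequent first.
# $1 is the path to a log file such as messages.txt from the supportconfig.
summarize_nvidia_errors() {
    grep nvidia "$1" \
        | grep -i error \
        | cut -d " " -f 3-30 \
        | sort \
        | uniq -c \
        | sort -rn
}
```

It would be called as `summarize_nvidia_errors messages.txt`. Note that the ordering of events is lost in this view, so the full output is still needed to see which error came last.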
If this isn't a regression of the driver, I suggest to try with an older kernel < 6.9.

https://download.opensuse.org/history/
https://download.opensuse.org/history/20240523/tumbleweed/repo/oss/x86_64/

We need to figure out if it's a driver or kernel regression. Driver 550.78 is still available.
(In reply to Stefan Dirsch from comment #4)
> If this isn't a regression of the driver, I suggest to try with an older
> kernel < 6.9.
>
> https://download.opensuse.org/history/
> https://download.opensuse.org/history/20240523/tumbleweed/repo/oss/x86_64/
>
> We need to figure out if it's a driver or kernel regression. Driver 550.78
> is still available.

It's my workstation, and I don't have an easy way to go back and forth testing, especially since I have no clue how to trigger the crash. (BTW, in case it was not clear what I meant by "crash": the X session dies and after a couple of seconds I'm at the initial login screen of the window manager.)

Today it has not crashed so far, and I'm still on the same kernel and driver version:

raul@mordor:~$ uname -r
6.9.3-1-default
raul@mordor:~$ rpm -qa|grep -i nvidia-drivers
nvidia-drivers-G06-550.90.07-23.1.x86_64

If I go to an old driver, or to an old kernel, how long do I need to stay there to consider it "not crashing"?
Removing needinfo till I really know what/how/when to test.
Thanks. Understood. Probably you would need to test a few days without crashes with the old kernel/driver to verify that it's a regression.
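To keep track of crash-free days during such a test, the log could be reduced to a per-day count. This is only a sketch under two assumptions that this report does not confirm: that the nv_drm_master_drop error (the last entry in the log excerpt above) is a usable marker for the session ending, and that each log line starts with an ISO 8601 timestamp (YYYY-MM-DD...). The function name crashes_per_day is hypothetical.

```shell
# Hypothetical helper: count nv_drm_master_drop errors per calendar day.
# Assumes each log line begins with an ISO 8601 timestamp, so the first
# 10 characters are the date (YYYY-MM-DD).
# $1 is the path to a log file such as messages.txt from the supportconfig.
crashes_per_day() {
    grep 'nv_drm_master_drop' "$1" \
        | cut -c 1-10 \
        | sort \
        | uniq -c
}
```

Called as `crashes_per_day messages.txt`, it prints one line per day that had at least one such error; days without output would count as crash-free under the assumptions above.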
The system has not crashed since I opened the bug. It did fail to shut down properly once, though; I'm not sure whether that is related. Anyway, there's an update from 6.9.3-1 to 6.9.4-1, which I'm applying right now. Will report back if anything changes (otherwise, feel free to close the bug after a reasonable time with "worksforme" or something similar).
Thanks for the update!
Ok. Let's assume for now that things have improved. Closing.