Bug 1204143

Summary: Nvidia G06 & G05 drivers fail to build against kernel 6.0.0
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Bruno Pitrus <brunopitrus>
Component: X11 3rd Party Driver
Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED
QA Contact: Stefan Dirsch <sndirsch>
Severity: Critical
Priority: P1 - Urgent
CC: ahjolinna, bjoernv, c.j, David.e.holmberg, devguy.ca, epistemepromeneur, hpj, kairo, krinpaus, marcus.gama, Mathias.Homann, oberkut, shawn.peterson, simon.vogl, t.rother, werwolf131313
Version: Current
Target Milestone: ---
Hardware: x86-64
OS: openSUSE Tumbleweed
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Bruno Pitrus 2022-10-07 15:58:33 UTC
After upgrade to kernel-default 6.0.0-1.2, the nvidia-gfxG06-kmp-default kernel modules version 515.65.01_k5.18.15_1 fail to rebuild:

  CC [M]  /usr/src/kernel-modules/nvidia-515.65.01-default/nvidia-drm/nvidia-drm-helper.o
/usr/src/kernel-modules/nvidia-515.65.01-default/nvidia-drm/nvidia-drm-helper.c: In function ‘__nv_drm_framebuffer_put’:
/usr/src/kernel-modules/nvidia-515.65.01-default/nvidia-drm/nvidia-drm-helper.c:47:5: error: implicit declaration of function ‘drm_framebuffer_put’ [-Werror=implicit-function-declaration]
   47 |     drm_framebuffer_put(fb);
      |     ^~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
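
Editorial note: an "implicit declaration" error like the one above usually means the function's prototype is no longer visible from the headers the file includes. For anyone triaging similar logs, the following is a minimal sketch, not part of the original report, that pulls such diagnostics out of a build log. The names `ImplicitDecl` and `find_implicit_declarations` are hypothetical, and the pattern assumes GCC's English-language message format as quoted above.

```python
import re
from typing import NamedTuple


class ImplicitDecl(NamedTuple):
    """One 'implicit declaration of function' diagnostic (hypothetical helper type)."""
    path: str
    line: int
    function: str


# Assumes the English GCC wording shown in the log above; the function name
# may be wrapped in typographic or ASCII quotes depending on the locale.
_PATTERN = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):\d+: error: "
    r"implicit declaration of function [‘'`](?P<func>\w+)[’']"
)


def find_implicit_declarations(log: str) -> list[ImplicitDecl]:
    """Return every implicit-declaration error found in a build log."""
    hits = []
    for raw in log.splitlines():
        match = _PATTERN.match(raw.strip())
        if match:
            hits.append(
                ImplicitDecl(match["path"], int(match["line"]), match["func"])
            )
    return hits


# The two relevant lines from the description, copied verbatim.
BUILD_LOG = """\
/usr/src/kernel-modules/nvidia-515.65.01-default/nvidia-drm/nvidia-drm-helper.c: In function ‘__nv_drm_framebuffer_put’:
/usr/src/kernel-modules/nvidia-515.65.01-default/nvidia-drm/nvidia-drm-helper.c:47:5: error: implicit declaration of function ‘drm_framebuffer_put’ [-Werror=implicit-function-declaration]
"""

for hit in find_implicit_declarations(BUILD_LOG):
    print(f"{hit.function} is undeclared at {hit.path}:{hit.line}")
```

The "In function" context line is skipped on purpose, because it carries no line number and no error text.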
Comment 1 Simon Vogl 2022-10-07 17:42:04 UTC
Can confirm this on my system.
I guess updating NVIDIA drivers to 515.76 would fix the issue, but that will also cause numerous issues on RTX 3000 cards that are connected via HDMI...
This seems to be the necessary commit in nvidia-open, maybe it can somehow be backported to the proprietary driver?
https://github.com/NVIDIA/open-gpu-kernel-modules/commit/fe0728787f80fa3f5f3d5d2a3e36a7c83ea6b178#diff-af613fd5f7937ef7a996b771ee0fb193a290b7f8795e706173b701bd9a791402
Comment 2 Andreas Stieger 2022-10-07 21:48:05 UTC
*** Bug 1204150 has been marked as a duplicate of this bug. ***
Comment 3 Alexander Ahjolinna 2022-10-08 02:32:29 UTC
Yeah, it was a nice surprise when I rebooted my system after the 6.0 kernel update and the system didn't work (sigh). It would be nice if there were some kind of check so that SUSE wouldn't release a major kernel update if the NVIDIA drivers won't work with it (yes, I know it's a pain, but what can you do... it's NVIDIA).

Well, at least I could build 515.76 in my own OBS repo and update my system to it... but I don't think this should be the expected behaviour.
Comment 4 C J 2022-10-08 09:56:24 UTC
Same problem with G05 drivers.

Logging into GNOME with GDM throws me back to the login screen (rebooting into 5.19.13 fixes it).

How can they push a kernel to the live repositories without even testing that the basic display drivers work for the basic gfx card lines?

Thanks for looking into fixing this.
Comment 5 Bruno Pitrus 2022-10-08 10:03:04 UTC
c.j@tuta.io the NVIDIA drivers are not part of the official repositories, not tested, and they are distributed by NVIDIA, not SUSE.
Comment 6 Episteme PROMENEUR 2022-10-08 10:28:21 UTC
(In reply to dziobian from comment #5)
> c.j@tuta.io the NVIDIA drivers are not part of the official repositories,
> not tested, and they are distributed by NVIDIA, not SUSE.

Yes, but it is SUSE's job to deliver a good OS. SUSE cannot ignore that there are users who use the NVIDIA driver.

It is possible to check the version of the NVIDIA kernel part during rpm installation, isn't it?

How can we get a good OS if nobody cares whether their work is compatible with everyone else's? It's teamwork.
Comment 7 Stefan Dirsch 2022-10-08 11:00:26 UTC
*** Bug 1204152 has been marked as a duplicate of this bug. ***
Comment 8 Stefan Dirsch 2022-10-08 11:17:29 UTC
Sorry, guys! I missed the .76 update completely. Also, nobody cared to inform me about the planned kernel 6.0 update in TW. :-( I had noticed the nvidia build failing against 6.0-pre for some time ... if I had known a few days earlier, we would not be in this situation now.

Anyway, I have prepared packages now (also G05 and G04) and will push them towards NVIDIA later today.
Comment 9 Episteme PROMENEUR 2022-10-08 11:19:34 UTC
@stefan

thanks
Comment 10 C J 2022-10-08 11:24:05 UTC
Many Thanks !
Comment 11 Episteme PROMENEUR 2022-10-08 11:37:00 UTC
I wonder if the old DKMS technology is better.
Less work and more reliable. The nvidia driver is automatically updated when the kernel is updated.
Comment 12 Mathias Homann 2022-10-08 11:47:38 UTC
(In reply to Stefan Dirsch from comment #8)

> Anyway, I have prepared packages now (also G05 and G04) and will push them
> towards NVIDIA later today.


I have no idea where in the world you are but I have a couple of beers with your name on them in my fridge...
Comment 13 Stefan Dirsch 2022-10-08 12:54:40 UTC
(In reply to Episteme PROMENEUR from comment #11)
> I wonder if the old dkms technology is better.
> Less work and more reliable. The nvidia driver is automatically updated when
> the kernel is updated.

Is updating the sources as well part of DKMS? AFAIK it just rebuilds the module from the existing sources when booting a new kernel for which the module has not been built yet. Our "KMP" package for Tumbleweed does about the same ...
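
Editorial note: the "rebuild only when no module exists for this kernel" logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not how DKMS or the KMP scripts are actually implemented: the function names, the `nvidia.ko` file name, the `updates` subdirectory, and the `5.19.13-1-default` release string are all hypothetical, and the demo works on a temporary directory instead of the real module tree.

```python
import tempfile
from pathlib import Path


def module_built_for(kernel_release: str, modules_root: Path,
                     module_name: str = "nvidia.ko") -> bool:
    """True if a file named module_name exists anywhere under the kernel's module dir."""
    kernel_dir = modules_root / kernel_release
    if not kernel_dir.is_dir():
        return False
    return any(kernel_dir.rglob(module_name))


def needs_rebuild(kernel_release: str, modules_root: Path) -> bool:
    """A rebuild is needed only when no module was built for this kernel yet."""
    return not module_built_for(kernel_release, modules_root)


def demo() -> dict[str, bool]:
    """Build a throwaway module tree and report which kernels need a rebuild."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Old kernel: a module is already present (directory layout is assumed).
        old = root / "5.19.13-1-default" / "updates"
        old.mkdir(parents=True)
        (old / "nvidia.ko").touch()
        # New kernel: directory exists, but nothing has been built for it.
        (root / "6.0.0-1-default").mkdir()
        return {
            release: needs_rebuild(release, root)
            for release in ("5.19.13-1-default", "6.0.0-1-default")
        }


print(demo())
```

Note that a check like this only decides whether to attempt a build. It cannot help when the sources themselves no longer compile against the new kernel, which is the situation in this bug.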
Comment 14 Stefan Dirsch 2022-10-08 12:56:34 UTC
(In reply to Stefan Dirsch from comment #8)
> Anyway, I have prepared packages now (also G05 and G04) and will push them
> towards NVIDIA later today.

done
Comment 15 Thomas Rother 2022-10-08 16:39:41 UTC
Thanks Stefan for the fast reaction. "Internal communication" at SUSE needs a bit of improvement, I guess ;-) Can we download the rpm directly somewhere, or should we just wait for the repo update?
Comment 16 Stefan Dirsch 2022-10-08 17:03:38 UTC
(In reply to Thomas Rother from comment #15)
> Thanks Stefan for the fast reaction. "internal communication" at SUSE needs
> a bit of improvement, I guess ;-) Can we download the rpm directly somewhere
> or just wait for the repo update?

Unfortunately no, but you can build it yourself if you're familiar with our buildservice.

  https://build.opensuse.org/package/show/X11:Drivers:Video/nvidia-gfxG06
Comment 17 Thomas Rother 2022-10-09 07:26:15 UTC
I tested this morning with a manual compilation of NVIDIA-Linux-x86_64-515.76.run and it worked without any issues on kernel 6.0.0-1-default.
Comment 18 Episteme PROMENEUR 2022-10-09 07:52:44 UTC
I updated my headless installation of the nvidia card with 515.76 from

https://download.opensuse.org/repositories/home:ahjolinna/openSUSE_Tumbleweed/home:ahjolinna.repo

Success
Comment 19 David Holmberg 2022-10-10 09:38:38 UTC
(In reply to Stefan Dirsch from comment #16)
> (In reply to Thomas Rother from comment #15)
> > Thanks Stefan for the fast reaction. "internal communication" at SUSE needs
> > a bit of improvement, I guess ;-) Can we download the rpm directly somewhere
> > or just wait for the repo update?
> 
> Unfortunately no, but you can build it yourself if you're familiar with our
> buildservice.
> 
>   https://build.opensuse.org/package/show/X11:Drivers:Video/nvidia-gfxG06

Any update on when they will be available from the standard repos?
Comment 20 Stefan Dirsch 2022-10-10 09:41:41 UTC
(In reply to David Holmberg from comment #19)
> Any update on when they will be available from the standard repos?

Not before tonight. Hopefully sometime tomorrow. That's all I can say ...
Comment 21 Hans-Peter Jansen 2022-10-10 11:32:54 UTC
But be aware that the G04/G05 changes for 6.0 will disable the ACPI interface for those drivers.

I would be interested in feedback regarding this!
Comment 22 Dmitry Markov 2022-10-10 11:36:56 UTC
(In reply to Hans-Peter Jansen from comment #21)
> But be aware, that the G04/G05 changes for 6.0 will disable the ACPI
> interface for those drivers.. 
> 
> I would be interested in feedback regarding this!

Sorry for the possibly stupid question, but what exactly does that mean? Do I understand correctly that, for example, hibernation will become unavailable, as well as fan control?
Comment 23 Thomas Rother 2022-10-10 12:58:00 UTC
Side question: The naming schemes at NVIDIA are really "non-intuitive". I normally use the "G05" versions "for GeForce 600 series and newer", but that's more a "tradition" than a technical decision (hardware is NVIDIA Corporation TU117 [GeForce GTX 1650]). The G06 versions are for "GeForce 700 series and newer".

What would be the correct version for my hardware?
Comment 24 Mathias Homann 2022-10-10 13:15:13 UTC
(In reply to Thomas Rother from comment #23)

> (hardware is NVIDIA Corporation TU117 [GeForce GTX 1650]).
> The G06 versions are for "GeForce 700 series and newer". 
> 
> What would be the correct version for my hardware?

I'm running G06 with the same graphics card...
Comment 25 Episteme PROMENEUR 2022-10-10 13:27:22 UTC
This is how I understand it.

The latest series, G06, is for all cards:

G06 = G05 + latest cards

G05 = G06 - cards new in G06

G04 = G05 - cards new in G05

If nothing is installed for your NVIDIA card for any reason, you can auto-detect and install the right driver for your hardware by running:

sudo zypper install-new-recommends --repo <name of the nvidia repo>

To find out <name of the nvidia repo>, run:

sudo zypper repos

Another way to determine the appropriate driver is to enter your hardware information into NVIDIA's driver search engine:

https://www.nvidia.com/Download/index.aspx
Comment 26 Stefan Dirsch 2022-10-10 13:31:54 UTC
G06 supports anything later than the Kepler microarchitecture. NVIDIA removed support for Kepler in 515.xx; therefore I introduced the G06 driver series.

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
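
Editorial note: the rule stated in this comment (G06 for everything newer than Kepler, with Kepler itself staying on the older series) can be sketched as a small lookup. This is a hedged illustration only. The architecture list and its ordering come from NVIDIA's public generation names, not from this thread, and the function name `newest_series_for` is hypothetical. Series for pre-Kepler hardware are left out because the thread does not spell them out.

```python
from typing import Optional

# Oldest to newest. Assumption: this ordering is taken from NVIDIA's public
# generation naming and is not stated anywhere in this bug report.
ARCHITECTURES = ["Fermi", "Kepler", "Maxwell", "Pascal", "Volta", "Turing", "Ampere"]


def newest_series_for(architecture: str) -> Optional[str]:
    """Return the newest openSUSE driver series for a GPU architecture.

    Raises ValueError for an architecture that is not in the list above.
    Returns None for pre-Kepler hardware, which this thread does not cover.
    """
    position = ARCHITECTURES.index(architecture)
    kepler = ARCHITECTURES.index("Kepler")
    if position > kepler:
        return "G06"  # "anything later than Kepler"
    if position == kepler:
        return "G05"  # Kepler support was dropped in 515.xx, so not G06
    return None


# TU117 (GeForce GTX 1650, from comment 23) is a Turing chip.
print(newest_series_for("Turing"))
print(newest_series_for("Kepler"))
```

Under these assumptions the GTX 1650 asked about earlier lands on G06, which matches the experience reported in comment 24.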
Comment 27 Fabian Vogt 2022-10-10 14:46:19 UTC
*** Bug 1204187 has been marked as a duplicate of this bug. ***
Comment 28 Shawn Peterson 2022-10-10 16:42:34 UTC
Is there somewhere I can check to see if the repo has been updated?  Thanks for the quick fix, Stefan.
Comment 29 Mathias Homann 2022-10-10 17:19:13 UTC
(In reply to Shawn Peterson from comment #28)
> Is there somewhere I can check to see if the repo has been updated?  Thanks
> for the quick fix, Stefan.

"zypper ref" should do it... ;)
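
Editorial note: after refreshing, one way to tell whether the fix has landed is to compare the G06 driver version the repository offers against 515.76, the version other commenters report working on kernel 6.0. The sketch below is a simplified assumption, not an openSUSE tool: `parse_version` and `has_fix` are hypothetical names, obtaining the offered version string is left to the reader, and the comparison is purely numeric rather than a full RPM version comparison. It does not apply to the G04/G05 series, whose fixed versions are not named in this thread.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted driver version such as '515.65.01' into integers."""
    return tuple(int(part) for part in version.split("."))


def has_fix(available: str, fixed: str = "515.76") -> bool:
    """True if the available G06 driver version is at least the fixed one."""
    return parse_version(available) >= parse_version(fixed)


print(has_fix("515.65.01"))  # the version from the original report
print(has_fix("515.76"))     # the version reported to build on 6.0
```

Tuples compare element by element, so (515, 65, 1) sorts before (515, 76) even though it has more components.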
Comment 30 Episteme PROMENEUR 2022-10-11 11:05:30 UTC
We need an automatic mechanism to handle this problem and to block any kernel update that is not compatible with the nvidia driver.
Comment 31 Mathias Homann 2022-10-11 11:57:16 UTC
(In reply to Stefan Dirsch from comment #20)
> (In reply to David Holmberg from comment #19)
> > Any update on when they will be available from the standard repos?
> 
> Not before tonight. Hopefully sometime tomorrow. That's all I can say ...

The main problem with all the nvidia drivers is that "someone at nvidia" has to manually (I think) publish the finished RPM packages to the repo, and usually forgets one step or another, so the availability of new nvidia drivers more often than not announces itself by the repo being broken for a day or two 0.o

so .. why is it not simply published automatically and reliably on OBS instead...? I mean, obviously because NVIDIA says so, but what is their reasoning (if any)?
Comment 32 Episteme PROMENEUR 2022-10-11 12:03:25 UTC
I hope the open driver will be mature soon.
Comment 33 Hans-Peter Jansen 2022-10-11 14:44:38 UTC
(In reply to Episteme PROMENEUR from comment #30)
> We need an automatic mechanism to handle this problem and to block any kernel
> update that is not compatible with the nvidia driver.

I've tried that already. Result boils down to:
We (the SUSE kernel hackers) will not submit to the dictates of a proprietary package.
Comment 34 Alexander Novichkov 2022-10-11 14:48:37 UTC
The update has arrived. The driver is working. Thanks.
Comment 35 Robert Kaiser 2022-10-11 14:50:43 UTC
(In reply to Episteme PROMENEUR from comment #32)
> I hope the open driver will be mature soon.

Don't get your hopes up too much; if anything, it will only be available for rather new GPUs. Also see https://lwn.net/Articles/910343/ (the article is currently subscriber-only but will be publicly available in a day or two, i.e. a week after it was initially published).
Comment 36 Stefan Dirsch 2022-10-11 14:51:43 UTC
Oh. Thanks for letting us know. Indeed the repositories have been updated. So let's close this ticket.
Comment 37 Björn Voigt 2022-10-11 20:28:05 UTC
(In reply to Hans-Peter Jansen from comment #33)
> I've tried that already. Result boils down to:
> We (the SUSE kernel hackers) will not submit to the dictates of a
> proprietary package.

This means basically, if the SUSE kernel hackers have a choice between waiting some days for a proprietary package and frustrating all Nvidia users, they choose the latter.
Comment 38 Dev Guy 2022-10-12 00:20:13 UTC
Thank you for this fix, I can confirm it's working for me!
Comment 39 Episteme PROMENEUR 2022-10-13 09:09:11 UTC
Hello

I found this interesting info:

NVIDIA 515.76 Driver Released With Bug Fixes, Linux 6.0 Compatibility, on 20 September 2022!

https://www.phoronix.com/news/NVIDIA-515.76-Linux-Driver