Bug 1219550

Summary: kernel 6.7.2 and nvidia kmp 6.6.2 => no more CUDA, opencl
Product: [openSUSE] openSUSE Tumbleweed Reporter: Episteme PROMENEUR <epistemepromeneur>
Component: Kernel:Drivers Assignee: Kernel Bugs <kernel-bugs>
Status: RESOLVED WORKSFORME QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: epistemepromeneur, sndirsch, tiwai
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: fahclient.service

Description Episteme PROMENEUR 2024-02-05 07:25:49 UTC
Since yesterday's big update (about 5000 packages, 2024/02/04),
neither CUDA nor OpenCL works any more.

Hypothesis: kernel 6.7.2 and nvidia kmp 6.6.2
Comment 1 Episteme PROMENEUR 2024-02-07 08:33:32 UTC
When I say "no more CUDA or opencl working",
I mean that the FAH app does not find any NVIDIA CUDA or NVIDIA OpenCL.

FAH is launched by a systemd service.

If I disable the service and launch FAH from a command line in a Konsole under my user ID,
then
FAH finds NVIDIA CUDA and NVIDIA OpenCL!

So in a root context FAH does not find any cuda and opencl
but in user id context FAH finds cuda and opencl.

What changed with the big update?

FAH has not changed since 2021.
Comment 2 Stefan Dirsch 2024-02-07 09:38:30 UTC
I don't know FAH. No idea what this systemd service does. We need much more details here.
Comment 3 Episteme PROMENEUR 2024-02-07 10:33:01 UTC
Created attachment 872529 [details]
fahclient.service
Comment 4 Episteme PROMENEUR 2024-02-07 10:33:26 UTC
FAH is folding@home.

https://foldingathome.org

Scientific computing on protein folding.

It shares CPU and GPU time for the computations.

https://foldingathome.org/start-folding/

I don't use rpms from folding@home.

I use rpms from repo "curiosity"

https://download.opensuse.org/repositories/home:Curiosity/openSUSE_Tumbleweed/home:Curiosity.repo

fahclient.service contents : see the attached file
Comment 5 Episteme PROMENEUR 2024-02-07 10:43:46 UTC
I forgot to mention that I reinstalled Nvidia G06 (full installation).
Comment 6 Episteme PROMENEUR 2024-02-07 10:44:10 UTC
and FAH
Comment 7 Stefan Dirsch 2024-02-07 10:54:33 UTC
Ok. Then I suggest you try as root user to run simplified OpenCL and CUDA test apps, e.g. 

  /usr/local/cuda-<version>/extras/demo_suite/deviceQuery

for CUDA and some sample app for OpenCL. I'm not so familiar with OpenCL. You'll find something. Then you can strace that with

   strace -f -e trace=file <app> 2>&1 

to figure out if there is some config file/lib not found or not accessible. root may have a different environment and may not find the right .icd file or the like.
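
The strace output can be long, so a small filter helps. This is only a sketch: the helper name failed_gpu_opens and the path patterns are assumptions about what a CUDA/OpenCL application typically opens, they are not taken from this report.

```shell
#!/bin/sh
# Hypothetical helper: read strace output on stdin and keep only the
# failed file accesses that look related to the GPU stack.
# The path patterns below are assumptions, adjust them as needed.
failed_gpu_opens() {
    grep -E 'ENOENT|EACCES|EPERM' |
        grep -E '\.icd|libcuda|libOpenCL|libnvidia|/dev/nvidia'
}

# Example usage as root (<app> stays a placeholder for the test app):
#   strace -f -e trace=file <app> 2>&1 | failed_gpu_opens
```

Running the same pipeline once as root and once as the normal user, then comparing the two outputs, should show which file or device node is missing in only one of the two contexts.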

Sorry, I can give you only some hints. I can't and won't support this application.
Comment 8 Episteme PROMENEUR 2024-02-08 18:59:52 UTC
I found the solution. Thanks to internet.

It's a workaround.

"clinfo" must be run before fahclient at boot.

I put the "clinfo" command in a custom service. Luckily (I don't know why), this service is started before fahclient.

It works.

Now fahclient finds cuda and opencl.

Does fahclient detect cuda and opencl with an obsolete command?

Has Tumbleweed stopped supplying some data about cuda and opencl that fahclient needs to find cuda and opencl?
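
An alternative to a separate custom service would be a systemd drop-in that runs clinfo right before fahclient itself, so the ordering no longer depends on luck. This is an untested sketch under assumptions: the function name print_clinfo_dropin and the drop-in file name are made up for illustration, clinfo is assumed to be at /usr/bin/clinfo, and the unit is assumed to be named fahclient.service as in the attachment.

```shell
#!/bin/sh
# Hypothetical helper: print a drop-in that makes systemd run clinfo
# before the main fahclient command.
print_clinfo_dropin() {
    cat <<'EOF'
[Service]
# The leading "-" tells systemd to ignore a non-zero exit from clinfo.
# Assumed path; check it with "command -v clinfo".
ExecStartPre=-/usr/bin/clinfo
EOF
}

# Example usage as root (file name 10-clinfo.conf is an arbitrary choice):
#   mkdir -p /etc/systemd/system/fahclient.service.d
#   print_clinfo_dropin > /etc/systemd/system/fahclient.service.d/10-clinfo.conf
#   systemctl daemon-reload
```

Note that ExecStartPre runs with the same user as the service, which may differ from the custom service described above, so this may or may not have the same effect.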
Comment 9 Stefan Dirsch 2024-02-08 19:17:10 UTC
No idea. Might be a timing issue. Driver not loaded yet for example. Whatever. Closing.
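
To check the "driver not loaded yet" idea, one could look at which NVIDIA kernel modules are present right before fahclient starts. A minimal sketch, assuming the modules show up with names beginning with "nvidia" in /proc/modules; the function name list_nvidia_modules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list NVIDIA kernel modules found in a
# /proc/modules-style file. Defaults to /proc/modules; the optional
# argument allows trying it on a saved copy.
list_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }' "${1:-/proc/modules}"
}

# Example usage: compare the list before and after running clinfo.
#   list_nvidia_modules
#   clinfo > /dev/null
#   list_nvidia_modules
```

If a module appears only after clinfo has run, that would support the timing explanation.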