Bugzilla – Bug 1212394
[wish] opencl at the same time with iGPU and dGPU
Last modified: 2024-05-06 23:22:07 UTC
Hello,
I use FAH (Folding@home), a scientific computing application. It can run three jobs at the same time: one on the CPU, one on the iGPU (Intel), and one on the dGPU (NVIDIA).
When I install Intel OpenCL, it is not available for computing; only the NVIDIA OpenCL is. It would be good if Intel OpenCL were available at the same time. Last year, Intel OpenCL was available at the same time as NVIDIA OpenCL.
Installed packages:
libOpenCL1
intel-opencl
fetch-fahclient-devel
I checked that "multi-monitor" is enabled in the BIOS.
What clinfo says:
:~> clinfo
Number of platforms 1
Platform Name NVIDIA CUDA
Platform Vendor NVIDIA Corporation
Platform Version OpenCL 3.0 CUDA 12.0.151
Platform Profile FULL_PROFILE
....
Thanks
Stefan, do you know if the nvidia driver does something funny with the OpenCL/vendors/ folder or is this more likely an issue with intel-opencl?
(In reply to Patrik Jakobsson from comment #1)
> Stefan, do you know if the nvidia driver does something funny with the
> OpenCL/vendors/ folder or is this more likely an issue with intel-opencl?

For nvidia driver packages (nvidia-gl) I'm using
/etc/OpenCL/vendors
/etc/OpenCL/vendors/nvidia.icd
Episteme, can you provide the output from:
$ ls -la /etc/OpenCL/vendors/
When both packages are installed you should have both the intel and nvidia icd files.
I explored the problem. It seems to be an issue with the alternatives mechanism.
/usr/lib64/libOpenCL.so -> /usr/lib64/libOpenCL.so.1 -> /etc/alternatives/libOpenCL.so.1 -> /usr/lib64/nvidia/libOpenCL.so.1
This installation assumes an exclusive choice, not an inclusive one, doesn't it?
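The link chain quoted above can be printed hop by hop rather than followed by hand. The following is only a minimal sketch: `print_link_chain` is a hypothetical helper name, and it assumes nothing beyond `readlink` and `dirname` being available.

```shell
# print_link_chain PATH: print PATH and every symlink hop after it, one per line.
# The function name is a hypothetical helper, not part of any package.
print_link_chain() {
    chain_path=$1
    chain_hops=0
    echo "$chain_path"
    # Stop after 20 hops so a circular chain cannot loop forever.
    while [ -L "$chain_path" ] && [ "$chain_hops" -lt 20 ]; do
        chain_target=$(readlink "$chain_path")
        # Resolve a relative target against the directory of the link itself.
        case $chain_target in
            /*) chain_path=$chain_target ;;
            *)  chain_path=$(dirname "$chain_path")/$chain_target ;;
        esac
        echo "$chain_path"
        chain_hops=$((chain_hops + 1))
    done
}
```

Called as `print_link_chain /usr/lib64/libOpenCL.so`, it should list each link on the way to the library that is finally loaded, which makes it easy to see whether the alternative currently selects the NVIDIA loader.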
:~> ls -la /etc/OpenCL/vendors/
total 4
drwxr-xr-x 1 root root 20 9 mai 21:20 .
drwxr-xr-x 1 root root 14 9 mai 21:20 ..
-rw-r--r-- 1 root root 22 9 mai 21:20 nvidia.icd
intel-opencl installation:
/usr/share/OpenCL
/usr/share/OpenCL/vendors
/usr/share/OpenCL/vendors/intel.icd
(In reply to Episteme PROMENEUR from comment #4)
> I explore the problem.
>
> It seems it's an alternative feature issue.
>
> /usr/lib64/libOpenCL.so -> /usr/lib64/OpenCL.so.1 ->
> /etc/alternatives/libOpenCL.so.1 -> /usr/lib64/nvidia/libOpenCL.so.1
>
> This installation assumes an exclusive choice, not an inclusive choice. No ?

Ah, right. nVidia has its own OpenCL lib.
OK, so if you're using libopencl from the nvidia package, it will only look for icd files in /etc/OpenCL/vendors and not in /usr/share/OpenCL/vendors, where the intel file is. Can you try to create a file link to verify this:
$ ln -s /usr/share/OpenCL/vendors/intel.icd /etc/OpenCL/vendors/
And then check if both platforms are shown when running clinfo.
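To check which .icd files each vendor directory actually holds before and after creating the link, a small listing helper can be used. This is a sketch only; `list_icd_files` is a hypothetical name, and the two directories in the usage comment are the ones mentioned in this report.

```shell
# list_icd_files DIR...: print every *.icd file found in the given directories.
# The function name is a hypothetical helper, not part of any package.
list_icd_files() {
    for icd_dir in "$@"; do
        # Silently skip directories that do not exist.
        [ -d "$icd_dir" ] || continue
        for icd_file in "$icd_dir"/*.icd; do
            # Skip the unexpanded pattern when a directory holds no .icd files.
            [ -e "$icd_file" ] || continue
            echo "$icd_file"
        done
    done
}
# Usage: list_icd_files /etc/OpenCL/vendors /usr/share/OpenCL/vendors
```

If intel.icd only shows up under /usr/share/OpenCL/vendors, a loader that reads /etc/OpenCL/vendors alone would not see the Intel platform.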
:~> clinfo
Abort was called at 36 line in file:
/home/abuild/rpmbuild/BUILD/compute-runtime-23.13.26032.30/shared/source/built_ins/built_ins.cpp
Abandon (core dumped)
I guess Mesa's CL drivers don't work together with nVidia's libopencl.
Good, then you're actually loading the intel.icd file. Unfortunately you're hitting a different bug (bsc#1212193). I've submitted a fix but it has not yet been accepted.
My installation is something of a corner case. The monitor is connected to the Intel iGPU, not the NVIDIA dGPU. When I want better image quality, I use environment variables like this:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia digikam
or
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia vlc
>> I guess Mesa's CL drivers don't work together with nVidia's libopencl.
Again, last year there was no problem.
I get the same problem with eglinfo: it crashes. See my report:
https://bugzilla.opensuse.org/show_bug.cgi?id=1211739
(In reply to Episteme PROMENEUR from comment #13)
> >> I guess Mesa's CL drivers don't work together with nVidia's libopencl.
>
> Again, last year there were no problem.
>
> I get the same problem with eglinfo. i get a crash. see my report
>
> https://bugzilla.opensuse.org/show_bug.cgi?id=1211739

I think my assumptions were wrong. See Patrik's comment#11. But if you want to make sure to use the system's libOpenCL and not the one of nVidia, you can change this with update-alternatives. It just gets the default when installing the nvidia driver for obvious reasons.
If you really need to use both libOpenCLs you'll need to experiment with LD_LIBRARY_PATH/LD_PRELOAD env. variables ... But hopefully things are just working with nVidia's libOpenCL.
(In reply to Stefan Dirsch from comment #14)
> I think my assumptions were wrong. See Patrik's comment#11. But if you want
> to make sure to use the system's libOpenCL and not the one of nVidia, you
> can change this with update-alternatives. It just gets the default when
> installing the nvidia driver for obvious reasons.

But will the nvidia.icd file be found by the system's libopencl? Does the system libopencl check both /usr/share/OpenCL/vendors and /etc/OpenCL/vendors?
>> But if you want to make sure to use the system's libOpenCL and not the one of nVidia
No, please read the description again.
I use FAH, a scientific computing application. It can run three jobs at the same time: one on the CPU, one on the iGPU (Intel), and one on the dGPU (NVIDIA).
Today FAH runs two jobs:
- one on the CPU
- one on the NVIDIA dGPU with CUDA.
I want one more job on the Intel iGPU with OpenCL.
Ok. So just switch to system's OpenCL with update-alternatives. And you need to wait for Patrik's fix. See his comment#11.
To sum up the problem:
a) Today, on a desktop, we can use an iGPU and a dGPU together, just as on a laptop. There are two ways to do this (the monitor is connected to the iGPU in both cases):
- PRIME technology, see https://forums.opensuse.org/t/howto-tumbleweed-desktop-using-nvidia-prime/165724
- NVIDIA PRIME render offload, see https://forums.opensuse.org/t/howto-tumbleweed-desktop-nvidia-prime-render-offload/165723
b) We must be able to run computing jobs on the iGPU and the dGPU at the same time.
Both a) and b) must be satisfied at the same time.
The libOpenCL from the ocl-icd package tries to open /etc/OpenCL/vendors. If this fails, it uses /usr/share/OpenCL/vendors. You can override this path by setting the OCL_ICD_VENDORS environment variable. So as a regular user you could create your own vendor directory, set OCL_ICD_VENDORS to it, and link from that directory to the .icd files you want to use. The OpenCL software can then choose which driver to use. Whether all these drivers can then be loaded and work with any libOpenCL (the system one and NVIDIA's) is a different question.
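The private vendor directory described above could be set up roughly as follows. This is a sketch under assumptions: `make_vendor_dir` is a hypothetical helper, `$HOME/.opencl-vendors` is an arbitrary example location, and only the ocl-icd loader is expected to honour OCL_ICD_VENDORS; whether NVIDIA's own libOpenCL does is not established in this report.

```shell
# make_vendor_dir DEST ICD...: create DEST and symlink each existing ICD file into it.
# Pass absolute ICD paths so the links stay valid from any working directory.
# The function name is a hypothetical helper, not part of any package.
make_vendor_dir() {
    vendor_dest=$1
    shift
    mkdir -p "$vendor_dest" || return 1
    for vendor_icd in "$@"; do
        # Ignore ICD files that are not installed on this system.
        [ -e "$vendor_icd" ] || continue
        ln -sf "$vendor_icd" "$vendor_dest/$(basename "$vendor_icd")"
    done
}
# Usage (ICD paths taken from this report; the destination is an assumption):
#   make_vendor_dir "$HOME/.opencl-vendors" \
#       /etc/OpenCL/vendors/nvidia.icd /usr/share/OpenCL/vendors/intel.icd
#   OCL_ICD_VENDORS="$HOME/.opencl-vendors" clinfo
```

Setting the variable only for the single `clinfo` invocation keeps the experiment local and leaves the system-wide vendor directories untouched.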
(In reply to Episteme PROMENEUR from comment #17)
> I use FAH. This app does science computing.

I'm providing the FAH packages.
The library name libOpenCL.so is hard coded in FAH (closed source, sorry). This file is not available on openSUSE. My package fetch-fahclient-devel provides a link from libOpenCL.so to libOpenCL.so.1 to make it work on openSUSE.
I'd try:
# ln -sf /usr/lib64/ocl-icd/libOpenCL.so.1 /usr/lib64/libOpenCL.so
This way we can force FAH to use the libOpenCL required for Intel GPU.
I am open to suggestions for improvement!
Bernhard
(In reply to Bernhard Held from comment #21)
> (In reply to Episteme PROMENEUR from comment #17)
> > I use FAH. This app does science computing.
>
> I'm providing the FAH packages.
>
> The library name libOpenCL.so is hard coded in FAH (closed source, sorry).
> This file is not available on openSUSE. My package fetch-fahclient-devel
> provides a link from libOpenCL.so to libOpenCL.so.1 to make it work on
> openSUSE.
>
> I'd try:
> # ln -sf /usr/lib64/ocl-icd/libOpenCL.so.1 /usr/lib64/libOpenCL.so
>
> This way we can force FAH to use the libOpenCL required for Intel GPU.
>
> I am open to suggestions for improvement!
> Bernhard

It fails.
10:26:40:WARNING:FS01:No CUDA or OpenCL 1.2+ support detected for GPU slot 01: gpu:0:2 KBL GT2 [HD Graphics 630]. Disabling.
/usr/lib64/ocl-icd/libOpenCL.so.1 and /usr/lib64/libOpenCL.so already existed.
(In reply to Episteme PROMENEUR from comment #23)
> /usr/lib64/ocl-icd/libOpenCL.so.1
> and
> /usr/lib64/libOpenCL.so
>
> already existed.

Well, this is expected. As I wrote, libOpenCL.so is provided by the package fetch-fahclient-devel. Reinstall it to restore the previous link.
FAHClient doesn't play nicely with GPUs other than NVIDIA. You might have to search the internet to learn how to make FAH work on an Intel GPU.
Please use `update-alternatives --config libOpenCL.so.1`, switch to Intel, and report the output of `clinfo`.
Did FAH ever run on your Intel iGPU UHD 630?
(In reply to Bernhard Held from comment #24)
> Please use `update-alternatives --config libOpenCL.so.1` and switch to Intel
> and report the output of `clinfo`.

:~> sudo systemctl restart fahclient
[sudo] Mot de passe de root :
roubach@grincheux:~> sudo update-alternatives --config libOpenCL.so.1
There are 2 choices for the alternative libOpenCL.so.1 (providing /usr/lib64/libOpenCL.so.1).

  Selection    Path                                Priority   Status
------------------------------------------------------------
* 0            /usr/lib64/nvidia/libOpenCL.so.1     100       auto mode
  1            /usr/lib64/nvidia/libOpenCL.so.1     100       manual mode
  2            /usr/lib64/ocl-icd/libOpenCL.so.1    50        manual mode

Press <enter> to keep the current choice[*], or type selection number: 2
update-alternatives: using /usr/lib64/ocl-icd/libOpenCL.so.1 to provide /usr/lib64/libOpenCL.so.1 (libOpenCL.so.1) in manual mode
:~> clinfo
Number of platforms 1
Platform Name NVIDIA CUDA
Platform Vendor NVIDIA Corporation
Platform Version OpenCL 3.0 CUDA 12.0.151
Platform Profile FULL_PROFILE
:~> sudo systemctl restart fahclient

Result:
11:11:10:WARNING:FS01:No CUDA or OpenCL 1.2+ support detected for GPU slot 01: gpu:0:2 KBL GT2 [HD Graphics 630]. Disabling.

> Did FAH ever run on you Intel iGPU UHD 630?

I am sure that clinfo showed the Intel iGPU as enabled last year. I don't remember whether FAH computed anything on the Intel iGPU.
The Intel iGPU was enabled last year without doing anything in Tumbleweed, just some settings in FAH for the slot corresponding to the Intel iGPU, and in the BIOS. That's why I am surprised this fails.
Last year, when I explored GPU computing, I wrote two tutorials about using the NVIDIA dGPU and the Intel iGPU.
For the NVIDIA dGPU: https://foldingforum.org/viewtopic.php?t=37551
For the Intel iGPU: https://foldingforum.org/viewtopic.php?t=37544
> :~> clinfo
> Number of platforms 1
> Platform Name NVIDIA CUDA

So, your PC is still taken over by the NVIDIA driver.
How did you install the NVIDIA driver? Using rpms or NVIDIA-Linux-x86_64-xxx.yy.run? Get rid of the latter one. It disables other drivers by removing/renaming a couple of files.
I'd suggest removing anything NVIDIA related until you've got a clean iGPU setup. clinfo has to report a working Intel GPU.
Finally, if everything is working including FAH, you might start to reinstall the NVIDIA drivers using rpms only.
(In reply to Bernhard Held from comment #26)
> How did you install the NVIDIA driver? Using rpms or
> NVIDIA-Linux-x86_64-xxx.yy.run? Get rid of the latter one. It disables other
> drivers by removing/renaming a couple of files.

I installed the NVIDIA driver with rpms. I did not install the Prime packages.

> I'd suggest to remove anything NVIDIA related until you've got a clean iGPU
> setup. clinfo has to report a working Intel GPU.

OK, I will remove the NVIDIA driver.

> Finally, if everything is working including FAH, you might start to
> reinstall NVIDIA drivers using rpms only.

OK
I removed all NVIDIA packages except the kernel firmware NVIDIA package, because removing it would also remove the kernel all-firmware package. Then I restarted the PC.
Result:
:~> clinfo
Number of platforms 1
Platform Name Intel(R) OpenCL HD Graphics
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 3.0
Platform Profile FULL_PROFILE

fahclient does not start.
:~> sudo systemctl status fahclient
× fahclient.service - Folding@Home V7 Client
Loaded: loaded (/usr/lib/systemd/system/fahclient.service; enabled; preset: disabled)
Active: failed (Result: core-dump) since Sat 2023-06-17 15:02:14 CEST; 20s ago
Duration: 248ms
Docs: https://foldingathome.org/support/faq/installation-guides/linux/
Process: 4837 ExecStart=/usr/bin/FAHClient /etc/fahclient/config.xml --pid-file=/run/fahclient/fahclient.pid $FAHCLIENT_OPTIONS (code=dumped, signal=SEGV)
Process: 4838 ExecStartPost=/bin/sh -c echo $MAINPID >/run/fahclient/fahclient.pid (code=exited, status=0/SUCCESS)
Main PID: 4837 (code=dumped, signal=SEGV)
CPU: 188ms
juin 17 15:02:13 grincheux systemd[1]: Starting Folding@Home V7 Client...
juin 17 15:02:13 grincheux FAHClient[4837]: 13:02:13:Read GPUs.txt
juin 17 15:02:13 grincheux systemd[1]: Started Folding@Home V7 Client.
juin 17 15:02:14 grincheux systemd[1]: fahclient.service: Main process exited, code=dumped, status=11/SEGV
juin 17 15:02:14 grincheux systemd[1]: fahclient.service: Failed with result 'core-dump'.
I removed everything related to FAH and then installed FAH again. The problem is still there: FAHClient does not start. I assume there is a problem with config.xml.
I replaced the config.xml (added by the reinstallation) with config.xml.rpmsave. Result: fahclient still fails to start, with the same error.
My config.xml (I masked the key and the user):
<config>
  <!-- Folding Slot Configuration -->
  <cause v='HIGH_PRIORITY'/>
  <!-- Network -->
  <proxy v=':8080'/>
  <!-- Slot Control -->
  <power v='FULL'/>
  <!-- User Information -->
  <passkey v='<key>'/>
  <team v='51'/>
  <user v='<user>'/>
  <!-- Folding Slots -->
  <slot id='0' type='CPU'>
    <paused v='true'/>
  </slot>
  <slot id='2' type='GPU'>
    <paused v='true'/>
    <pci-bus v='1'/>
    <pci-slot v='0'/>
  </slot>
  <slot id='1' type='GPU'>
    <gpu-beta v='True'/>
    <pci-bus v='0'/>
    <pci-slot v='2'/>
  </slot>
</config>
(In reply to Bernhard Held from comment #26)
> > :~> clinfo
> > Number of platforms 1
> > Platform Name NVIDIA CUDA
>
> So, your PC is still invested by the NVIDIA driver.
>
> How did you install the NVIDIA driver? Using rpms or
> NVIDIA-Linux-x86_64-xxx.yy.run? Get rid of the latter one. It disables other
> drivers by removing/renaming a couple of files.
>
> I'd suggest to remove anything NVIDIA related until you've got a clean iGPU
> setup. clinfo has to report a working Intel GPU.
>
> Finally, if everything is working including FAH, you might start to
> reinstall NVIDIA drivers using rpms only.

I did what you suggested. Everything failed.
clinfo does find Intel OpenCL.
FAH does not find the Intel OpenCL.
> juin 17 15:02:14 grincheux systemd[1]: fahclient.service: Main process exited, code=dumped, status=11/SEGV
> juin 17 15:02:14 grincheux systemd[1]: fahclient.service: Failed with result 'core-dump'.

> FAH does not find the Intel OpenCL.

Does FAH crash or does it not find Intel OpenCL?
(In reply to Bernhard Held from comment #33)
> > juin 17 15:02:14 grincheux systemd[1]: fahclient.service: Main process exited, code=dumped, status=11/SEGV
> > juin 17 15:02:14 grincheux systemd[1]: fahclient.service: Failed with result 'core-dump'.

That problem is solved: I replaced config.xml with the saved config.xml containing my settings.

> > FAH does not find the Intel OpenCL.
>
> Does FAH crash or does it not find Intel OpenCL?

FAH does not crash. It just logs:
"07:59:51:WARNING:FS01:No CUDA or OpenCL 1.2+ support detected for GPU slot 01: gpu:0:2 KBL GT2 [HD Graphics 630]. Disabling."
> FAH does not crash. just "07:59:51:WARNING:FS01:No CUDA or OpenCL 1.2+
> support detected for GPU slot 01: gpu:0:2 KBL GT2 [HD Graphics 630].
> Disabling."

OK.
Last point: Please report the link chain starting from /usr/lib64/libOpenCL.so to check if it's restored again.
As I wrote, I see the problem located in the ageing FAHClient. We (openSUSE and myself) can't help with this binary blob. You could ask for support at https://foldingforum.org/
Feel free to come back if there's a working setup with your iGPU.
/usr/lib64/libOpenCL.so -> /usr/lib64/libOpenCL.so.1 -> /etc/alternatives/libOpenCL.so.1 -> /usr/lib64/nvidia/libOpenCL.so.1
I don't understand why this worked well last year (November 2022) when I explored this feature. I wrote this tutorial:
https://foldingforum.org/viewtopic.php?t=37544
In chapter 5 of that tutorial I wrote the following, which shows that FAH had no problem finding the Intel OpenCL at the time:
*****************************************************************************
In case you have an error message in the log:
16:04:47:WARNING:FS01:Guessing ambiguous GPU to OpenCL device mapping for 01: gpu:0:2 KBL GT2 [HD Graphics 630]. Consider upgrading your graphics driver or manually setting ``opencl-index`` in this slot's configuration.
You have two OpenCL implementations: one for the iGPU and one for another card, in my case an NVIDIA card. For example:
06:44:08: GPUs: 2
06:44:08: GPU 0: Bus:0 Slot:2 Func:0 INTEL:1 KBL GT2 [HD Graphics 630]
06:44:08: GPU 1: Bus:1 Slot:0 Func:0 NVIDIA:3 GK208B [GeForce GT 730] 692.7
06:44:08: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:3.5 Driver:11.4
06:44:08:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:2.0 Driver:1.3
06:44:08:OpenCL Device 1: Platform:1 Device:0 Bus:1 Slot:0 Compute:3.0 Driver:470.86
I understand that FAH does not know which OpenCL device to use for the Intel iGPU HD Graphics 630. It must use OpenCL Device 0, which is the OpenCL for the Intel iGPU, and not OpenCL Device 1, which is for the NVIDIA card.
.....
****************************************************************************
Something changed in Tumbleweed.
(In reply to Episteme PROMENEUR from comment #36)
> /etc/alternatives/libOpenCL.so.1 -> /usr/lib64/nvidia/libOpenCL.so.1

libOpenCL still points to the Nvidia lib.
Use `update-alternatives` like shown above to fix this.
Please double check the result.
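One way to double check the result is to resolve the library path completely and look at where it ends up. This is a rough sketch: `opencl_loader_kind` is a hypothetical helper, it relies on `readlink -f` (GNU coreutils), and it assumes the directory layout reported in this bug (`/usr/lib64/nvidia/` and `/usr/lib64/ocl-icd/`).

```shell
# opencl_loader_kind PATH: report which loader a libOpenCL path finally resolves to.
# Classification is based only on the directory names seen in this report.
opencl_loader_kind() {
    loader_resolved=$(readlink -f "$1") || return 1
    case $loader_resolved in
        */nvidia/*)  echo nvidia ;;
        */ocl-icd/*) echo ocl-icd ;;
        *)           echo unknown ;;
    esac
}
# Usage: opencl_loader_kind /usr/lib64/libOpenCL.so
```

After switching the alternative, the expected answer for /usr/lib64/libOpenCL.so would be `ocl-icd`; `nvidia` means the NVIDIA loader is still selected.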
(In reply to Bernhard Held from comment #38)
> (In reply to Episteme PROMENEUR from comment #36)
> > /etc/alternatives/libOpenCL.so.1 -> /usr/lib64/nvidia/libOpenCL.so.1
>
> libOpenCL still points to the Nvidia lib.
>
> Use `update-alternatives` like shown above to fix this.
>
> Please double check the result.

I already did this. It fails. See comment #25.
If it points to the NVIDIA OpenCL, that is because I went back to the original setup after all the tests.
Please, everyone, respond to my comment #37. That comment is important for defining the problem.
(In reply to Episteme PROMENEUR from comment #39)
> (In reply to Bernhard Held from comment #38)
> > (In reply to Episteme PROMENEUR from comment #36)
> > > /etc/alternatives/libOpenCL.so.1 -> /usr/lib64/nvidia/libOpenCL.so.1
> >
> > libOpenCL still points to the Nvidia lib.
> >
> > Use `update-alternatives` like shown above to fix this.
> >
> > Please double check the result.
>
> I already did this. It fails. see comment #25.

Are you kidding? You let me guess why your Intel GPU isn't working, while in fact you're posting a link to the Nvidia lib. Moreover, you still haven't answered my question by providing a correct link.
You don't answer questions; instead you reply with your own questions. You let me think FAH is crashing, and only after I ask do you let me know that there's an error message.
You really have to provide a consistent and complete picture of your setup if you expect help.
Tumbleweed might be a bad choice if you need a stable API to run unmaintained 3rd party software. Use older Leap versions or CentOS 6.3 (the binaries are built for it!).
Ask for support at https://foldingforum.org/ or complain at folding@home.
I bail out now.
I think everything needed to use the Intel GPU for OpenCL is available, even with the nvidia drivers installed (intel-opencl, libOpenCL1, update-alternatives). Whether the Intel OpenCL driver plays nicely with every available piece of software is a different story. Closing as fixed.
No, it's not. When you launch "data center" and go to "opencl", you see that there is only one platform: Nvidia. See the screenshot.
Created attachment 873921 [details] only nvidia
For non-nvidia OpenCL you need to switch libOpenCL.so.1 by using update-alternatives. nvidia ships with its own. I've explained this before. Try this and set it to the non-nVidia libOpenCL.so.1:
sudo update-alternatives --config libOpenCL.so.1
Hmm. No response.