Bug 1191012

Summary: Electron apps crash when using NVIDIA GPU
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Jebin S <jebin12raj>
Component: X11 3rd Party Driver
Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED
QA Contact: Stefan Dirsch <sndirsch>
Severity: Major
Priority: P3 - Medium
CC: devguy.ca, jebin12raj, t.rother, viktor.balogh2000
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Jebin S 2021-09-28 06:12:22 UTC
Since the last update, Electron apps crash silently when launched. Apps like VSCode and Atom crash without any output to the console. Typora gave the following error:

 [10946:0928/091855.806300:FATAL:gpu_data_manager_impl_private.cc(415)] GPU process isn't usable. Goodbye.
 Trace/breakpoint trap (core dumped)

All these apps work fine on the Intel GPU. I've shared the system details below. Please let me know if any further details are needed.


Driver Details:

Version: 470.74-44.1
Build Time: Tuesday 21 September 2021 06:32:02 PM
Install Time: Sunday 26 September 2021 11:15:02 PM
License: SUSE-NonFree
Installed Size: 107.3 MiB
Download Size: 0 B
Distribution: Proprietary:X11:Drivers / openSUSE_Tumbleweed
Vendor: obs://build.suse.de/Proprietary:X11:Drivers
Packager: 
Architecture: x86_64
Build Host: goat12
URL: https://www.nvidia.com/object/unix.html
Source Package: x11-video-nvidiaG05-470.74-44.1
Media No.: 0

System Information:

Operating System: openSUSE Tumbleweed 20210924
KDE Plasma Version: 5.22.5
KDE Frameworks Version: 5.86.0
Qt Version: 5.15.2
Kernel Version: 5.14.6-1-default (64-bit)
Graphics Platform: X11
Processors: 4 × Intel® Core™ i7-5500U CPU @ 2.40GHz
Memory: 7.7 GiB of RAM
Graphics Processor: NVIDIA GeForce 840M/PCIe/SSE2
Comment 1 Thomas Rother 2021-09-28 08:11:03 UTC
Confirmed; I have the same issue with Atom and MS Teams for Linux. Atom shows:

thommie@odysseus3:~> atom
/usr/bin/atom: line 190: 23519 Illegal instruction   (core dumped) nohup "$ATOM_PATH" --executed-from="$(pwd)" --pid=$$ "$@" > "$ATOM_HOME/nohup.out" 2>&1
[23519:0928/100242.177930:FATAL:gpu_data_manager_impl_private.cc(439)] GPU process isn't usable. Goodbye.
 
Atom    : 1.58.0
Electron: 9.4.4
Chrome  : 83.0.4103.122
Node    : 12.14.1

odysseus3:/etc/alternatives # inxi -Fxz
System:    Kernel: 5.14.6-1-default x86_64 bits: 64 compiler: gcc v: 11.2.1 Console: tty pts/2 
           Distro: openSUSE Tumbleweed 20210926 
Machine:   Type: Laptop System: Dell product: Precision 7530 v: N/A serial: <filter> 
           Mobo: Dell model: 0XM3HC v: A07 serial: <filter> UEFI: Dell v: 1.12.1 date: 11/11/2019 
Battery:   ID-1: BAT0 charge: 32.1 Wh (100.0%) condition: 32.1/64.0 Wh (50.2%) volts: 8.3 min: 7.6 model: BYD DELL GHXKY8B 
           status: Full 
CPU:       Info: 6-Core model: Intel Core i7-8750H bits: 64 type: MT MCP arch: Kaby Lake note: check rev: A cache: 
           L2: 9 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 52799 
           Speed: 800 MHz min/max: 800/4100 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800 
           9: 800 10: 800 11: 800 12: 800 
Graphics:  Device-1: NVIDIA GP107GLM [Quadro P2000 Mobile] vendor: Dell driver: nvidia v: 470.74 bus-ID: 01:00.0 
           Display: server: X.org 1.20.13 driver: loaded: nvidia unloaded: intel,modesetting tty: 127x25 
           Message: Advanced graphics data unavailable in console for root. 
Audio:     Device-1: Intel Cannon Lake PCH cAVS vendor: Dell driver: snd_hda_intel v: kernel bus-ID: 00:1f.3 
           Device-2: Sennheiser USB-ED CC 01 for MS type: USB driver: hid-generic,snd-usb-audio,usbhid bus-ID: 1-5.3.1.2:10 
           Device-3: Remo Tech OBSBOT Tiny type: USB driver: snd-usb-audio,uvcvideo bus-ID: 1-5.3.3:14 
           Sound Server-1: ALSA v: k5.14.6-1-default running: yes 
           Sound Server-2: JACK v: 1.9.18 running: no 
           Sound Server-3: PulseAudio v: 15.0 running: yes 
           Sound Server-4: PipeWire v: 0.3.37 running: yes 
Network:   Device-1: Intel Ethernet I219-LM vendor: Dell driver: e1000e v: kernel port: efa0 bus-ID: 00:1f.6 
           IF: em1 state: down mac: <filter> 
           Device-2: Intel Wireless-AC 9260 driver: iwlwifi v: kernel port: 3000 bus-ID: 6f:00.0 
           IF: wlp111s0 state: up mac: <filter> 
           Device-3: Realtek RTL8153 Gigabit Ethernet Adapter type: USB driver: r8152 bus-ID: 4-2.4:5 
           IF: enp57s0u2u4 state: down mac: <filter> 
           IF-ID-1: vpn0 state: up speed: 10 Mbps duplex: full mac: N/A 
           IF-ID-2: wg0 state: unknown speed: N/A duplex: N/A mac: N/A 
Bluetooth: Device-1: Intel Wireless-AC 9260 Bluetooth Adapter type: USB driver: btusb v: 0.8 bus-ID: 1-14:6 
           Report: ID: hci0 state: up address: <filter> 
Drives:    Local Storage: total: 1.38 TiB used: 628.73 GiB (44.6%) 
           ID-1: /dev/nvme0n1 vendor: Toshiba model: KXG5AZNV512G NVMe SED 512GB size: 476.94 GiB temp: 39 (312 Kelvin) C 
           ID-2: /dev/sda vendor: Western Digital model: WD10SPZX-00Z10T0 size: 931.51 GiB 
Partition: ID-1: / size: 150 GiB used: 98.61 GiB (65.7%) fs: btrfs dev: /dev/nvme0n1p2 
           ID-2: /boot/efi size: 499.7 MiB used: 5.1 MiB (1.0%) fs: vfat dev: /dev/nvme0n1p1 
           ID-3: /home size: 119.94 GiB used: 60.21 GiB (50.2%) fs: xfs dev: /dev/dm-0 mapped: cr-auto-1 
           ID-4: /opt size: 150 GiB used: 98.61 GiB (65.7%) fs: btrfs dev: /dev/nvme0n1p2 
           ID-5: /var size: 150 GiB used: 98.61 GiB (65.7%) fs: btrfs dev: /dev/nvme0n1p2 
Swap:      ID-1: swap-1 type: partition size: 36 GiB used: 0 KiB (0.0%) dev: /dev/nvme0n1p5 
Sensors:   System Temperatures: cpu: 61.0 C mobo: N/A 
           Fan Speeds (RPM): cpu: 2276 fan-2: 2282 
Info:      Processes: 421 Uptime: N/A Memory: 31.19 GiB used: 6.06 GiB (19.4%) Init: systemd runlevel: 5 Compilers: 
           gcc: 11.2.1 Packages: 3831 Shell: Bash v: 5.1.8 inxi: 3.3.03
Comment 2 Stefan Dirsch 2021-09-28 08:43:34 UTC
Hmm. Wasn't this a glibc update issue, which broke various Electron applications?
Comment 3 Thomas Rother 2021-09-28 08:56:01 UTC
In my case there was a glibc update yesterday:

odysseus3:/etc/alternatives # rpm -qa --last | grep glib
...
glibc-2.34-1.2.x86_64                         Mon Sep 27 22:12:09 2021

current version is

odysseus3:/etc/alternatives # zypper info glibc
Loading repository data...
Reading installed packages...

Information for package glibc:
------------------------------
Repository     : Main Repository (OSS)
Name           : glibc
Version        : 2.34-1.2
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 5.8 MiB
Installed      : Yes
Status         : up-to-date
Source package : glibc-2.34-1.2.src
Summary        : Standard Shared Libraries (from the GNU C Library)
Description    :
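
To check whether a system is in the affected range, a minimal shell sketch follows. It assumes that glibc 2.34 (the version shown above) is the first release that uses clone3(); the function name `glibc_at_least_2_34` is made up for illustration. It accepts either the `getconf GNU_LIBC_VERSION` format ("glibc 2.34") or the rpm-style version ("2.34-1.2").

```shell
#!/bin/sh
# Sketch: succeed if a glibc version string is 2.34 or newer.
# Assumption: 2.34 is the first glibc release that calls clone3().
# "glibc_at_least_2_34" is an illustrative name, not an existing tool.
glibc_at_least_2_34() {
    ver=${1#glibc }              # drop the "glibc " prefix if present
    major=${ver%%[!0-9]*}        # digits before the first non-digit
    rest=${ver#*.}               # everything after the first dot
    minor=${rest%%[!0-9]*}       # leading digits of the minor version
    # Empty fields (unparseable input) are treated as 0, i.e. "older".
    [ "${major:-0}" -gt 2 ] || { [ "${major:-0}" -eq 2 ] && [ "${minor:-0}" -ge 34 ]; }
}
```

On a glibc-based system it could be used as `glibc_at_least_2_34 "$(getconf GNU_LIBC_VERSION)" && echo "glibc uses clone3()"`.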
Comment 4 Stefan Dirsch 2021-09-28 09:01:07 UTC
There is a discussion on opensuse-factory mailing list with subject "new glibc from Tumbleweed snapshot 20210920 affects electron based apps"  (original subject: "New Tumbleweed snapshot 20210920 released!")

-----------------------------------------------------
today, after i installed recent updates, including the updates from snapshot
20210920, vscode did not work any more.

after some searching, i found
https://github.com/microsoft/vscode/issues/133804

so i worked around the problem with the "--no-sandbox" commandline
parameter.

then i noticed also other applications do not work any more.

i noticed, that all non-working applications are chromium/electron based
applications like slack, hamsket, vivaldi, discord, msteams.

all of them don't work any more, unless you use the "--no-sandbox" commandline
parameter.

after some more searching i found:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1944468

the reason for all this seems to be the new glibc, which was coming with
snapshot 20210920

is this already known?

in the above ubuntu issue a new/patched glibc is mentioned.

does such fix also exist for opensuse?
-----------------------------------------------------
The patch is for Electron to behave correctly, not for glibc.

Some people hacked around it by disabling the new clone3() syscall,
but this is no solution. It's like buying a Porsche and then putting a
very slow engine into it.

If you speak about openSUSE packages like docker: yes, we fixed the
applications we were aware of. But we cannot fix e.g. third party
binaries and openQA does not cover all applications
-----------------------------------------------------
this means, all Electron based applications need to be fixed.

and as long as such an application has not been fixed, the "--no-sandbox"
workaround has to be used with this application.
-----------------------------------------------------
Comment 5 Thomas Rother 2021-09-28 09:14:25 UTC
the --no-sandbox switch is not only a workaround for vscode, but also helps for ms teams for linux and atom
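
As a small illustration of this workaround, the following shell sketch wraps an application launch so that the flag is always appended. The function name `run_no_sandbox` is hypothetical; `--no-sandbox` is the flag named in this thread. Note that the flag disables the Chromium sandbox, so it is best limited to the affected applications and dropped once they ship a fixed Electron.

```shell
#!/bin/sh
# Sketch: run a command with the --no-sandbox workaround appended.
# "run_no_sandbox" is an illustrative name, not an existing tool.
run_no_sandbox() {
    "$@" --no-sandbox
}
```

For example, `run_no_sandbox code` would start VS Code as `code --no-sandbox`.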
Comment 6 Stefan Dirsch 2021-09-28 09:15:53 UTC
So is this a duplicate of boo#1190830 ? But then I don't get why this works on systems with intel GPU ...
Comment 7 Stefan Dirsch 2021-09-28 09:17:07 UTC
(In reply to Thomas Rother from comment #5)
> the --no-sandbox switch is not only a workaround for vscode, but also helps
> for ms teams for linux and atom

Thanks for feedback, Thomas! You're also using nvidia 470.74 driver, right?
Comment 8 Stefan Dirsch 2021-09-28 09:18:30 UTC
@Jebin Please verify, that you're using the same glibc version on your system(s) with Intel GPU. Thanks!
Comment 9 Thomas Rother 2021-09-28 10:16:36 UTC
(In reply to Stefan Dirsch from comment #7)

> Thanks for feedback, Thomas! You're also using nvidia 470.74 driver, right?

470.74, yes.
Comment 10 Jebin S 2021-09-28 10:33:37 UTC
(In reply to Stefan Dirsch from comment #8)
> @Jebin Please verify, that you're using the same glibc version on your
> system(s) with Intel GPU. Thanks!

Yes! It is the same version. It's the same system, on which I had switched from the NVIDIA to the Intel GPU using prime-select.


glibc - Standard Shared Libraries (from the GNU C Library)

Version: 2.34-1.2
Build Time: Saturday 18 September 2021 01:11:29 PM
Vendor: openSUSE
Packager: https://bugs.opensuse.org
Architecture: x86_64
Build Host: cloud113
URL: http://www.gnu.org/software/libc/libc.html
Source Package: glibc-2.34-1.2
Comment 11 Stefan Dirsch 2021-09-28 11:54:42 UTC
Hmm. So the issue can be worked around by either

a) switching to intel driver

or

b) using "--no-sandbox" option for electron programs

OTOH the issue is also known to users who definitely are not using NVIDIA drivers. Weird.
Comment 12 Stefan Dirsch 2021-09-28 12:02:11 UTC
@Jebin I suggest you also try with "--no-sandbox" with active nvidia drivers as a workaround for now.
Comment 13 Jebin S 2021-09-28 12:43:57 UTC
(In reply to Stefan Dirsch from comment #12)
> @Jebin I suggest you also try with "--no-sandbox" with active nvidia drivers
> as a workaround for now.

Thank you. That works :)
Comment 14 Stefan Dirsch 2021-09-28 13:39:43 UTC
@Jebin Thanks for the feedback!
Comment 16 Thomas Rother 2021-09-28 14:15:10 UTC
element-desktop crashes too, even with --no-sandbox, but in this case I guess it's something linked to libva?

thommie@odysseus3:~> element-desktop --no-sandbox
/home/thommie/.config/Element exists: yes
/home/thommie/.config/Riot exists: no
Starting auto update with base URL: https://packages.element.io/desktop/update/
Auto update not supported on this platform
Fetching translation json for locale: en_EN
Changing application language to de
Fetching translation json for locale: de
Resetting the UI components after locale change
Resetting the UI components after locale change
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
libva error: /usr/lib64/dri/iHD_drv_video.so init failed
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
libva error: /usr/lib64/dri/iHD_drv_video.so init failed
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
libva error: /usr/lib64/dri/iHD_drv_video.so init failed
[9187:0928/140525.531131:FATAL:gpu_data_manager_impl_private.cc(415)] GPU process isn't usable. Goodbye.
/usr/bin/element-desktop: line 3:  9187 Trace/breakpoint trap   (core dumped) electron /usr/lib/element/app.asar "$@"
Comment 17 Stefan Dirsch 2021-09-28 15:17:15 UTC
No idea. And there is still this

  FATAL:gpu_data_manager_impl_private.cc(415)]
Comment 18 Thomas Rother 2021-10-04 05:17:23 UTC
Are there any updates on this? The "--no-sandbox" workaround still works and I also saw some updates for npm, but the root cause seems not to be found yet ;-(.
Comment 19 Stefan Dirsch 2021-10-04 08:13:55 UTC
Hmm. boo#1190830 has been closed. What does this mean? I'm not sure. Probably that openSUSE's electron package has been fixed. Will this help for your 3rd party apps? I'm afraid it does not, assuming they ship their own Electron component. :-(
Comment 20 Thomas Rother 2021-10-08 07:00:57 UTC
Status after update to

locutus:/etc # lsb_release -r
Release:        20211005

atom, MS teams and VS code still need the --no-sandbox switch as a workaround to get started. element-desktop crashes completely, even with that switch, but this looks like some dependency issue ...

thommie@locutus:~> element-desktop --no-sandbox
Seshat unexpected error: Error: libsqlcipher-3.34.1.so.0: cannot open shared object file: No such file or directory
    at process.func [as dlopen] (electron/js2c/asar_bundle.js:5:1846)
    at Object.Module._extensions..node (internal/modules/cjs/loader.js:1138:18)
    at Object.func [as .node] (electron/js2c/asar_bundle.js:5:2073)
    at Module.load (internal/modules/cjs/loader.js:935:32)
    at Module._load (internal/modules/cjs/loader.js:776:14)
    at Function.f._load (electron/js2c/asar_bundle.js:5:12913)
    at Module.require (internal/modules/cjs/loader.js:959:19)
    at require (internal/modules/cjs/helpers.js:88:18)
    at Object.<anonymous> (/usr/lib/element/app.asar/node_modules/matrix-seshat/lib/index.js:16:22)
    at Module._compile (internal/modules/cjs/loader.js:1078:30)
Comment 21 Stefan Dirsch 2021-10-08 10:08:53 UTC
Thanks for the update. Yes, the element-desktop issue appears to be unrelated. This is neither a glibc nor an nvidia driver issue.

About Atom, MS Teams and VS Code: I just read that teams-1.4.00.26453 is supposed to have fixed the issue. This appears to support my assumption that these ship their own Node.js runtime (and Electron framework?).
Comment 22 Stefan Dirsch 2021-10-17 17:17:53 UTC
*** Bug 1190836 has been marked as a duplicate of this bug. ***
Comment 23 Stefan Dirsch 2021-10-17 21:17:42 UTC
(In reply to Stefan Dirsch from comment #17)
> No idea. And there is still this
> 
>   FATAL:gpu_data_manager_impl_private.cc(415)]

Possibly it helps to start the electron based app with the following flag "-in-process-gpu". It may avoid a crash due to a GPU bug.
Comment 24 Stefan Dirsch 2021-10-17 21:56:31 UTC
(In reply to Stefan Dirsch from comment #23)
> (In reply to Stefan Dirsch from comment #17)
> > No idea. And there is still this
> > 
> >   FATAL:gpu_data_manager_impl_private.cc(415)]
> 
> Possibly it helps to start the electron based app with the following flag
> "-in-process-gpu". It may avoid a crash due to a GPU bug.

@Thomas Rother  @Jebin S  Could you give this a try, please?
Comment 25 Dev Guy 2021-10-18 00:23:22 UTC
It should be 2 leading dashes not one, like, "--in-process-gpu"

For example this is how I am calling Discord which is still using the old buggy GPU electron.

/opt/local/apps/Discord/Discord --in-process-gpu
Comment 26 Thomas Rother 2021-10-18 05:39:30 UTC
just tested

- current element-desktop 1.8.4-1.2 now starts again without any switches needed (dependency error from #20 is gone)
- MS teams 1.4.00.26453-1 also starts without switches
- code 1.59.0-1628120127.el8 still needs --no-sandbox
- same for atom 1.58.0-0.1, it only starts with  --no-sandbox

Tumbleweed version:
locutus:~ # lsb_release -r
Release:        20211012
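
The per-application status reported in this thread can be captured in a small lookup, sketched below in shell. The function name `workaround_flags` is hypothetical, and the mapping only reflects what commenters reported as of snapshot 20211012 (code and atom needing --no-sandbox, Discord being started with --in-process-gpu). It is an illustration, not an authoritative list.

```shell
#!/bin/sh
# Sketch: print the workaround flag reported in this thread for an app.
# "workaround_flags" is an illustrative name; the mapping is taken from
# the comments above and will go stale as the apps update their Electron.
workaround_flags() {
    case "$1" in
        atom|code) echo "--no-sandbox" ;;      # still needed per this comment
        Discord)   echo "--in-process-gpu" ;;  # as used in comment 25
        *)         echo "" ;;                  # no workaround reported
    esac
}
```

It could be used as `code $(workaround_flags code)`, leaving the expansion unquoted so that an empty result adds no argument.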
Comment 27 Thomas Rother 2021-10-18 05:45:28 UTC
Just for my understanding:

This is an error in the electron framework, correct?
The GPU error was already fixed in electron, correct?
Some apps embed this framework, others do not.
Those that embed it may still use an older version (>>> atom, VS code)
Comment 28 Stefan Dirsch 2021-10-18 08:14:42 UTC
(In reply to Thomas Rother from comment #27)
> Just for my understanding:
> 
> This is an error in the electron framework, correct?

I thought so, but it was claimed in the beginning the issue would only occur with nvidia driver, not with Intel. 

> The GPU error was already fixed in electron, correct?

You mean "FATAL:gpu_data_manager_impl_private.cc(415)]"? At least in element-desktop it seems. Others still need
"--in-process-gpu" ?

> Some apps embed this framework, others not.

Yes.

> Those that embed it may still use older version (>>> atom, VS code)

Yes.
Comment 29 Stefan Dirsch 2021-10-22 14:55:14 UTC
(In reply to Stefan Dirsch from comment #24)
> (In reply to Stefan Dirsch from comment #23)
> > (In reply to Stefan Dirsch from comment #17)
> > > No idea. And there is still this
> > > 
> > >   FATAL:gpu_data_manager_impl_private.cc(415)]
> > 
> > Possibly it helps to start the electron based app with the following flag
> > "-in-process-gpu". It may avoid a crash due to a GPU bug.
> 
> @Thomas Rother  @Jebin S  Could you give this a try, please?

@Jebin, could you please provide feedback here? Thanks!
Comment 30 Stefan Dirsch 2021-10-25 10:17:59 UTC
For some reason Jebin cannot access this ticket any longer. I'm adding his comment, which he sent me via private mail.

Date: Mon, 25 Oct 2021 11:40:36 +0530
From: Jebin Tony Raj <jebin12raj@gmail.com>
To: sndirsch@suse.com
Subject: bug 119012 - Electron apps crashes when using NVIDIA GPU

Hello Stefan Dirsch,

https://bugzilla.suse.com/show_bug.cgi?id=1191012

I am not able to access this bug since it was moved to suse bugzilla. I don't
face this issue anymore since the electron apps were updated including teams,
vscode. Thank you for your support!

Thanks and regards,

Jebin S
Comment 31 Stefan Dirsch 2021-10-25 10:20:45 UTC
I think with that we can close this bug. If you don't think so, please speak up, so I can reopen and try to address (or at least track here) the remaining issue(s). Thanks for all the input!
Comment 32 Thomas Rother 2021-10-25 18:25:03 UTC
I agree, most apps obviously upgraded their electron codebase now. For atom, the --no-sandbox switch is still required. But they also have an open issue for this: https://github.com/atom/atom/issues/23036