Bug 1170696 - Dell Latitude 7490 freezes with 15.2
Status: NEW
Classification: openSUSE
Product: openSUSE Distribution
Component: Kernel
Version: Leap 15.2
Hardware: Other Other
Priority: P2 - High
Severity: Major
Target Milestone: ---
Assigned To: openSUSE Kernel Bugs
Reported: 2020-04-28 13:31 UTC by Adrian Schröter
Modified: 2021-05-14 06:23 UTC (History)

Attachments
hwinfo from Dell Latitude 7490 (358.10 KB, application/gzip)
2020-04-28 14:32 UTC, Adrian Schröter
picture of ioremap error report. The drawing errors were on screen. (1.87 MB, image/jpeg)
2020-09-27 13:12 UTC, Adrian Schröter

Description Adrian Schröter 2020-04-28 13:31:58 UTC
My standard SUSE company notebook (Dell Latitude 7490) has been freezing since 15.2.

Since I have not been able to get any traces or crash dumps yet, I re-installed 15.1 to verify that the hardware is not broken. The problem came back after updating to 15.2 again (I always kept only my /home partition).

The freeze also happens a few minutes after booting into runlevel 3.

There is no blinking kernel panic LED and, of course, no dump in systemd.

I would try to give more details, but I am a bit clueless at the moment about how to get more data.
Comment 1 Takashi Iwai 2020-04-28 14:01:48 UTC
No ssh access available after it freezes, too?

Could you give hwinfo output?  Our team hasn't received this company standard laptop, so we have no idea what hardware it is at all.
Comment 2 Adrian Schröter 2020-04-28 14:32:44 UTC
Created attachment 836975 [details]
hwinfo from Dell Latitude 7490
Comment 3 Adrian Schröter 2020-04-28 14:41:48 UTC
No ssh, no ping, and even the cursor stops blinking on the Linux console.

I see this crash anywhere from seconds to 2 minutes after booting, and sometimes it takes an hour. I have no idea how to trigger it.

For some time I suspected the wireless card, so I removed it and used a USB wireless adapter, but it keeps crashing.
Comment 4 Takashi Iwai 2020-04-28 15:11:21 UTC
Thanks.

Could you install the openSUSE Leap 15.1 kernel and the OBS Kernel:stable kernel on top of the existing Leap 15.2 system and see whether they trigger a similar problem?
Comment 5 Adrian Schröter 2020-04-28 15:27:32 UTC
Leap 15.1 (Kernel 4.12.14-lp151.28.48) seems to be stable at first glance. But I will keep waiting a bit longer ...

By the way, I am currently testing by leaving the notebook untouched, because the freeze seems relatively reproducible that way. So any kind of suspend handling might have an influence.

But I remember freezes while actively working on it, too.
Comment 6 Adrian Schröter 2020-04-28 16:05:14 UTC
I am unable to boot Kernel:stable at the moment. I created a cert.der based on the sslcert from Kernel:stable in /boot/efi/, but grub still refuses to load the kernel.
Comment 7 Takashi Iwai 2020-05-03 07:46:24 UTC
Hrm, you can try the official TW kernel instead of Kernel:stable, then.

Looking at hwinfo, it's a Kabylake.  Stefan, is it the machine you took?
Comment 8 Stefan Dirsch 2020-05-03 08:08:31 UTC
No. I took the Dell Precision 5520 with Intel/nvidia combo.
Comment 9 Lubos Kocman 2020-05-06 10:54:49 UTC
The Precision 5530 has no issues. I experienced some freezes with a previous Leap kernel, but they were fixed by an update.

lkocman@linux:~/Workspace/sles/release-notes-sles> uname -r
5.3.18-lp152.11-default
Comment 10 Lubos Kocman 2020-06-03 08:14:04 UTC
Any update on the issue? Can this still be reproduced?
Comment 11 Adrian Schröter 2020-06-04 11:45:02 UTC
Yes, still freezing with that kernel. The Leap 15.1 kernel is still stable.

So no change for me.
Comment 12 Lubos Kocman 2020-06-25 09:28:34 UTC
Reducing priority to P2, as one of the criteria is having no P1s and this is the only P1 we have. I can see that this is a hardware-specific issue and not a generic crash.

Please make sure to submit the fix as a maintenance update to Leap 15.2.

Thank you
Comment 13 Adrian Schröter 2020-08-14 12:30:04 UTC
Would it be helpful if I gave you my notebook?

I cannot really use it at the moment anyway .... (or only with the 15.1 kernel, but that causes some other problems).

I would also exchange it permanently for a comparable one if wanted :)
Comment 14 Takashi Iwai 2020-08-14 13:35:20 UTC
(In reply to Adrian Schröter from comment #13)
> Would it be helpful if I gave you my notebook?
> 
> I cannot really use it at the moment anyway .... (or only with the 15.1
> kernel, but that causes some other problems).
> 
> I would also exchange it permanently for a comparable one if wanted :)

Could you try the latest Leap 15.2 kernel in the OBS Kernel:openSUSE-15.2 repo?
I've backported quite a few i915 patches addressing lockups seen on other machines, and they reportedly worked there.

If the problem still persists, we might consider taking over the machine.
Comment 15 Adrian Schröter 2020-08-17 06:33:52 UTC
Using

kernel-default-5.3.18-lp152.100.1.g74fea2c.x86_64 

from there, I still run into a freeze a few seconds after boot.

I am still trying to get a trace using this kernel, but have little hope ...
Comment 16 Adrian Schröter 2020-09-27 13:10:18 UTC
Plenty of crashes later, on 5.9.0-rc2-2.g4229f31-default, I once saw a screen with a partial ioremap error dump. A "screenshot" is attached.

Note that the system sometimes runs stably, but then suddenly keeps crashing for a while. During such a period the system is nearly unable to boot. Switching it off and unplugging the power for 10 minutes often helps.
Comment 17 Adrian Schröter 2020-09-27 13:12:48 UTC
Created attachment 841987 [details]
picture of ioremap error report. The drawing errors were on screen.
Comment 18 Adrian Schröter 2020-11-13 08:02:12 UTC
kernel-default-5.10.rc3-1.1.ge72caa5.x86_64 from the Kernel:Head project changed the behaviour slightly, in that the crash/freeze now happens earlier.

I also updated the BIOS firmware to the latest version from last month, by the way.

I now boot the system into runlevel 3 for testing. It crashes immediately after initializing the framebuffer. The last line is

i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device

(typed manually, but checked for typos).

Afterwards the crash LED blinks for a few seconds until it stops, and the entire system stays frozen.

I have not found any way yet to get a kernel trace in this situation. Can you suggest one?

The Linux kernel 4.x from openSUSE 15.1 is still stable on that 15.2 installation (except that it seems to have suspend issues with that system, and after a power off I need to boot a 5.x kernel first and wait for the crash before I boot the 4.x one; otherwise the wlan has a high chance of not being initialized).

I would really appreciate any help getting my main workstation notebook into a usable state.
Comment 19 Adrian Schröter 2020-11-13 08:02:39 UTC
It has been our official company notebook model for some time, after all ...
Comment 20 Adrian Schröter 2020-11-13 09:22:16 UTC
After some more googling: can you confirm that this is most likely a long-standing issue with Intel firmware, and unlikely to get solved?

I would request new hardware then...


https://linuxreviews.org/Linux_Kernel_5.5_Will_Not_Fix_The_Frequent_Intel_GPU_Hangs_In_Recent_Kernels
Comment 21 Takashi Iwai 2020-11-16 13:12:39 UTC
Honestly speaking, it's hard to judge because of the lack of the crash information.  But this looks indeed like the same problem.

Did you try to exclude the firmware as pointed there?  You'd need to rebuild initrd after the removal.

It's unfortunate that our team (hardware-enablement) hasn't received the recent company-standard laptop for verification at all.  The last ones we received were some Skylake models a few years ago.  An employee may receive a workstation or a laptop for daily work, but nothing for the team's evaluation and debugging purposes.  (And of course we're not involved in the decision about which model to choose, either.)
Comment 22 Adrian Schröter 2020-11-16 13:28:45 UTC
Removing i915/kbl_dmc_ver1_04.bin and recreating the initrd does indeed solve the issue. At least I was even able to reach this web browser window.

Let's see if anything else breaks over time.

(/me is punishing himself for not trying that before ... thought the screen would stay black without firmware)
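
For anyone repeating the workaround from comments 21 and 22, here is a minimal Python sketch of the "move the DMC firmware aside" step. It assumes the firmware tree lives under /lib/firmware and that the Kaby Lake files start with `kbl_dmc` (matching the `i915/kbl_dmc_ver1_04.bin` named above); the function names and the backup directory are hypothetical. Moving rather than deleting makes it easy to restore the file later. The initrd still has to be rebuilt afterwards, which on openSUSE is typically done with `dracut -f` as root.

```python
import shutil
from pathlib import Path


def find_dmc_firmware(firmware_root, prefix="kbl_dmc"):
    """Return the sorted i915 DMC firmware files for one platform prefix.

    Matches e.g. i915/kbl_dmc_ver1_04.bin and also compressed variants
    such as i915/kbl_dmc_ver1_04.bin.xz, if the distribution ships them.
    """
    i915_dir = Path(firmware_root) / "i915"
    if not i915_dir.is_dir():
        return []
    return sorted(p for p in i915_dir.iterdir()
                  if p.is_file() and p.name.startswith(prefix))


def move_aside(paths, backup_dir):
    """Move the given files into backup_dir so they can be restored later."""
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in paths:
        target = backup / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Assumption: firmware lives under /lib/firmware on this system.
    found = find_dmc_firmware("/lib/firmware")
    for path in found:
        print(path)
    # Uncomment to actually move the files (needs root; hypothetical path):
    # move_aside(found, "/root/i915-dmc-backup")
```

Run as-is, the script only lists the matching files; nothing is changed until the `move_aside` call is enabled.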
Comment 23 Patrik Jakobsson 2020-11-16 19:46:13 UTC
If the DMC is causing issues you can also try the different i915.enable_dc module parameter settings (with the DMC firmware loaded). i915.enable_dc=0 should be equivalent to not loading the firmware (at least regarding the power states).

-Patrik
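
To check whether the setting suggested above is actually in effect, a small Python sketch like the following may help. It assumes the parameter was passed on the kernel command line as `i915.enable_dc=<n>` and reads the standard `/proc/cmdline` and `/sys/module/i915/parameters/enable_dc` locations; the helper names are hypothetical, and the sysfs file may only be readable by root.

```python
from pathlib import Path


def parse_enable_dc(cmdline):
    """Return the last i915.enable_dc value on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, raw = token.partition("=")
        if key == "i915.enable_dc" and sep:
            try:
                value = int(raw)
            except ValueError:
                value = None  # malformed value, treat as unset
    return value


def read_text_or_none(path):
    """Read a small procfs/sysfs file, returning None if it is unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    print("requested on command line:", parse_enable_dc(cmdline))
    # Only present when i915 is loaded; may require root to read.
    print("active in i915 module:",
          read_text_or_none("/sys/module/i915/parameters/enable_dc"))
```

If the first line prints `None`, the parameter was not on the boot command line for the running kernel.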
Comment 24 Adrian Schröter 2020-11-17 07:51:50 UTC
That parameter seems to help as well. At least it has been stable for almost one minute already :)
Comment 25 Adrian Schröter 2020-11-17 07:52:21 UTC
(with the firmware file restored and the initrd re-created, of course)