Bug 1213438

Summary: "green screen" after kernel-update to vmlinuz-5.14.21-150500.55.7-default
Product: [openSUSE] openSUSE Distribution Reporter: Günter Halt <guenter.halt>
Component: Kernel Assignee: openSUSE Kernel Bugs <kernel-bugs>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Critical    
Priority: P5 - None CC: aj, guenter.halt, tiwai
Version: Leap 15.5   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: hwinfo OS=openSuSELeap15.4
dmesg of PC OS=openSuSELeap15,4
ubname -a OS=openSuSELeap15.4
dmesg-clear+crash+dmesg
dmseg-report-after-crash
see filename
Report ps -Af before crash

Description Günter Halt 2023-07-18 20:19:29 UTC
After the kernel update and other package updates from 18.07.2023, the system boots.
The display manager shows the list of users and then, without any action, the display turns completely green (like the Windows blue screen).
The whole computer is frozen. No reaction to <ctrl><alt><delete>, no reaction to <ctrl><alt><F1>.

A second installation of Leap 15.5 exists on the same PC.
I copied the complete /boot directory from the second system to my work system.
Now it works; however, all private services have been removed from /etc/systemd/system.

vmlinuz is a link to vmlinuz-5.14.21-150500.55.7-default
initrd is a link to initrd-5.14.21-150500.55.7-default

However, uname -r shows 5.14.21-150500.53-default,
not 5.14.21-150500.55 ????
Comment 1 Lukas Ocilka 2023-07-19 06:27:33 UTC
Günter, thanks for the bugreport, but please read https://en.opensuse.org/openSUSE:How_to_Write_a_Good_Bugreport

At first sight, there is no real reason why this should be assigned to the AutoYaST component - IMO it looks like a mistake. The only information about what you were doing, and how, is "after kernel-update".
Comment 2 Takashi Iwai 2023-07-19 07:30:40 UTC
If it's an update, you must have an entry of the old kernel.  Please boot with it and verify that it's a kernel regression.

If so, please give the hwinfo and the dmesg outputs from the old good-working kernel, and attach them to Bugzilla.
Comment 3 Günter Halt 2023-07-19 08:58:31 UTC
1. Please forget this info: "...  all private services are removed from /etc/systemd/system."
After running the announced update in the second installation of Leap 15.5 (without the kernel update), the order in GRUB changed and the second installation was in first position; in this installation my private services were never installed.
Sorry!

2. Now neither installation of 15.5 can be used.
My last action was to start the announced update.

Maybe the kernel is not the reason.
If I press <ctrl><alt><F1> immediately after (maybe before) the graphical login menu appears, I can log in on the console.


vmlinuz-5.14.21-150500.55.7-default
 
On the second Leap 15.5, before the update (kernel update suspended), I found
vmlinuz ->  vmlinuz-5.14.21-150500.53-default

After reboot 
vmlinuz ->  vmlinuz-5.14.21-150500.55.7-default

Why? 

" If so, please give the hwinfo and the dmesg outputs from the old good-working kernel, and attach them to Bugzilla." 

No system is running with the old kernel. 

I will install a third Leap 15.5 from a USB stick and try to reproduce it.

Remark: for some days now, this info has appeared during boot: "AMD-Vi Firmware Bug IOAPIC[1] not in IVRS table"
Comment 4 Takashi Iwai 2023-07-19 09:43:00 UTC
Somehow you screwed up the boot partitions by two parallel installations, as it seems.  Maybe it's better not to install more, but rather have a single clean one.
Comment 5 Günter Halt 2023-07-19 12:14:20 UTC
"Maybe it's better not to install more, but rather have a single clean one."

OK. My strategy is to keep one installation for work and, when a new release appears, to install it on another partition. Before the new release becomes my work release, I make adaptations (own services), install other packages, and so on.
The old one is still present.
So I still have Leap 15.4.
I am writing this report on Leap 15.4.

Both Leap 15.5 installations are unusable after running the announced updates.

For both, the base ISO was openSUSE-Leap-15.5-DVD-x86_64-Build484.1-Media.iso

Now I have installed, using the same ISO, to another HDD on the same computer.

The same result. After "starting local service" the display manager shows the user name and then the system crashes (as if the halt command had been run); the monitor is just green.

I have no idea how to clean this up.
I will try to log in on the console before the crash (it is not so easy).
The difference is that the online repositories contain other packages.
If I repeat the third installation, I will install only from the stick.
Sorry for my English. I never learned it at school and now I am 78 years old.
Thank you.
Comment 6 Günter Halt 2023-07-19 13:34:36 UTC
3 files as attachments
1. hwinfo
2. dmesg
3. uname -a

From the well-running Leap 15.4 system.
Comment 7 Günter Halt 2023-07-19 13:37:14 UTC
Created attachment 868322 [details]
hwinfo OS=openSuSELeap15.4
Comment 8 Günter Halt 2023-07-19 13:39:01 UTC
Created attachment 868323 [details]
dmesg of PC OS=openSuSELeap15,4
Comment 9 Günter Halt 2023-07-19 13:40:35 UTC
Created attachment 868324 [details]
ubname -a OS=openSuSELeap15.4
Comment 10 Günter Halt 2023-07-20 19:14:51 UTC
New experiments:
1. Installation of Leap 15.5
    using openSUSE-Leap-15.5-DVD-x86_64-Build484.1-Media.iso
    no online repositories
    Xfce as window manager
    It works fine, no crash. I need Tcl/Tk

2.  Installation of Leap 15.5 (the same ISO)
    including online repositories
    choice: C++, Tcl/Tk
    Xfce as window manager
    the old home partition is mounted; my home is prepared with some Tcl/Tk tools
    some programs are started by Xfce autostart.

The display manager shows the user.
    After "Password" + Enter -> "green screen"     - dead
    only power off or reset is possible

Before the login attempt (known user), I can start a console (<ctrl><alt><F1>) and create a new user account (minimal, no autostart programs).

The system crashes, independent of the user logging in.

On the display manager it is possible to select icewm or xfce.

If icewm is selected, login is possible; selecting xfce -> crash.


Maybe the bad package is taken from an online repository.
Comment 11 Günter Halt 2023-07-20 19:21:21 UTC
Created attachment 868349 [details]
dmesg-clear+crash+dmesg
Comment 12 Takashi Iwai 2023-07-21 06:04:42 UTC
I guess this could be the same bug of i915 I fixed yesterday in bug 1213493.
Could you try the kernel in OBS home:tiwai:bsc1213493 repo?
  http://download.opensuse.org/repositories/home:/tiwai:/bsc1213493/pool/
for Leap 15.5?
Comment 13 Günter Halt 2023-07-21 06:56:23 UTC
OK, I will try it.
Please give me a guide for installing this kernel patch.
1. Which rpm is the right one for my AMD Athlon CPU?
2. rpm <downloaded-rpm-package> ? 
3. ??  
4. ??
Comment 14 Takashi Iwai 2023-07-21 07:03:08 UTC
Err, sorry, I mistakenly assumed your system had i915 graphics.  If it's amdgpu, then this must be a different problem.

Still, it might be worth trying, as it's built from the latest SLE15-SP5 kernel git branch.

In general, a package from an OBS project can be downloaded from
  http://download.opensuse.org/repositories/...
For OBS home:tiwai:bsc1213493, it's
  http://download.opensuse.org/repositories/home:/tiwai:/bsc1213493/
(note the slash added after each colon, i.e. ":" becomes ":/")
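
The ":" to ":/" mapping can be sketched as a small shell helper. This is only an illustration: the function name obs_download_url is hypothetical (it is not part of any OBS tool), and it assumes the download.opensuse.org/repositories/ layout described in this comment.

```shell
# Hypothetical helper (the name is an assumption, not an OBS tool):
# print the download URL for a given OBS project name.
obs_download_url() {
    project=$1
    # Insert a slash after every colon: "home:tiwai" -> "home:/tiwai"
    path=$(printf '%s' "$project" | sed 's|:|:/|g')
    printf 'http://download.opensuse.org/repositories/%s/\n' "$path"
}

obs_download_url home:tiwai:bsc1213493
```

For the project above, this prints the same URL that is quoted in this comment.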

Under subdirectory "pool/x86_64", you have kernel-default-$VERSION.rpm, kernel-default-extra-$VERSION.rpm and kernel-default-optional-$VERSION.rpm.
Download those files, and install them via zypper install.  You might need to pass --oldpackage option to zypper install, too.

Then reboot, choose the kernel in the GRUB menu, boot with it, and retest.
Comment 15 Günter Halt 2023-07-21 08:49:23 UTC
Sorry, kernel-install failed. My error!


zypper install kernel-default-5.14.21-150500.1.1.g8d27c97.x86_64.rpm
zypper install kernel-default-extra-5.14.21-150500.1.1.g8d27c97.x86_64.rpm 
zypper install kernel-default-optional-5.14.21-150500.1.1.g8d27c97.x86_64.rpm 

1. Are these the required packages?
2. How do I handle "force-install" with --oldpackage
   without a package name for the old package?

3.  Choose the kernel in the GRUB menu:
    edit /boot/grub2/grub.cfg?
    Is there a command for it?
    In yast -> bootloader I found nothing.
Comment 16 Takashi Iwai 2023-07-21 08:59:41 UTC
You need to pass --oldpackage to zypper install commands.  And install all three at once.  That is,
  zypper install --oldpackage kernel-default-5.14.21-150500.1.1.g8d27c97.x86_64.rpm kernel-default-extra-5.14.21-150500.1.1.g8d27c97.x86_64.rpm kernel-default-optional-5.14.21-150500.1.1.g8d27c97.x86_64.rpm 

For choosing a boot kernel at GRUB, choose "Advanced options..." by the cursor key, press RETURN.  Then you can choose a kernel to boot.  Select the right one (it may appear at the bottom) *WITHOUT* recovery mode, and press RETURN.
Comment 17 Günter Halt 2023-07-21 11:14:16 UTC
Now I think it runs with the new kernel.
uname -a : Linux hlt1-e 5.14.21-150500.1.g8d27c97-default #1 SMP PREEMPT_DYNAMIC Thu Jul 20 06:39:51 UTC 2023 (8d27c97) x86_64 x86_64 x86_64 GNU/Linux

ls -l /boot :

 initrd -> initrd-5.14.21-150500.1.g8d27c97-default
...
 vmlinuz -> vmlinuz-5.14.21-150500.1.g8d27c97-default

...........

The crash problem is not solved.
On the test installation, login using icewm is possible.
Login using xfce -> crash (completely down, "green screen")
Comment 18 Takashi Iwai 2023-07-21 12:19:14 UTC
OK, thanks.

The /boot/vmlinuz and /boot/initrd links can be ignored, as they aren't really used.  Just check whether both the corresponding /boot/vmlinuz-$VERSION and /boot/initrd-$VERSION are present.
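
That presence check might look like the following sketch. The function name check_kernel_files and its messages are made up for illustration; it only assumes the vmlinuz-$VERSION and initrd-$VERSION naming mentioned above.

```shell
# Hypothetical helper: verify that both the kernel image and the initrd
# for a given version exist in the given boot directory.
check_kernel_files() {
    bootdir=$1
    version=$2
    for f in "vmlinuz-$version" "initrd-$version"; do
        if [ ! -e "$bootdir/$f" ]; then
            echo "missing: $bootdir/$f"
            return 1
        fi
    done
    echo "ok: $version"
}
```

For example, check_kernel_files /boot "$(uname -r)" checks the files belonging to the currently running kernel.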

I'm building yet another kernel with more graphics fix backports.  The package is being built in the OBS home:tiwai:bsc1213438 repo.  Once the build finishes (it takes an hour or so), the packages will be available at
  http://download.opensuse.org/repositories/home:/tiwai:/bsc1213438/pool/

Please give it a try, too.  You can uninstall the previous test kernels by "zypper remove" with the exact version.  That is,

   zypper rm kernel-default-5.14.21-150500.1.1.g8d27c97

and it'll remove the given kernel (and *-extra and *-optional are removed together due to dependency).

Last but not least, when you test more kernel packages, it's safer to increase the number of installable kernels on your system beforehand.  Edit /etc/zypp/zypp.conf as root, and modify the line defining multiversion.kernels to add more items, e.g.

  multiversion.kernels = latest,latest-1,latest-2,latest-3,running

In the example above, it can keep up to 4+1 kernels simultaneously.
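
To double-check the edit, the setting can be read back from the file. A minimal sketch, assuming the "multiversion.kernels = ..." line format shown above; the function name show_multiversion_kernels is hypothetical, and if the key appears more than once the sketch simply prints the last uncommented occurrence.

```shell
# Hypothetical helper: print the value of the multiversion.kernels setting.
# The config path defaults to /etc/zypp/zypp.conf but can be overridden.
show_multiversion_kernels() {
    conf=${1:-/etc/zypp/zypp.conf}
    # Lines starting with "#" do not match the pattern and are skipped;
    # tail keeps only the last matching line.
    sed -n 's/^[[:space:]]*multiversion\.kernels[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}
```

Calling show_multiversion_kernels without an argument reads the system-wide file named above.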
Comment 19 Andreas Jaeger 2023-07-22 16:05:41 UTC
Takashi, just tried your test kernel after the system would not boot up with kernel-default-5.14.21-150500.55.7.1.x86_64, it seemed to crash while starting X server without any chance for me to grab any debug info.
With your new kernel (kernel-default-5.14.21-150500.3.1.g62ee467) everything works fine.

Since the released kernel breaks several amdgpu systems, I propose to release an update quickly and not wait for the August update.

# hwinfo --gfxcard
31: PCI 500.0: 0300 VGA compatible controller (VGA)             
  [Created at pci.386]
  Unique ID: Ddhb.uZbpCsxmrO5
  Parent ID: JZZT.nyyq4tDu6x8
  SysFS ID: /devices/pci0000:00/0000:00:08.1/0000:05:00.0
  SysFS BusID: 0000:05:00.0
  Hardware Class: graphics card
  Model: "ATI Picasso"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x15d8 "Picasso"
  SubVendor: pci 0x17aa "Lenovo"
  SubDevice: pci 0x5127 
  Revision: 0xd1
  Driver: "amdgpu"
  Driver Modules: "amdgpu"
  Memory Range: 0xc0000000-0xcfffffff (ro,non-prefetchable)
  Memory Range: 0xd0000000-0xd01fffff (ro,non-prefetchable)
  I/O Ports: 0x1000-0x1fff (rw)
  Memory Range: 0xd0500000-0xd057ffff (rw,non-prefetchable)
  IRQ: 50 (no events)
  Module Alias: "pci:v00001002d000015D8sv000017AAsd00005127bc03sc00i00"
  Driver Info #0:
    Driver Status: amdgpu is active
    Driver Activation Cmd: "modprobe amdgpu"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #25 (PCI bridge)

Primary display adapter: #31
Comment 20 Takashi Iwai 2023-07-22 16:48:10 UTC
OK, good to hear.  And yes, my fix backports have been already merged to SLE15-SP5 git branch for the next update.
Comment 22 Andreas Jaeger 2023-07-23 10:16:58 UTC
Sorry, I booted into the wrong kernel. So, I cannot confirm that the updated kernel fixes my problem. Before booting into the correct kernel, I need to fix the keys for booting. I'll try to do that tomorrow.
Comment 23 Andreas Jaeger 2023-07-23 10:29:26 UTC
Now, booted the right kernel - and it booted up fine but as soon as I connected my two external monitors, it crashed ;(
Comment 24 Takashi Iwai 2023-07-23 10:41:01 UTC
(In reply to Andreas Jaeger from comment #23)
> Now, booted the right kernel - and it booted up fine but as soon as I
> connected my two external monitors, it crashed ;(

Is it a kernel crash?  Or only about graphics / desktop?
Comment 25 Günter Halt 2023-07-23 16:59:07 UTC
"Is it a kernel crash?  Or only about graphics / desktop?"

It is not a kernel crash!!!

I tried an ssh connection from my laptop to the Leap 15.5 PC.
This connection stays stable after the crash.

The X server is running, and after some time the green display turns black and, if the mouse is moved, green again. Sorry, this behaviour was new to me.

Please see the attachment "dmseg-report-after-crash", made in the ssh session.
Comment 26 Günter Halt 2023-07-23 17:00:38 UTC
Created attachment 868384 [details]
dmseg-report-after-crash
Comment 27 Takashi Iwai 2023-07-24 06:22:55 UTC
(In reply to Günter Halt from comment #25)
> "Is it a kernel crash?  Or only about graphics / desktop?"
> 
> It is not a kernel crash!!!

That question was meant for Andreas.  His problem might be different from yours.

> The X server is running, and after some time the green display turns black
> and, if the mouse is moved, green again. Sorry, this behaviour was new to me.
> 
> Please see the attachment "dmseg-report-after-crash", made in the ssh session.

This doesn't look like a result from the test kernel mentioned in comment 18.
Have you tested with it?
Comment 28 Andreas Jaeger 2023-07-24 06:50:03 UTC
Takashi, let's discuss my problem in https://bugzilla.suse.com/show_bug.cgi?id=1213578 .
Comment 29 Günter Halt 2023-07-27 05:53:57 UTC
Hi, when can I expect a patch or other solution for the crash of Leap 15.5 on a PC
with AMD A10-9700 RADEON R7?


The crash of 15.5 happens after the recommended online update.

I am not sure: can I install the recommended updates in Leap 15.4?
15.4 runs stable.
Comment 30 Takashi Iwai 2023-07-27 06:10:38 UTC
(In reply to Günter Halt from comment #29)
> Hi, when can I expect a patch or other solution for the crash of Leap 15.5 on a PC
> with AMD A10-9700 RADEON R7?
> 
> 
> The crash of 15.5 happens after the recommended online update.

Is it really a "crash"?  Or does the kernel just spew warnings with stack traces?
Please try the kernel in OBS home:tiwai:bsc1213578-4 repo:
  http://download.opensuse.org/repositories/home:/tiwai:/bsc1213578-4/pool/

A known kernel Oops that happened on Andreas' machine was fixed there, at least.

If a crash still happens, please attach the kernel log showing it.

> I am not sure: can I install the recommended updates in Leap 15.4?
> 15.4 runs stable.

You can get it from OBS download URL.  The Leap 15.4 update kernel is found at
  http://download.opensuse.org/update/leap/15.4/sle/
Comment 31 Günter Halt 2023-07-27 12:39:11 UTC
kernel-default-5.14.21-150500.2.1.g8a1bd74.x86_64.rpm

Is this the right kernel?
----------------------

I have no idea what crashes.

If I install the latest 15.5 ISO:

First try: without online repositories. The kernel is
 5.14.21-150500.55.7-default
I can log in on the console and on the graphical desktop too.

It runs stable.


Second try: the same ISO, but with online repositories included
(due to the choice of C++, Tcl/Tk, KDE elements ...)
The same kernel is installed (5.14.21-150500.55.7-default)

If I press <ctrl><alt><F1> before the display manager pops up, I can log in on the console.
(Please wait for the dmesg report and the ps -Af report.)

When I change to the graphical display (<ctrl><alt><F7>), the login screen appears and in less than one second it crashes without any key press -> (green screen)

The difference between both installations is the display manager.

Is it necessary to install your changed kernel?
Comment 32 Günter Halt 2023-07-27 12:40:15 UTC
Created attachment 868458 [details]
see filename
Comment 33 Günter Halt 2023-07-27 12:41:43 UTC
Created attachment 868459 [details]
Report ps -Af before crash
Comment 34 Günter Halt 2023-07-28 19:38:56 UTC
Congratulations

Now I have installed kernel-default-5.14.21-150500.2.1.g8a1bd74.x86_64.rpm
on my test system.
It works without a crash.

To test it on my main system I need time. On Monday I will give the "green light" for success.

Best regards
Günter
Comment 35 Günter Halt 2023-07-29 07:08:05 UTC
OK, now my main system runs with
kernel-default-5.14.21-150500.2.1.g8a1bd74.x86_64.rpm
without problems.

But I must select it in the boot menu using Advanced...
In the main entry Leap 15.5, the kernel ...55.7 is used.

No big problem, I will wait for an officially released update.

Thank you.
Comment 36 Takashi Iwai 2023-07-29 07:12:57 UTC
Thanks, that's good news.  The fix should be included in the next update kernel; it may have missed the upcoming one, but it will be in the early August update.

Please reopen if you encounter the problem again.
Comment 37 Günter Halt 2023-08-03 13:26:42 UTC
Today (03.08.2023) some updates were available, including kernel
 5.14.21-150500.55.12-default

Fine, it runs!

Once again, thank you to everyone involved.