Bug 1224820

Summary: Crash after resuming from suspend with Threadripper Pro/WRX90 chipset
Product: [openSUSE] openSUSE Distribution
Reporter: Aaron Williams <aaron.w2>
Component: Kernel
Assignee: openSUSE Kernel Bugs <kernel-bugs>
Status: NEW
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P5 - None
CC: aaron.w2, tiwai
Version: Leap 15.6
Flags: tiwai: needinfo? (aaron.w2)
Target Milestone: ---
Hardware: x86-64
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Aaron Williams 2024-05-21 21:38:03 UTC
I recently assembled a computer based on the ASUS WRX90E Sage motherboard and a 32-core AMD Threadripper PRO CPU. In the BIOS, I enabled 64-bit PCIe remapping support to allow the OS to remap devices to 64-bit address space.

For some odd reason, however, the system went into standby and could not recover. After I pressed Control+Alt+F10, the console showed errors about remapping the PCIe devices, followed by BTRFS failures from reading bad data, and then the system hung.

The NVMe drive is an old Intel PCIe card I had lying around that has seen only light use.

I strongly suspect there are compatibility issues with the new motherboard chipset since it is pretty new. The WRX90 chipset is only used in two motherboards I am aware of, the ASUS board and an ASRock board.

I could not capture the messages: the FS went read-only, and the console started spewing a ton of BTRFS errors before I could grab the screen.

As for why it went into suspend mode to begin with, I have no idea. I know it did because I had SSHed into the box remotely and got a broadcast message:
Broadcast message from aaronw@localhost (Tue 2024-05-21 14:09:53 PDT):

The system will suspend now!

I will try disabling S4/S5 support in the BIOS to see if that helps, and I can also try disabling the 64-bit PCIe remapping.

System specs:
ASUS Pro WRX90E Sage SE EEB motherboard with latest BIOS firmware
AMD Threadripper Pro 7975WX CPU
Gigabyte GeForce RTX 4090 Windforce V2 24G graphics card
Intel NVMe drive
Kingston Fury Renegade Pro 128GB ECC Registered DDR5 (PC5 48000) memory.

Comment 1 Takashi Iwai 2024-05-23 13:45:45 UTC
It's difficult to judge what's happening without logs, unfortunately.
Let us know if you get more information / logs.  Thanks.

Comment 2 Takashi Iwai 2024-06-14 15:55:53 UTC
Also, please try the kernel from the OBS Kernel:SLE15-SP6 repo:
  http://download.opensuse.org/repositories/Kernel:/SLE15-SP6/pool
It's the build from the latest SLE15-SP6 git branch.