Bug 1224820 - Crash after resuming from suspend with Threadripper Pro/WRX90 chipset
Status: NEW
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Kernel
Version: Leap 15.6
Hardware: x86-64 Other
Importance: P5 - None : Critical
Target Milestone: ---
Assignee: openSUSE Kernel Bugs
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-21 21:38 UTC by Aaron Williams
Modified: 2024-06-14 15:55 UTC
CC List: 2 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
tiwai: needinfo? (aaron.w2)


Attachments

Description Aaron Williams 2024-05-21 21:38:03 UTC
I recently assembled a computer based on the ASUS WRX90E Sage motherboard and a 32-core AMD Threadripper PRO CPU. In the BIOS, I enabled 64-bit PCIe remapping support to allow the OS to remap devices to 64-bit address space.

For some reason, however, the system went into standby and could not recover. After I pressed Control+Alt+F10, the console showed errors about remapping the PCIe devices, followed by BTRFS failures from reading bad data, and then the system hung.

The NVMe drive is an old Intel PCIe card I had lying around that had seen only light use.

I strongly suspect compatibility issues with the motherboard chipset, since it is quite new. I am aware of only two motherboards that use the WRX90 chipset: the ASUS board and an ASRock board.

I could not capture the messages: the filesystem went read-only, and the console filled with BTRFS errors before I could capture the screen.

As for why it went into suspend mode to begin with, I have no idea. I know it did only because I had sshed into the box remotely and got this broadcast message:
Broadcast message from aaronw@localhost (Tue 2024-05-21 14:09:53 PDT):

The system will suspend now!

I will try disabling S4/S5 support in the BIOS to see if that helps, and I can also try disabling the 64-bit PCIe remapping.
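
For reference when comparing BIOS settings, here is a minimal sketch of how to check which suspend variant the kernel is currently set to use. It assumes the standard Linux sysfs file /sys/power/mem_sleep, which lists the available variants with the active one in square brackets (for example "s2idle [deep]"). The function names are hypothetical and not part of any tool mentioned in this report.

```python
from pathlib import Path


def parse_mem_sleep(text):
    """Parse the content of /sys/power/mem_sleep.

    The kernel lists the supported suspend variants separated by spaces
    and marks the active one with square brackets.
    Returns a tuple (active, available).
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_mem_sleep(path="/sys/power/mem_sleep"):
    """Read and parse the sysfs file (the path is the usual default)."""
    return parse_mem_sleep(Path(path).read_text())
```

Calling read_mem_sleep() on the affected machine before and after changing the BIOS options would show whether the kernel's selected suspend variant changed.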

System specs:
ASUS Pro WRX90E Sage SE EEB motherboard with latest BIOS firmware
AMD Threadripper Pro 7975WX CPU
Gigabyte GeForce RTX 4090 Windforce V2 24G graphics card
Intel NVMe drive
Kingston Fury Renegade Pro 128GB ECC Registered DDR5 (PC5 48000) memory.
Comment 1 Takashi Iwai 2024-05-23 13:45:45 UTC
It's difficult to judge what's happening without logs, unfortunately.
Let us know if you get more information / logs.  Thanks.
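
One possible way to collect such logs after the next failed resume is sketched below. It assumes the systemd journal is stored persistently, so that messages from the previous boot survive the reboot, and it assumes the messages reached the disk before the filesystem went read-only. The keyword list is a guess based on the symptoms in the description, and the function names are hypothetical.

```python
import subprocess

# Assumption: keywords chosen from the symptoms described in this report.
KEYWORDS = ("suspend", "resume", "pci", "btrfs", "nvme")


def filter_lines(lines, keywords=KEYWORDS):
    """Return the lines that contain any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]


def previous_boot_kernel_log():
    """Return the kernel messages of the previous boot as a list of lines.

    Runs: journalctl -k -b -1 --no-pager
    This raises an error if journalctl is missing or no previous boot
    is recorded in the journal.
    """
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def main():
    """Print only the previous-boot kernel lines that match a keyword."""
    for line in filter_lines(previous_boot_kernel_log()):
        print(line)
```

Running main() after rebooting from the hang prints the filtered lines; the unfiltered journalctl output is more useful to attach to the bug, since the filter may drop relevant context.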
Comment 2 Takashi Iwai 2024-06-14 15:55:53 UTC
Also, please try the kernel in the OBS Kernel:SLE15-SP6 repo:
  http://download.opensuse.org/repositories/Kernel:/SLE15-SP6/pool
It's the build from the latest SLE15-SP6 git branch.