Bug 1215182 - SSD's not detected in AHCI mode on GA890GPA UD3M mainboard (AMD870/SB850 chipset)
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: openSUSE Kernel Bugs
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-10 13:17 UTC by Michael Sauereisen
Modified: 2023-09-14 11:29 UTC
CC List: 2 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
dmesg output from leap 15.5 live iso (95.26 KB, text/plain)
2023-09-11 10:25 UTC, Michael Sauereisen
dmesg output from tumbleweed live iso (88.16 KB, text/plain)
2023-09-11 10:26 UTC, Michael Sauereisen
hwinfo output from leap 15.5 live iso (709.02 KB, text/plain)
2023-09-11 10:27 UTC, Michael Sauereisen
hwinfo output from tumbleweed live iso (614.63 KB, text/plain)
2023-09-11 10:28 UTC, Michael Sauereisen
dmesg output after ahci-bind command (2.33 KB, text/plain)
2023-09-11 20:47 UTC, Michael Sauereisen
dmesg from 6.5.2-lp154.1 kernel (87.63 KB, text/plain)
2023-09-13 09:53 UTC, Michael Sauereisen
dmesg from 6.5.2-lp154.2 TW kernel (88.57 KB, text/plain)
2023-09-13 09:55 UTC, Michael Sauereisen

Description Michael Sauereisen 2023-09-10 13:17:33 UTC
I tried to install openSUSE-Tumbleweed-DVD-x86_64-Snapshot20230904-Media.iso, but the SSDs in my system were not detected. (The system currently runs Leap 15.3.)

Then I tried to narrow down the issue with openSUSE-Leap-15.5-KDE-Live-x86_64-Build10.117-Media and openSUSE-Tumbleweed-KDE-Live-x86_64-Snapshot20230906-Media.

Leap 15.5 did see the SSDs, and they could be mounted as expected.

lspci -vv -k provided this info about the SATA controller:
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40) (prog-if 01 [AHCI 1.0])
        Subsystem: Gigabyte Technology Co., Ltd GA-880GMA-USB3
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 32
        Interrupt: pin A routed to IRQ 28
        NUMA node: 0
        Region 0: I/O ports at ff00 [size=8]
        Region 1: I/O ports at fe00 [size=4]
        Region 2: I/O ports at fd00 [size=8]
        Region 3: I/O ports at fc00 [size=4]
        Region 4: I/O ports at fb00 [size=16]
        Region 5: Memory at fe02f000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] MSI: Enable+ Count=1/4 Maskable- 64bit+
                Address: 00000000fee02004  Data: 0025
        Capabilities: [70] SATA HBA v1.0 InCfgSpace
        Capabilities: [a4] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: ahci
        Kernel modules: ahci

When I started the Tumbleweed live ISO and ran the same lspci -vv -k, I got this output:

00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40) (prog-if 01 [AHCI 1.0])
        Subsystem: Gigabyte Technology Co., Ltd GA-78/880-series motherboard
        Flags: 66MHz, medium devsel, IRQ 19, NUMA node 0
        I/O ports at ff00 [size=8]
        I/O ports at fe00 [size=4]
        I/O ports at fd00 [size=8]
        I/O ports at fc00 [size=4]
        I/O ports at fb00 [size=16]
        Memory at fe02f000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [70] SATA HBA v1.0
        Capabilities: [a4] PCI Advanced Features

As you can see, no kernel driver or module is reported for the controller.
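
The comparison of the two lspci listings above can be automated. The following is only a sketch: the function name driver_in_use is made up for illustration, and it assumes the `lspci -k` layout shown in this report (device header at column 0, indented "Kernel driver in use:" line).

```python
import re

# Header lines start at column 0 with a PCI address such as "00:11.0"
# (optionally prefixed by a domain such as "0000:").
HEADER = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
BOUND = re.compile(r"^\s+Kernel driver in use:\s*(\S+)")


def driver_in_use(lspci_text):
    """Map each PCI address in `lspci -k` output to its bound driver, or None."""
    drivers = {}
    current = None
    for line in lspci_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            drivers[current] = None  # no driver seen yet for this device
            continue
        bound = BOUND.match(line)
        if bound and current is not None:
            drivers[current] = bound.group(1)
    return drivers
```

A device that maps to None, as 00:11.0 does in the Tumbleweed listing, has no driver bound.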

This is what I found in dmesg:

linux:/home/linux # dmesg|grep ahci
[    7.625748] ahci 0000:00:11.0: version 3.0
[    7.625909] ahci: probe of 0000:00:11.0 failed with error -12
[    7.636336] ahci 0000:06:00.0: AHCI 0001.0000 32 slots 2 ports 3 Gbps 0x3 impl SATA mode
[    7.636342] ahci 0000:06:00.0: flags: 64bit ncq pm led clo pmp pio slum part
[    7.636692] scsi host0: ahci
[    7.636868] scsi host1: ahci

Could it be that the old AMD870/SB850 chipset is no longer supported in the new kernel line?
Comment 1 Takashi Iwai 2023-09-11 09:12:48 UTC
Could you attach the full dmesg output from both the working and the non-working case?
Please also attach the hwinfo output from the working system.
Comment 2 Michael Sauereisen 2023-09-11 10:25:53 UTC
Created attachment 869419 [details]
dmesg output from leap 15.5 live iso
Comment 3 Michael Sauereisen 2023-09-11 10:26:41 UTC
Created attachment 869420 [details]
dmesg output from tumbleweed live iso
Comment 4 Michael Sauereisen 2023-09-11 10:27:20 UTC
Created attachment 869422 [details]
hwinfo output from leap 15.5 live iso
Comment 5 Michael Sauereisen 2023-09-11 10:28:41 UTC
Created attachment 869423 [details]
hwinfo output from tumbleweed live iso

dmesg and hwinfo outputs as requested
Comment 6 Takashi Iwai 2023-09-11 13:56:48 UTC
Thanks.

The bug comes from
[    7.636435] ioremap error for 0xfe02f000-0xfe030000, requested 0x2, got 0x0
which is followed by
[    7.636449] ahci: probe of 0000:00:11.0 failed with error -12
where -12 is -ENOMEM.
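
For reference, the mapping from the numeric code to its symbolic name can be checked with a short script. This is a sketch only: decode_probe_failure is a hypothetical helper, and it relies on the errno table of the host it runs on (on Linux, 12 is ENOMEM).

```python
import errno
import re

# Matches kernel messages of the form
# "<driver>: probe of <device> failed with error <code>".
PROBE_FAILED = re.compile(r"(\w+): probe of (\S+) failed with error (-?\d+)")


def decode_probe_failure(line):
    """Return (driver, device, errno name) for a probe-failure line, else None."""
    match = PROBE_FAILED.search(line)
    if match is None:
        return None
    driver, device, code = match.group(1), match.group(2), int(match.group(3))
    # The kernel reports negative errno values; look up the positive number.
    name = errno.errorcode.get(abs(code), "UNKNOWN")
    return driver, device, name
```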

I'm not sure whether it's a regression or just a matter of probing order.

On the TW system after boot, could you check the contents of /sys/bus/pci/drivers/ahci:
  % ls /sys/bus/pci/drivers/ahci/

Is 0000:00:11.0 missing there while 0000:06:00.0 is present?
If yes, try to re-bind 0000:00:11.0, e.g. run the following as root:
  % echo -n "0000:00:11.0" > /sys/bus/pci/drivers/ahci/bind
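
The same check-and-bind step could be scripted roughly as below. This is a sketch under assumptions: the helper names bound_devices and bind_if_missing are made up for illustration, the script must run as root to write to sysfs, and it uses only the sysfs driver directory and its "bind" file.

```python
import os
import re

# Full PCI address as listed in the driver directory, e.g. "0000:00:11.0".
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")
AHCI_DIR = "/sys/bus/pci/drivers/ahci"


def bound_devices(driver_dir=AHCI_DIR):
    """List PCI addresses currently bound to the driver."""
    return sorted(e for e in os.listdir(driver_dir) if PCI_ADDR.match(e))


def bind_if_missing(addr, driver_dir=AHCI_DIR):
    """Write addr to the driver's bind file unless it is already bound.

    Returns True if a bind was requested, False if addr was already bound.
    """
    if not PCI_ADDR.match(addr):
        raise ValueError("not a full PCI address: %r" % addr)
    if addr in bound_devices(driver_dir):
        return False
    with open(os.path.join(driver_dir, "bind"), "w") as bind_file:
        bind_file.write(addr)  # no trailing newline, like `echo -n`
    return True
```

For example, bind_if_missing("0000:00:11.0") would retry the probe for the SB850 controller. If the probe fails again, the write raises OSError.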
Comment 7 Michael Sauereisen 2023-09-11 20:47:39 UTC
Created attachment 869433 [details]
dmesg output after ahci-bind command
Comment 8 Michael Sauereisen 2023-09-11 20:54:29 UTC
Thanks for the quick response. And yes, you were right: 0000:00:11.0 was not found there, but 0000:06:00.0 is present.
After echo -n "0000:00:11.0" > /sys/bus/pci/drivers/ahci/bind
the AHCI initialisation worked, the SATA SSD was found, and I was able to mount it. See also the dmesg output.

So again you were right: it looks more like a probing issue.
Comment 9 Takashi Iwai 2023-09-12 06:53:53 UTC
Thanks for the quick testing.

It's an interesting case, and it likely happens because of a kernel config difference: TW has built-in SCSI drivers while SLE/Leap builds them as modules, hence the probe order differs between them.  So I guess this is not a recent regression but has been present on TW for a long time.
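
Whether a given kernel builds these drivers in or as modules can be read from its config file. The sketch below is illustrative only: driver_linkage is a hypothetical helper, the default symbol list is my guess at the relevant options, and the config text is assumed to come from a file such as /boot/config-<release>, whose location varies by distribution.

```python
def driver_linkage(config_text,
                   symbols=("CONFIG_SCSI", "CONFIG_BLK_DEV_SD",
                            "CONFIG_ATA", "CONFIG_SATA_AHCI")):
    """Report whether each kernel config symbol is built-in, a module, or unset."""
    states = {"y": "built-in", "m": "module"}
    result = {symbol: "not set" for symbol in symbols}
    for line in config_text.splitlines():
        # Lines look like "CONFIG_FOO=y"; "# CONFIG_FOO is not set" has no "=".
        key, sep, value = line.strip().partition("=")
        if sep and key in result:
            result[key] = states.get(value, value)
    return result
```

Running it on the config files of a Leap kernel and a TW kernel and comparing the two dictionaries would show the difference described above.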

Which system do you have on this machine right now?  Is it still Leap 15.3, or Leap 15.5?

I'm building test kernels in the OBS home:tiwai:bsc1215182 repo with modified kconfigs that make SCSI/AHCI modular.  The 6.5.2 kernel will be available for TW and for Leap 15.x in the "standard" and "backport" repositories, respectively.
Please install the test kernel that corresponds to your installed system, and check whether it boots.
Comment 10 Michael Sauereisen 2023-09-12 14:32:16 UTC
I am currently still running the 15.3 system, which causes another problem with your new kernel-default-6.5.2-1.1.gfd005b9.x86_64, which zypper offered after I added the repository:

kernel is in conflict with 'filesystem < 16',
currently on filesystem-15.0-11.8.1.x86_64.

I still have an old reiserfs partition on one of my disks. I guess it's about time for a migration first before moving on with an upgrade.
Comment 11 Takashi Iwai 2023-09-12 14:34:02 UTC
In that case, install the kernel from "backport" repo instead:
  http://download.opensuse.org/repositories/home:/tiwai:/bsc1215182/backport/
It's the build for Leap 15.x.
Comment 12 Michael Sauereisen 2023-09-12 15:12:52 UTC
Thanks for the quick solution. The kernel from the "backport" repo works, and the SATA SSD with AHCI works as well!

The only modification I had to make was to reinstall the old Realtek ethernet package for r8169/r8168 (I always keep that one in my download directory, so no real trouble).
Comment 13 Takashi Iwai 2023-09-12 15:28:56 UTC
Good to hear.  So that's the workaround.

I'll need to ping kernel devs about the change of kconfig.  Let's see.
Comment 14 Takashi Iwai 2023-09-12 15:30:46 UTC
Oh, and just to make sure: could you install another kernel from OBS Kernel:stable:Backport repo?
  http://download.opensuse.org/repositories/Kernel:/stable:/Backport/standard/

This kernel is built from the equivalent code, but with the same config as TW.  That means this kernel should fail to probe the SSD.  This is just to confirm the negative result.
Comment 15 Michael Sauereisen 2023-09-12 18:26:29 UTC
OK, here it comes: both kernels are working, even the vmlinux-6.5.2-lp154.2.gfdde566-default.xz from the stable backport repo that you suggested.

So I started the TW live ISO again, with the same issue of course. I checked which kernel this ISO has: 6.4.12-1.

Then I saw that about 125 software updates were offered, and I was curious what they included. And yes, a 6.5.2 kernel was among them.

For now my conclusion is that somebody has already fixed the issue with the kconfig in the new 6.5.2 kernel.

The best option for me is to postpone my upgrade until later in October and give it another try. I'll leave it up to you to follow up with the kernel devs.

Thank you very much for your support.
Comment 16 Takashi Iwai 2023-09-12 19:24:07 UTC
OK, that's an interesting result.
To be sure, could you attach the dmesg outputs from both kernels?
Comment 17 Michael Sauereisen 2023-09-13 09:53:05 UTC
Created attachment 869472 [details]
dmesg from 6.5.2-lp154.1 kernel
Comment 18 Michael Sauereisen 2023-09-13 09:55:15 UTC
Created attachment 869473 [details]
dmesg from 6.5.2-lp154.2 TW kernel
Comment 19 Takashi Iwai 2023-09-14 11:29:42 UTC
Thanks, both look OK, indeed.

I'm not sure what the cause and the fix were, but it seems to work, so let's close.

FWIW, I got positive reactions to my proposal to change the kernel config to build SCSI/AHCI as modules, and I'm going to apply it to the TW kernel.