Bug 1227363 - PM1743 vroc raid1 install SLE-15-SP6.os into emergency mode
Summary: PM1743 vroc raid1 install SLE-15-SP6.os into emergency mode
Status: RESOLVED FIXED
Alias: None
Product: PUBLIC SUSE Linux Enterprise Server 15 SP6
Classification: openSUSE
Component: Kernel
Version: unspecified
Hardware: x86-64 Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assignee: Kernel Bugs
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-07-04 03:16 UTC by Jiwei Sun
Modified: 2024-07-04 08:28 UTC
CC List: 4 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Jiwei Sun 2024-07-04 03:16:56 UTC
Steps to reproduce:
1. Create a VROC RAID1 volume from two PM1743 1.92TB drives.
2. Install SLE-15-SP6 on the RAID1 volume.
3. The installation completes, but after the reboot the system enters emergency mode.

Investigation:
We added "rd.udev.debug" to the kernel command line and found the following log:

  (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: Unable to get real path for '/sys/bus/pci/drivers/vmd/0000:c7:00.5/domain/device''
  (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: /dev/nvme1n1 is not attached to Intel(R) RAID controller.'
  (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: No OROM/EFI properties for /dev/nvme1n1'
  (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: no RAID superblock on /dev/nvme1n1.'
  (udev-worker)[2149]: nvme1n1: Process '/sbin/mdadm -I /dev/nvme1n1' failed with exit code 1.
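
For anyone triaging a long rd.udev.debug log, the relevant lines can be filtered with a small script. This is only a sketch: the line layout is assumed from the excerpt above, and the names ERR_LINE, mdadm_errors and SAMPLE are made up for illustration.

```python
import re

# Assumed layout, taken from the excerpt above:
#   (udev-worker)[PID]: DEV: 'CMD'(err) 'MSG'
ERR_LINE = re.compile(
    r"\(udev-worker\)\[(?P<pid>\d+)\]: (?P<dev>\S+): "
    r"'(?P<cmd>.+?)'\(err\) '(?P<msg>.*)'\s*$"
)


def mdadm_errors(log_text):
    """Return (device, message) pairs for every relayed stderr line."""
    found = []
    for line in log_text.splitlines():
        match = ERR_LINE.search(line)
        if match:
            found.append((match.group("dev"), match.group("msg")))
    return found


# Two lines copied from the log above: one stderr line, one exit-status line.
SAMPLE = """\
(udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: /dev/nvme1n1 is not attached to Intel(R) RAID controller.'
(udev-worker)[2149]: nvme1n1: Process '/sbin/mdadm -I /dev/nvme1n1' failed with exit code 1.
"""

for dev, msg in mdadm_errors(SAMPLE):
    print(dev, "->", msg)
```

Only the line carrying "(err)" is reported; the exit-status line is skipped because it does not follow the assumed layout.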

According to our analysis, the root cause is as follows.
After an NVMe disk is probed and added by the nvme driver, udevd runs
its rules, which invoke mdadm to detect whether an mdraid array is
associated with this NVMe disk. mdadm determines whether an NVMe device
is connected to a particular VMD domain by checking the domain symlink.
The failing sequence is:

Thread A                   Thread B             Thread mdadm
vmd_enable_domain
  pci_bus_add_devices
    __driver_probe_device
     ...
     work_on_cpu
       schedule_work_on
       : wakeup Thread B
                           nvme_probe
                           : wakeup scan_work
                             to scan nvme disk
                             and add nvme disk
                             then wakeup udevd
                                                : udevd executes
                                                  mdadm command
       flush_work                               main
       : wait for nvme_probe done                ...
    __driver_probe_device                        find_driver_devices
    : probe next nvme device                     : 1) Detect the domain
    ...                                            symlink; 2) Find the
    ...                                            domain symlink from
    ...                                            vmd sysfs; 3) The
    ...                                            domain symlink is not
    ...                                            created yet, failed
  sysfs_create_link
  : create domain symlink
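
The interleaving above can be reduced to a toy model that only tracks whether the link exists at the moment mdadm looks for it. This is not kernel code: the event names are illustrative labels, RACY is just one possible interleaving of the race, and REORDERED is an assumed way of closing it (creating the link before the devices are added), not a description of any specific patch.

```python
def mdadm_sees_domain(events):
    """Replay events in order; report whether the link exists when mdadm runs."""
    link_exists = False
    for event in events:
        if event == "create_domain_symlink":
            link_exists = True
        elif event == "mdadm_checks_symlink":
            return link_exists
    raise ValueError("mdadm never ran in this sequence")


# One interleaving from the diagram: adding the NVMe devices triggers udevd
# and mdadm before vmd_enable_domain reaches sysfs_create_link.
RACY = ["add_nvme_devices", "mdadm_checks_symlink", "create_domain_symlink"]

# Hypothetical reordering: the link exists before any device is announced.
REORDERED = ["create_domain_symlink", "add_nvme_devices", "mdadm_checks_symlink"]

print("racy order:     ", mdadm_sees_domain(RACY))
print("reordered order:", mdadm_sees_domain(REORDERED))
```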

sysfs_create_link is invoked at the end of vmd_enable_domain. However,
this implementation introduces a timing issue, where mdadm might fail
to retrieve the vmd symlink path because the symlink has not been
created yet.

For details, please refer to the following mailing list thread:
https://lore.kernel.org/linux-pci/20240603140329.7222-1-sjiwei@163.com/t/#u

Could you please backport the following patch to the SLE15-SP6 kernel?
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=vmd&id=7a13782e6150154abdf34ced3b733502275a16d1
Comment 1 Takashi Iwai 2024-07-04 08:28:28 UTC
Thanks for the report.

I have now backported the PCI fix to the SLE15-SP6 branch.
It has likely missed the upcoming July update, but will be included in a later one.