Bug 1213537

Summary: myrb / myrs drivers cause hang on boot when using a Mylex AcceleRAID 170 (DAC960)
Product: [openSUSE] openSUSE Distribution
Reporter: Mike Edwards <medwards-suse>
Component: Kernel
Assignee: Hannes Reinecke <hare>
Status: RESOLVED NORESPONSE
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: lduncan, medwards-suse, tiwai
Version: Leap 15.5
Flags: hare: needinfo? (medwards-suse)
Target Milestone: ---
Hardware: x86-64
OS: openSUSE Leap 15.5
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: dmesg

Description Mike Edwards 2023-07-21 00:16:51 UTC
Created attachment 868350
dmesg

Booting openSUSE Leap 15.5 fails on this machine. Shortly after the myrb and myrs drivers load and detect the AcceleRAID 170 and the drives connected to it, booting simply stops. After a few minutes the kernel prints warnings about hung processes, but the boot never completes.

Adding modprobe.blacklist=myrb and modprobe.blacklist=myrs to the kernel command line lets the boot complete, though obviously without the AcceleRAID card in use.

Once booted, I was able to enable dyndbg for these two drivers:
# for d in myr{s,b}; do echo 'file drivers/scsi/${d}.c +p' > /sys/kernel/debug/dynamic_debug/control; done
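
One thing to check in the command above: the single quotes stop the shell from expanding `${d}`, so the literal string would be written to the control file. The `=_` flags in the listing that follows would be consistent with no debug output being enabled. A minimal sketch of the same loop with double quotes, assuming debugfs is mounted at /sys/kernel/debug; the function names `dyndbg_queries` and `apply_dyndbg` are hypothetical and not part of the original report:

```shell
# Print one dynamic-debug query per driver.
# Double quotes (not single quotes) let the shell expand ${d}.
dyndbg_queries() {
    for d in myrs myrb; do
        echo "file drivers/scsi/${d}.c +p"
    done
}

# Write each query to the given control file, one write per query.
# On a real system this needs root and a mounted debugfs.
apply_dyndbg() {
    control="$1"
    dyndbg_queries | while IFS= read -r query; do
        echo "$query" > "$control"
    done
}

# Usage (as root):
#   apply_dyndbg /sys/kernel/debug/dynamic_debug/control
```

After this, the matching entries in the control file should show `=p` instead of `=_`.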

Loading the drivers manually caused the modprobe for myrs to hang (myrb loaded without issue), leading me to believe the problem lies within the myrs driver.


dyndbg of myrb + myrs:

drivers/scsi/myrs.c:2114 [myrs]myrs_monitor =_ "monitor tick\012"
drivers/scsi/myrs.c:1822 [myrs]myrs_slave_alloc =_ "Logical device mapping %d:%d:%d -> %d\012"
drivers/scsi/myrs.c:471 [myrs]myrs_get_fwstatus =_ "Sending GetHealthStatus\012"
drivers/scsi/myrs.c:338 [myrs]myrs_get_pdev_info =_ "Sending GetPhysicalDeviceInfoValid for pdev %d:%d:%d\012"
drivers/scsi/myrs.c:249 [myrs]myrs_get_ldev_info =_ "Sending GetLogicalDeviceInfoValid for ldev %d\012"
drivers/scsi/myrs.c:189 [myrs]myrs_get_ctlr_info =_ "Sending GetControllerInfo\012"
drivers/scsi/myrb.c:2445 [myrb]myrb_monitor =_ "reschedule monitor\012"
drivers/scsi/myrb.c:2436 [myrb]myrb_monitor =_ "new enquiry\012"
drivers/scsi/myrb.c:2432 [myrb]myrb_monitor =_ "get background init status\012"
drivers/scsi/myrb.c:2427 [myrb]myrb_monitor =_ "get consistency check progress\012"
drivers/scsi/myrb.c:2421 [myrb]myrb_monitor =_ "get rebuild progress\012"
drivers/scsi/myrb.c:2415 [myrb]myrb_monitor =_ "get logical drive info\012"
drivers/scsi/myrb.c:2409 [myrb]myrb_monitor =_ "get rebuild progress\012"
drivers/scsi/myrb.c:2403 [myrb]myrb_monitor =_ "get error table\012"
drivers/scsi/myrb.c:2397 [myrb]myrb_monitor =_ "get event log no %d/%d\012"
drivers/scsi/myrb.c:2390 [myrb]myrb_monitor =_ "monitor tick\012"
drivers/scsi/myrb.c:2360 [myrb]myrb_handle_scsi =_ "Device nonresponsive\012"
drivers/scsi/myrb.c:2355 [myrb]myrb_handle_scsi =_ "Attempt to Access Beyond End of Logical Drive"
drivers/scsi/myrb.c:2350 [myrb]myrb_handle_scsi =_ "Logical Drive Nonexistent or Offline"
drivers/scsi/myrb.c:2331 [myrb]myrb_handle_scsi =_ "Bad Data Encountered\012"
drivers/scsi/myrb.c:1699 [myrb]myrb_pdev_slave_alloc =_ "slave alloc pdev %d:%d state %x\012"
drivers/scsi/myrb.c:1693 [myrb]myrb_pdev_slave_alloc =_ "device not present, skip\012"
drivers/scsi/myrb.c:1687 [myrb]myrb_pdev_slave_alloc =_ "Failed to get device state, status %x\012"
drivers/scsi/myrb.c:1639 [myrb]myrb_ldev_slave_alloc =_ "slave alloc ldev %d state %x\012"
drivers/scsi/myrb.c:1438 [myrb]myrb_ldev_queuecommand =_ "ldev %u in state %x, skip\012"
drivers/scsi/myrb.c:1414 [myrb]myrb_read_capacity =_ "Capacity %u, blocksize %u\012"


See attachment for dmesg output.  myrs was loaded around timestamp 16372.039529.

Comment 1 Takashi Iwai 2023-07-21 06:06:52 UTC
Something for the storage team.  Reassigned.

Comment 2 Hannes Reinecke 2023-07-21 06:11:33 UTC
Hmm. Once you have enabled dyndbg, you should see some output in the normal kernel message log.
Can you force a crashdump once the system is hung?
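
For reference, one common way to force a crash dump while a shell is still usable (for example, after the manual modprobe of myrs has hung) is the magic SysRq trigger. This is a minimal sketch, assuming kdump is already configured to capture the dump; `force_crashdump` is a hypothetical helper name and takes the two procfs paths explicitly so that it cannot be run by accident:

```shell
# Hypothetical helper: panic the kernel via magic SysRq so that kdump
# (assumed to be set up already) can capture a crash dump.
# Both paths must be passed explicitly.
force_crashdump() {
    sysrq_enable="${1:?path to sysrq enable file required}"
    sysrq_trigger="${2:?path to sysrq trigger file required}"

    # Enable all magic SysRq functions.
    echo 1 > "$sysrq_enable"
    # 'c' crashes the kernel immediately; the machine goes down here.
    echo c > "$sysrq_trigger"
}

# Usage (as root, on the affected machine):
#   force_crashdump /proc/sys/kernel/sysrq /proc/sysrq-trigger
```

If the console is the only thing still responding, the Alt+SysRq+C key combination triggers the same crash.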

Comment 3 Hannes Reinecke 2024-04-30 15:55:40 UTC
Closing, no response.