Bugzilla – Bug 145759
Installation not possible on SCSI-system; Module aic7xxx
Last modified: 2006-02-20 11:02:58 UTC
After selecting "Installation" in the boot menu of CD1 of Beta2, the installation starts, but after the entry "starting yast" nothing else happens. (On Beta1 it got a bit further, very slowly, but stopped at the latest when writing the partition table.)

Hardware: Dual Pentium III

Output of lspci:
--------
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03)
00:04.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
00:04.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:04.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
00:04.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:06.0 SCSI storage controller: Adaptec AHA-2940U2/U2W / 7890/7891
00:0a.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895 (rev 01)
00:0b.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
00:0c.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
01:00.0 VGA compatible controller: nVidia Corporation NV4 [RIVA TNT] (rev 04)
--------

The dmesg output repeatedly contains the following entries:
--------
sr 0:0:6:0: Device is active, asserting ATN
Recovery code sleeping
Recovery code awake
Timer Expired
aic7xxx_abort returns 0x2003
sd 0:0:0:0: Attempting to queue an ABORT message
CDB: 0x28 0x0 0x0 0x0 0x0 0x40 0x0 0x0 0x80 0x0
scsi0: At time of recovery, card was not paused
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi0: Dumping Card State in Command phase, at SEQADDR 0xb8
Card was paused
ACCUM = 0x80, SINDEX = 0xa0, DINDEX = 0xe4, ARG_2 = 0x2
HCNT = 0xc SCBPTR = 0x7
SCSISIGI[0x54]:(BSYI|ATNI|IOI) ERROR[0x0] SCSIBUSL[0x0]
LASTPHASE[0x80]:(CDI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x15]:(SINGLE_EDGE)
SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x0] SSTAT0[0x0]
SSTAT1[0x3]:(REQINIT|PHASECHG) SSTAT2[0x50]:(EXP_ACTIVE|SHVALID)
SSTAT3[0x8] SIMODE0[0x8]:(ENSWRAP)
SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
SXFRCTL0[0x80]:(DFON) DFCNTRL[0x24]:(DIRECTION|SCSIEN)
DFSTATUS[0x80]:(PRELOAD_AVAIL)
STACK: 0x0 0x167 0x17d 0x35
SCB count = 12
Kernel NEXTQSCB = 5
Card NEXTQSCB = 4
QINFIFO entries: 4
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 6 5 4 3 2 0 1 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Sequencer SCB Info:
0 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
1 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
2 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
3 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
4 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
5 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
6 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
7 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x67] SCB_LUN[0x0] SCB_TAG[0xb]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
16 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
17 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
18 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
19 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
20 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
21 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
22 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
23 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
24 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
25 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
26 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
27 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
28 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
29 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
30 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
31 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
Pending list:
4 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x0]
11 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x67] SCB_LUN[0x0]
Kernel Free SCB list: 6 0 3 7 2 1 10 9 8
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
scsi0:0:0:0: Cmd aborted from QINFIFO
aic7xxx_abort returns 0x2002
sd 0:0:0:0: Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
scsi0: At time of recovery, card was not paused
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi0: Dumping Card State in Command phase, at SEQADDR 0xb8
Card was paused
ACCUM = 0x80, SINDEX = 0xa0, DINDEX = 0xe4, ARG_2 = 0x2
HCNT = 0xc SCBPTR = 0x7
SCSISIGI[0x54]:(BSYI|ATNI|IOI) ERROR[0x0] SCSIBUSL[0x0]
LASTPHASE[0x80]:(CDI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x15]:(SINGLE_EDGE)
SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x0] SSTAT0[0x0]
SSTAT1[0x3]:(REQINIT|PHASECHG) SSTAT2[0x50]:(EXP_ACTIVE|SHVALID)
SSTAT3[0x8] SIMODE0[0x8]:(ENSWRAP)
SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
SXFRCTL0[0x80]:(DFON) DFCNTRL[0x24]:(DIRECTION|SCSIEN)
DFSTATUS[0x80]:(PRELOAD_AVAIL)
STACK: 0x0 0x167 0x17d 0x35
SCB count = 12
Kernel NEXTQSCB = 4
Card NEXTQSCB = 5
QINFIFO entries: 5
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
...
--------

Notes: Another machine with an Adaptec controller had the same issues. Choosing safe settings etc. doesn't make a difference. Installation of SLES9 on the same hardware is no problem.
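For readers decoding the two CDB lines in the dmesg output above: the first byte of a SCSI CDB is the opcode, where 0x28 is READ(10) and 0x00 is TEST UNIT READY. The following is a minimal Python sketch, not part of the original report; the function name decode_cdb and its two-entry opcode table are illustrative only.

```python
# Illustrative helper (not from the bug report): decode the "CDB:" lines
# that the aic7xxx abort handler prints to dmesg.

OPCODES = {
    0x00: "TEST UNIT READY",
    0x28: "READ(10)",
}

def decode_cdb(line):
    """Parse a 'CDB: 0x28 0x0 ...' line into a small dict."""
    # Skip the leading "CDB:" token and convert the remaining hex bytes.
    data = bytes(int(tok, 16) for tok in line.split()[1:])
    info = {"opcode": OPCODES.get(data[0], hex(data[0])), "length": len(data)}
    if data[0] == 0x28 and len(data) == 10:
        # READ(10): bytes 2-5 hold the big-endian LBA, bytes 7-8 the block count.
        info["lba"] = int.from_bytes(data[2:6], "big")
        info["blocks"] = int.from_bytes(data[7:9], "big")
    return info

first = decode_cdb("CDB: 0x28 0x0 0x0 0x0 0x0 0x40 0x0 0x0 0x80 0x0")
second = decode_cdb("CDB: 0x0 0x0 0x0 0x0 0x0 0x0")
print(first)
print(second)
```

Read this way, the first aborted command is a 128-block read starting at LBA 64, and the second is a TEST UNIT READY.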
Have you tried aic7xxx_old?
No - what do I have to enter at the boot prompt to use that module?
Boot with manual=1 and select aic7xxx_old from the module list. Afterwards, also check the list of loaded modules to confirm that aic7xxx_old was loaded and aic7xxx was not.
aic7xxx_old worked to some extent. After unloading aic7xxx and loading aic7xxx_old, YaST started up okay and I could partition the hard disk etc. Software installation proceeded as expected, but the machine didn't start up after reboot; I was offered only a shell prompt. I started the installation again (as the rescue system didn't seem to work) and modified the initrd (changed to console 2, mounted the root partition, chrooted to it, edited /etc/sysconfig/kernel to include aic7xxx_old, called mkinitrd, exited, unmounted the root partition, reset). This helped a bit: the machine found the root partition and mounted it. However, a little later there was a kernel panic; the output looked to me as if it tried to load aic7xxx anyway and then failed. (But maybe aic7xxx_old also uses "aic7xxx" in its output, so it could also have been the aic7xxx_old module.)
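The edit to /etc/sysconfig/kernel described above amounts to changing the INITRD_MODULES line before running mkinitrd. Below is a minimal Python sketch of that one step, not taken from the report: the helper name swap_initrd_module and the sample module list ("aic7xxx reiserfs") are made up for illustration, and the actual file on the affected machine may have listed other modules.

```python
import re

def swap_initrd_module(sysconfig_text, old, new):
    """Return the sysconfig text with `old` replaced by `new` in INITRD_MODULES.

    If `old` is not listed, `new` is appended; if `new` is already
    listed, the line is left unchanged.
    """
    def rewrite(match):
        modules = match.group(2).split()
        if new in modules:
            return match.group(0)
        if old in modules:
            modules = [new if m == old else m for m in modules]
        else:
            modules.append(new)
        return match.group(1) + " ".join(modules) + match.group(3)

    return re.sub(r'^(INITRD_MODULES=")([^"]*)(")', rewrite,
                  sysconfig_text, flags=re.M)

# Hypothetical file content, for illustration only.
sample = 'INITRD_MODULES="aic7xxx reiserfs"\n'
result = swap_initrd_module(sample, "aic7xxx", "aic7xxx_old")
print(result, end="")
```

The new initrd only picks up the change once mkinitrd is run inside the chroot, as described in the comment.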
Any chance of capturing the panic?
The last messages on the screen (typed by hand, so they may contain typos):

aic7xxx: PCI0:6:0 Mem Region 0xdd800000 unavailable. Cannot memory map device
aic7xxx: PCI0:6:0 IO Region 0xdd000[0..255] unavailable. Cannot map device
aic7xxx: probe of 0000:00:06.0 failed with error -12
(scsi0) BRKADRINT error (0xff)
Illegal Host Access
Illegal Sequence Address referenced
Illegal Opcode in sequencer program
Sequencer Ram Parity Error
Data Path Ram Parity Error
Scratch Ram/SCB Array Ram Parity Error
PCI Error detected
(scsi0) SEQADDR=0xff
Kernel panic - not syncing: aic7xxx unrecoverable BRKADRINT

(Note: this is not from the dual-processor machine, but from the other one. I don't know if that makes any difference.)
(scsi0) SEQADDR=0xff above should read (scsi0) SEQADDR=0x1ff
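The "failed with error -12" in the probe message is a negated Linux errno value. A quick way to look it up, sketched in Python (not part of the original comment; the helper name is illustrative, and the sketch assumes a Linux host, where errno 12 is ENOMEM):

```python
import errno
import os

def describe_kernel_error(code):
    """Map a negative kernel return code such as -12 to its errno name and text."""
    number = abs(code)
    name = errno.errorcode.get(number, "unknown")
    return name, os.strerror(number)

name, text = describe_kernel_error(-12)
print(name, "-", text)
```

ENOMEM would fit the preceding "Region ... unavailable" lines, which report that the controller's memory and I/O regions could not be mapped.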
Hannes: any ideas?
Created attachment 66055: dmesg output

Encountered this bug too. The system (PIII, SCSI) worked with all previous distros up to 10.0. Evidently this bug is not all that rare. I piped the dmesg output to the attached file.
Yeah, I know. aic7{x,9}xx are notoriously bad at error recovery. Quite surprising, given the amount of code involved :-( Anyway, if you want to use aic7xxx_old, be sure to enter an alias in /etc/modprobe.conf; otherwise udev will load the wrong driver for the adapter. I'll see what I can do to fix this.
Created attachment 66313: aic7xxx-disable-tcq-fix

Setting default queue depth to '1' instead of '2' if tagged queueing is disabled.
I have found an error (or at least a questionable behaviour) in the aic7xxx driver; it sets the queue depth to '2' if tagged queueing is disabled. Hence the midlayer might send two commands simultaneously which can confuse the driver and / or the device in question. If you have the means to compile a kernel yourself please test the above patch. Otherwise please wait for Beta4 which will include the patch. Oh, and please post the results. Your patience is very much appreciated here, as the aic7xxx driver occasionally moves in mysterious ways.
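The reasoning above can be illustrated with a toy model. This is a Python sketch of the idea only, not the aic7xxx driver code and not the attached patch; the function names and the tagged depth of 32 are hypothetical.

```python
def default_queue_depth(tagged_queueing, tagged_depth=32, patched=True):
    """Queue depth a driver might advertise to the SCSI midlayer.

    Toy model: with tagged queueing the device may hold `tagged_depth`
    commands. Without it, the behaviour described above advertised 2;
    the patched behaviour advertises 1.
    """
    if tagged_queueing:
        return tagged_depth
    return 1 if patched else 2

def max_in_flight(queue_depth, pending_commands):
    """Number of commands the midlayer may have outstanding at once."""
    return min(queue_depth, pending_commands)

# Untagged device with three commands waiting in the midlayer:
unpatched = max_in_flight(default_queue_depth(False, patched=False), 3)
patched = max_in_flight(default_queue_depth(False, patched=True), 3)
print("unpatched:", unpatched)
print("patched:", patched)
```

In this model the unpatched depth lets two commands reach an untagged device at once, while the patched depth keeps it to one.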
Well, actually I've found some more errors (bugzilla #148061). A fix will be in Beta4. Can you please re-test and check whether the problem is fixed?
Sun W2100z dual Opteron boxes -- same issue with 10.1 Beta2 and Beta3 (64-bit versions). I booted the mini ISO: same issue. I tried CD1 with manual=1 -- once I loaded the aic7* module, the box hung. Actually, the 32-bit version of Beta3 almost worked, but not quite. I'll try the aic*_old module and see what happens.
More details:

13:04.0 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
        Subsystem: Sun Microsystems Computer Corp.: Unknown device 534d
        Flags: bus master, 66MHz, slow devsel, latency 72, IRQ 169
        I/O ports at 4400 [disabled] [size=256]
        Memory at 00000000ea000000 (64-bit, non-prefetchable) [size=8K]
        I/O ports at 4000 [disabled] [size=256]
        [virtual] Expansion ROM at 00000000e0200000 [disabled] [size=512K]
        Capabilities: [dc] Power Management version 2
        Capabilities: [a0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [94] PCI-X non-bridge device

13:04.1 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
        Subsystem: Sun Microsystems Computer Corp.: Unknown device 534d
        Flags: bus master, 66MHz, slow devsel, latency 72, IRQ 177
        I/O ports at 4c00 [disabled] [size=256]
        Memory at 00000000ea010000 (64-bit, non-prefetchable) [size=8K]
        I/O ports at 4800 [disabled] [size=256]
        [virtual] Expansion ROM at 00000000e0280000 [disabled] [size=512K]
        Capabilities: [dc] Power Management version 2
        Capabilities: [a0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [94] PCI-X non-bridge device
I tried aic7xxx_old. It gives an error: -1 No such device. I tried loading the aic* modules in a different order. I'm not sure this is significant, but first I load amd74xx (for IDE), then I load aic7xxx. When I then load aic79xx, I get something like: BUG spinlock recursion swapper/0 spinlock lockup. I was able to replicate this several times (I just don't have a console to capture the output).
Tom, do you have the same problem?
Sorry, no time to test this now. Machine is in Room 3.1.9.
Thomas' machine works with KOTD.
I had another try (or rather, several tries) with Beta 3.91; however, the installation failed each time with

/usr/lib/YaST2/startup/YaST2.call: line 301: 2380 Aborted (core dumped) Y2base "Y2_MODULE_NAME" $Y2_MODE_FLAGS $Y2_MODULE_ARGS $Y2_MODE $Y2_QT_ARGS

y2log says:

[DEFINE_LOGGROUP] Exception.cc(log):83 MediaCD.cc(MediaCD):65 THROW Unsupported scheme in the URL cd:/?devices=/dev/sr0

As I didn't see any error messages like those in my first description, and as it happened with both aic7xxx and aic7xxx_old, I suspect that it is not related to the original issue. In any event, because of the above I cannot find out whether the matter has been fixed or not.
Installing Beta4 on the hardware where I first observed the bug now went well. Thank you for fixing this bug!
Ah. Good. Finally.