Bug 145147 - hald cdrom polling causes scsi errors
Summary: hald cdrom polling causes scsi errors
Status: VERIFIED FIXED
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: Basesystem (show other bugs)
Version: Beta 1
Hardware: Other Other
: P5 - None : Normal (vote)
Target Milestone: ---
Assignee: Danny Al-Gaaf
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-24 12:14 UTC by Ludwig Nussel
Modified: 2007-06-05 11:17 UTC (History)
0 users

See Also:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
SG_IO variant of the test program (1.05 KB, text/x-csrc)
2006-01-26 19:41 UTC, Jens Axboe
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ludwig Nussel 2006-01-24 12:14:23 UTC
When inserting a disc into the scsi drive /var/log/messages fills with lots of messages:

Jan 24 12:53:39 g241 klogd: scsi0:A:6: Missed busfree. Lastphase = 0xe0, Curphase = 0xe4
Jan 24 12:53:39 g241 klogd:  target0:0:6: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
Jan 24 12:53:39 g241 klogd: sr 0:0:6:0: Attempting to queue an ABORT message
Jan 24 12:53:39 g241 klogd: CDB: 0x4a 0x1 0x0 0x0 0x10 0x0 0x0 0x0 0x8 0x0 0x0 0x0
Jan 24 12:53:39 g241 klogd: sr 0:0:6:0: Command not found
Jan 24 12:53:39 g241 klogd: aic7xxx_abort returns 0x2002
Jan 24 12:53:41 g241 klogd: scsi0:A:6: Missed busfree. Lastphase = 0xe0, Curphase = 0xe4
Jan 24 12:53:41 g241 klogd: sr 0:0:6:0: Attempting to queue an ABORT message
Jan 24 12:53:41 g241 klogd: CDB: 0x4a 0x1 0x0 0x0 0x10 0x0 0x0 0x0 0x8 0x0 0x0 0x0
Jan 24 12:53:41 g241 klogd: sr 0:0:6:0: Command not found
Jan 24 12:53:41 g241 klogd: aic7xxx_abort returns 0x2002
Jan 24 12:53:43 g241 klogd: scsi0:A:6: Missed busfree. Lastphase = 0xe0, Curphase = 0xe4
Jan 24 12:53:43 g241 klogd: sr 0:0:6:0: Attempting to queue an ABORT message
Jan 24 12:53:43 g241 klogd: CDB: 0x4a 0x1 0x0 0x0 0x10 0x0 0x0 0x0 0x8 0x0 0x0 0x0
Jan 24 12:53:43 g241 klogd: scsi0: At time of recovery, card was not paused
Jan 24 12:53:43 g241 klogd: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Jan 24 12:53:43 g241 klogd: scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x16f
Jan 24 12:53:43 g241 klogd: Card was paused
Jan 24 12:53:43 g241 klogd: ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xe4, ARG_2 = 0x7
Jan 24 12:53:43 g241 klogd: HCNT = 0x0 SCBPTR = 0x4
Jan 24 12:53:43 g241 klogd: SCSISIGI[0xa4]:(BSYI|MSGI|CDI) ERROR[0x0] SCSIBUSL[0x1] 
Jan 24 12:53:43 g241 klogd: LASTPHASE[0xa0]:(MSGI|CDI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI) 
Jan 24 12:53:43 g241 klogd: SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x18]:(SINGLE_EDGE) 
Jan 24 12:53:43 g241 klogd: SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x40]:(NO_CDB_SENT) 
Jan 24 12:53:43 g241 klogd: SSTAT0[0x2]:(SPIORDY) SSTAT1[0x13]:(REQINIT|PHASECHG|PHASEMIS) 
Jan 24 12:53:43 g241 klogd: SSTAT2[0x10]:(EXP_ACTIVE) SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP) 
Jan 24 12:53:43 g241 klogd: SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO) 
Jan 24 12:53:43 g241 klogd: SXFRCTL0[0x88]:(SPIOEN|DFON) DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) 
Jan 24 12:53:43 g241 klogd: STACK: 0xe5 0x45 0x167 0x17d
Jan 24 12:53:43 g241 klogd: SCB count = 36
Jan 24 12:53:43 g241 klogd: Kernel NEXTQSCB = 4
Jan 24 12:53:43 g241 klogd: Card NEXTQSCB = 4
Jan 24 12:53:43 g241 klogd: QINFIFO entries: 
Jan 24 12:53:43 g241 klogd: Waiting Queue entries: 
Jan 24 12:53:43 g241 klogd: Disconnected Queue entries: 
Jan 24 12:53:43 g241 klogd: QOUTFIFO entries: 
Jan 24 12:53:43 g241 klogd: Sequencer Free SCB List: 14 25 28 22 7 15 5 20 18 29 21 10 6 19 9 2 23 27 16 30 17 11 31 26 8 3 1 24 0 13 12 
Jan 24 12:53:43 g241 klogd: Sequencer SCB Info: 
Jan 24 12:53:43 g241 klogd:   0 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   1 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   2 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   3 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   4 SCB_CONTROL[0x0] SCB_SCSIID[0x67] SCB_LUN[0x0] SCB_TAG[0x1c] 
Jan 24 12:53:43 g241 klogd:   5 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   6 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   7 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   8 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:   9 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  10 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  11 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  12 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  13 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  14 SCB_CONTROL[0x80]:(TARGET_SCB) SCB_SCSIID[0x67] SCB_LUN[0x0] 
Jan 24 12:53:43 g241 klogd: SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  15 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  16 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  17 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  18 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  19 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  20 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  21 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  22 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  23 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  24 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  25 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  26 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  27 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  28 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  29 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  30 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd:  31 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] 
Jan 24 12:53:43 g241 klogd: SCB_LUN[0x0] SCB_TAG[0xff] 
Jan 24 12:53:43 g241 klogd: Pending list: 
Jan 24 12:53:43 g241 klogd:  28 SCB_CONTROL[0x0] SCB_SCSIID[0x67] SCB_LUN[0x0] 
Jan 24 12:53:43 g241 klogd: Kernel Free SCB list: 26 29 22 17 10 13 30 6 11 16 35 24 8 18 23 34 20 12 25 2 1 27 9 14 3 0 15 7 21 19 31 5 33 32 
Jan 24 12:53:43 g241 klogd: Untagged Q(6): 28 
Jan 24 12:53:43 g241 klogd: 
Jan 24 12:53:43 g241 klogd: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Jan 24 12:53:43 g241 klogd: sr 0:0:6:0: Device is active, asserting ATN
Jan 24 12:53:43 g241 klogd: Recovery code sleeping
Jan 24 12:53:43 g241 klogd: Recovery SCB completes
Jan 24 12:53:43 g241 klogd: Recovery code awake
Jan 24 12:53:43 g241 klogd: aic7xxx_abort returns 0x2002
Jan 24 12:53:44 g241 hal-subfs-mount[3494]: SYMLINKS:: disk/by-path/pci-0000:00:06.0-scsi-0:0:6:0  
Jan 24 12:53:44 g241 hal-subfs-mount[3494]: Can't set extended attribut!
Jan 24 12:53:44 g241 hal-subfs-mount[3494]: MOUNT_POINT:: /media/UL100_U01
Jan 24 12:53:44 g241 hal-subfs-mount[3494]: MOUNTPOINT:: /media/UL100_U01
Jan 24 12:53:44 g241 hal-subfs-mount[3494]: Collected mount options and Called(0) /bin/mount -t subfs -o fs=cdfss,ro,procuid,nosuid,nodev,exec,iocharset=utf8 /dev/sr0 "/media/UL100_U01" 
Jan 24 12:53:44 g241 klogd: ISO 9660 Extensions: Microsoft Joliet Level 3
Jan 24 12:53:44 g241 klogd: ISO 9660 Extensions: RRIP_1991A
Jan 24 12:53:47 g241 klogd: scsi0:A:6: Missed busfree. Lastphase = 0xe0, Curphase = 0xe4
Jan 24 12:53:47 g241 klogd: sr 0:0:6:0: Attempting to queue an ABORT message
Jan 24 12:53:47 g241 klogd: CDB: 0x4a 0x1 0x0 0x0 0x10 0x0 0x0 0x0 0x8 0x0 0x0 0x0
Jan 24 12:53:47 g241 klogd: sr 0:0:6:0: Command not found
Jan 24 12:53:47 g241 klogd: aic7xxx_abort returns 0x2002
Jan 24 12:53:49 g241 klogd: scsi0:A:6: Missed busfree. Lastphase = 0xe0, Curphase = 0xe4
Jan 24 12:53:49 g241 klogd: sr 0:0:6:0: Attempting to queue an ABORT message
Jan 24 12:53:49 g241 klogd: CDB: 0x4a 0x1 0x0 0x0 0x10 0x0 0x0 0x0 0x8 0x0 0x0 0x0
Jan 24 12:53:49 g241 klogd: sr 0:0:6:0: Command not found
Jan 24 12:53:49 g241 klogd: aic7xxx_abort returns 0x2002
Jan 24 12:53:51 g241 klogd: scsi0:A:6: Missed busfree. Lastphase = 0xe0, Curphase = 0xe4
Jan 24 12:53:51 g241 klogd: sr 0:0:6:0: Attempting to queue an ABORT message
Jan 24 12:53:51 g241 klogd: CDB: 0x4a 0x1 0x0 0x0 0x10 0x0 0x0 0x0 0x8 0x0 0x0 0x0
Jan 24 12:53:51 g241 klogd: sr 0:0:6:0: Command not found
Jan 24 12:53:51 g241 klogd: aic7xxx_abort returns 0x2002
[... and so on]

Manually mounting the drive is impossible as long as hald runs.
Comment 1 Danny Al-Gaaf 2006-01-26 14:08:40 UTC
@Kay could you take a look at this ... I have no idea
Comment 2 Kay Sievers 2006-01-26 16:19:48 UTC
Jens, any idea what's going wrong with this device and where to look for?
HAL does every two seconds an:
  open("/dev/sr0", O_RDONLY|O_NONBLOCK|O_EXCL|O_LARGEFILE)
Comment 3 Kay Sievers 2006-01-26 17:12:28 UTC
ok, it's the GPCMD_GET_EVENT_STATUS_NOTIFICATION part, which causes:

--------------
$ dmesg -c
(scsi0:A:6:0): No or incomplete CDB sent to device.
scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
--------------

This test case triggers it. Anything I can fix here?

-------------
struct cdrom_generic_command cgc;
struct request_sense sense;
unsigned char buffer[8];
int fd;

memset (&cgc, 0, sizeof (struct cdrom_generic_command));
memset (&sense, 0, sizeof (struct request_sense));
memset (buffer, 0, sizeof (buffer));
cgc.cmd[0] = GPCMD_GET_EVENT_STATUS_NOTIFICATION;
cgc.cmd[1] = 1;
cgc.cmd[4] = 16;
cgc.cmd[8] = sizeof (buffer);
cgc.timeout = 600;
cgc.buffer = buffer;
cgc.buflen = sizeof (buffer);
cgc.data_direction = CGC_DATA_READ;
cgc.sense = &sense;
cgc.quiet = 1;

fd = open(argv[1], O_RDONLY|O_NONBLOCK|O_EXCL|O_LARGEFILE);
ioctl(fd, CDROM_SEND_PACKET, &cgc);
close(fd);
------------------
Comment 4 Jens Axboe 2006-01-26 19:40:48 UTC
Hmm that looks perfectly valid. I wonder if the cdb len gets set incorrectly in the SCSI layer. Hmm no, it gets 10 which is correct. Kay, can you try the attached test program?
Comment 5 Jens Axboe 2006-01-26 19:41:20 UTC
Created attachment 65305 [details]
SG_IO variant of the test program

Just compile and run as ./geteven <device>
Comment 6 Kay Sievers 2006-01-27 01:34:05 UTC
Ludwig, care to try Jens' test program and see if it still prints the error log.
Comment 7 Ludwig Nussel 2006-01-27 09:40:53 UTC
[0] root@g241:~ # ./sgio /dev/sr0
status 0
0006041600020000
[0] root@g241:~ # eject /dev/sr0
[0] root@g241:~ # ./sgio /dev/sr0
status 0
0006041603010000
[0] root@g241:~ # 

seems to work fine. no errors in /var/log/messages, no hangs.
Comment 8 Kay Sievers 2006-01-27 10:57:54 UTC
Jens, using SG_IO seems to work without the error. Should I switch the code in HAL or is this something to be fixed in the kernel?
Comment 9 Jens Axboe 2006-01-27 12:07:49 UTC
Just switch the code in HAL, it's better to use SG_IO anyways. I don't know what is going wrong with CDROM_SEND_PACKET right now, but we can look at that later.
Comment 10 Kay Sievers 2006-01-27 15:53:06 UTC
Thanks Jens. Switched to SG_IO and comitted it to HAL CVS:
http://cvs.freedesktop.org/hal/hal/hald/linux2/addons/addon-storage.c?r1=1.14&r2=1.15&makepatch=1&diff_format=u
Comment 11 Danny Al-Gaaf 2006-02-02 20:09:58 UTC
Added patch to package.