Bugzilla – Bug 113734
Multiple firewire/1394 disk on one controller
Last modified: 2005-10-04 16:34:12 UTC
I have two IEEE 1394 disks on one controller. The sbp2 module is loaded neither at boot nor on hotplug, although it is loaded during installation. Running modprobe sbp2 helps a little: one disk is then o.k., but plugging in or removing the cable at the controller causes problems. It seems hotplug cannot handle two events at the same time. Only one disk is o.k.; the second disk remains scsi4 and is inaccessible. Each plug-in of a USB or IEEE 1394 disk creates a new scsi number. So is there a limit? Why not reuse numbers? After removing scsi5, a newly added disk should get scsi5 again, not scsi6.
Olaf, can you have a look at this? I'm still not very familiar with ieee1394. sbp2 has one modalias. Should we load that in other cases as well?
yes, the kernel assigns new numbers. why is that a problem? the more important question is, why is sbp2 not loaded? I think you have to enable debug in udev:
    ln -sv dev/shm /events
set udev_log to info in /etc/udev/udev.conf and run
    udevcontrol log_priority=info
I need to see the MODALIAS content of the /events/debug.*ieee1394* files.
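Once the debug files exist, the MODALIAS values could be pulled out with a few lines of Python. This is only a sketch: it assumes each debug file is a plain dump of the event environment with one KEY=value pair per line, which the thread does not confirm. The helper names `modalias_values` and `collect` are made up for illustration.

```python
import glob


def modalias_values(text: str) -> list[str]:
    """Return the value of every MODALIAS=... line in an event dump.

    Assumption: the dump holds one KEY=value pair per line.
    """
    values = []
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "MODALIAS":
            values.append(value)
    return values


def collect(pattern: str = "/events/debug.*ieee1394*") -> dict[str, list[str]]:
    """Map each debug file matching the glob to its MODALIAS values."""
    found = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            found[path] = modalias_values(handle.read())
    return found
```

Calling `collect()` with no arguments looks at the path requested above; an empty result means no matching files were written.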
I have two IEEE 1394 disks. Right now (RC1):
Connection: PC - Maxtor - ST325082 (daisy chain)
The ST325082 is sdc, the Maxtor is sdd.
sdc1 is mounted on boot; sdd1 AND sdd5 are not mounted. Looks like a timing problem.
After power-cycling the Maxtor drive, all partitions are mounted. New bug??

messages before power cycle of external drive:
Sep 27 22:22:39 joachim kernel: ieee1394: sbp2: Logged into SBP-2 device
Sep 27 22:22:39 joachim kernel: ieee1394: Node 0-00:1023: Max speed [S400] - Max payload [2048]
Sep 27 22:22:39 joachim kernel: Vendor: ST325082 Model: 3A Rev:
Sep 27 22:22:39 joachim kernel: Type: Direct-Access ANSI SCSI revision: 06
Sep 27 22:22:39 joachim kernel: SCSI device sdc: 488397168 512-byte hdwr sectors (250059 MB)
Sep 27 22:22:39 joachim kernel: sdc: got wrong page
Sep 27 22:22:39 joachim kernel: sdc: assuming drive cache: write through
Sep 27 22:22:39 joachim kernel: SCSI device sdc: 488397168 512-byte hdwr sectors (250059 MB)
Sep 27 22:22:39 joachim kernel: sdc: got wrong page
Sep 27 22:22:39 joachim kernel: sdc: assuming drive cache: write through
Sep 27 22:22:39 joachim kernel: sdc: sdc1
Sep 27 22:22:39 joachim kernel: Attached scsi disk sdc at scsi1, channel 0, id 0, lun 0
Sep 27 22:22:39 joachim kernel: Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 0
Sep 27 22:22:39 joachim kernel: IA-32 Microcode Update Driver: v1.14 <tigran@veritas.com>
Sep 27 22:22:39 joachim kernel: microcode: CPU0 updated from revision 0x7 to 0x1e, date = 06052003
Sep 27 22:22:39 joachim kernel: IA-32 Microcode Update Driver v1.14 unregistered
Sep 27 22:22:39 joachim kernel: BIOS EDD facility v0.16 2004-Jun-25, 4 devices found
Sep 27 22:22:39 joachim kernel: ieee1394: Node added: ID:BUS[0-01:1023] GUID[0010b9010131819a]
Sep 27 22:22:39 joachim kernel: ieee1394: unsolicited response packet received - no tlabel match
Sep 27 22:22:39 joachim kernel: ieee1394: Reconnected to SBP-2 device
Sep 27 22:22:39 joachim kernel: ieee1394: Node 0-00:1023: Max speed [S400] - Max payload [2048]
Sep 27 22:22:39 joachim kernel: scsi3 : SCSI emulation for IEEE-1394 SBP-2 Devices
Sep 27 22:22:39 joachim kernel: bootsplash: status on console 0 changed to on
Sep 27 22:22:39 joachim kernel: ieee1394: sbp2: Logged into SBP-2 device
Sep 27 22:22:39 joachim kernel: ieee1394: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Sep 27 22:22:39 joachim kernel: Vendor: Maxtor Model: 1394 storage Rev: v1.3
Sep 27 22:22:39 joachim kernel: Type: Direct-Access ANSI SCSI revision: 06
Sep 27 22:22:39 joachim kernel: SCSI device sdd: 156355584 512-byte hdwr sectors (80054 MB)
Sep 27 22:22:39 joachim kernel: sdd: asking for cache data failed
Sep 27 22:22:39 joachim kernel: sdd: assuming drive cache: write through
Sep 27 22:22:39 joachim kernel: SCSI device sdd: 156355584 512-byte hdwr sectors (80054 MB)
Sep 27 22:22:39 joachim kernel: sdd: asking for cache data failed
Sep 27 22:22:39 joachim kernel: sdd: assuming drive cache: write through
Sep 27 22:22:39 joachim kernel: sdd: sdd1 sdd2 < sdd5 >
Sep 27 22:22:39 joachim kernel: Attached scsi disk sdd at scsi3, channel 0, id 0, lun 0
Sep 27 22:22:39 joachim kernel: Attached scsi generic sg3 at scsi3, channel 0, id 0, lun 0, type 0
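Since the thread is partly about which scsi number each disk ends up on, a small Python sketch can pull that mapping out of a log like the one above. It assumes the "Attached scsi disk ... at scsiN" wording shown in this 2.6.13 log; other kernel versions may phrase the message differently. The name `disks_by_host` is hypothetical.

```python
import re

# Matches the "Attached scsi disk" message as it appears in the log above.
# "Attached scsi generic" lines are deliberately not matched.
ATTACHED = re.compile(
    r"Attached scsi disk (sd[a-z]+) at scsi(\d+), channel \d+, id \d+, lun \d+"
)


def disks_by_host(log_text: str) -> dict[str, int]:
    """Map each disk name (e.g. 'sdc') to the SCSI host number it attached to.

    If a disk name appears more than once, the last attachment wins.
    """
    mapping = {}
    for match in ATTACHED.finditer(log_text):
        mapping[match.group(1)] = int(match.group(2))
    return mapping
```

For the log above this would report sdc on host 1 and sdd on host 3, which makes the gap in host numbers easy to see after repeated plug cycles.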
you did not attach the requested files. one bug after another.
sbp2 is loaded in RC1 by default, so I cannot reproduce the bug any more. So did I find a new bug now?
I think the MODALIAS= patch was added for rc1, so this part is fixed. can you attach the full hwinfo output?
Created attachment 51077 [details] hwinfo --log hwinfo.log --all
Right now the 1394 disks are in "working" order. The sdc/sdd naming does not depend on the order on the 1394 bus; that is o.k.
# cat /proc/mounts
rootfs / rootfs rw 0 0
initramfsdevs /dev tmpfs rw 0 0
/dev/sda3 / reiserfs rw 0 0
eventfs /lib/klibc/events tmpfs rw 0 0
proc /proc proc rw,nodiratime 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
/dev/sda1 /boot ext2 rw 0 0
/dev/sdb2 /home reiserfs rw 0 0
/dev/hda1 /windows/C ntfs ro,nosuid,nodev,noexec,uid=0,gid=100,umask=02,nls=utf8,errors=continue,mft_zone_multiplier=1 0 0
/dev/hda5 /windows/D ntfs ro,nosuid,nodev,noexec,uid=0,gid=100,umask=02,nls=utf8,errors=continue,mft_zone_multiplier=1 0 0
/dev/hda6 /windows/E vfat rw,nodiratime,nosuid,nodev,noexec,gid=100,fmask=0002,dmask=0002,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/hdb1 /windows/h ntfs ro,nosuid,nodev,noexec,uid=0,gid=100,umask=02,nls=utf8,errors=continue,mft_zone_multiplier=1 0 0
usbfs /proc/bus/usb usbfs rw 0 0
/dev/sdd5 /media/WINTRANS subfs rw,sync,nosuid,nodev 0 0
/dev/sdd5 /media/WINTRANS vfat rw,sync,nodiratime,nosuid,nodev,fmask=0022,dmask=0022,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/sdd1 /media/ieee1394disk subfs rw,sync,nosuid,nodev 0 0
/dev/sdd1 /media/ieee1394disk ext3 rw,sync,nosuid,nodev 0 0
/dev/sdc1 /media/USB-DISC subfs ro,sync,nodev 0 0
/dev/fd0 /media/floppy subfs rw,sync,nosuid,nodev 0 0
ok, so if you unplug one of the firewire disks, the other one can not be accessed anymore? run
    ps axf | grep -w D
If there are any processes in D status, do a
    echo 1 >/proc/sys/kernel/sysrq
    echo t > /proc/sysrq-trigger
it might be that your kernel dmesg buffer is too small to keep the backtrace of all processes. I suggest you boot with log_buf_len=2m
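The grep above is a quick check, but it can also match a standalone "D" elsewhere on the line and would miss combined state codes such as "Ds". A slightly stricter check is sketched below in Python; it assumes a procps-style `ps` that accepts `axo pid,stat,comm` and prints a header row. The function names are hypothetical.

```python
import subprocess


def find_d_state(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, command) for every process whose STAT starts with 'D'.

    Expects the output of `ps axo pid,stat,comm`, header row included.
    """
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, stat, command = fields
        if stat.startswith("D"):
            blocked.append((int(pid), command.strip()))
    return blocked


def list_blocked() -> list[tuple[int, str]]:
    """Run ps on the local machine and report uninterruptible processes."""
    result = subprocess.run(
        ["ps", "axo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return find_d_state(result.stdout)
```

If `list_blocked()` returns an empty list, there is nothing stuck in uninterruptible sleep and the sysrq backtrace is probably not needed.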
There are no "D" procs! o.k.
sdd is the first disk in the chain, sdc the second.
Switch on and boot all: sdc1, sdc5 mounted (subfs + real fs), sdd1 mounted subfs, but error in NTFS:
Sep 29 19:52:49 joachim submountd: mount failure, Invalid argument
Sep 29 19:52:49 joachim kernel: NTFS-fs error (device sdc1): parse_options(): Unrecognized mount option procuid.nosuid.
Sep 29 19:52:49 joachim kernel: subfs: unsuccessful attempt to mount media (256)
Switch off sdd, switch on sdd -> konqui pops up, error as above.
Switch off sdc: This is the first time I got an Oops, have to reboot now :-;

messages:
Sep 29 19:53:00 joachim kernel: ieee1394: sbp2: aborting sbp2 command
Sep 29 19:53:00 joachim kernel: scsi5 : destination target 0, lun 0
Sep 29 19:53:00 joachim kernel: command: Test Unit Ready: 00 00 00 00 00 00
Sep 29 19:53:00 joachim kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Sep 29 19:53:00 joachim kernel: printing eip:
Sep 29 19:53:00 joachim kernel: e167d249
Sep 29 19:53:00 joachim kernel: *pde = 00000000
Sep 29 19:53:00 joachim kernel: Oops: 0000 [#1]
Sep 29 19:53:00 joachim kernel: Modules linked in: ext3 jbd hfsplus subfs ipt_pkttype ipt_LOG ipt_limit speedstep_lib freq_table pcc_acpi sony_acpi snd_pcm_oss snd_mixer_oss button battery ac af_packet sbp2 usb_storage usbhid edd w83627hf eeprom i2c_sensor i2c_isa nvidia snd_cmipci gameport snd_pcm snd_page_alloc snd_opl3_lib ohci1394 ieee1394 crc_ccitt isdn slhc snd_timer snd_hwdep snd_mpu401_uart snd_rawmidi 8139too snd_seq_device mii snd soundcore i2c_i801 i2c_core uhci_hcd usbcore generic intel_agp agpgart pci_hotplug parport_pc lp parport ip6t_REJECT ipt_REJECT ipt_state iptable_mangle iptable_nat iptable_filter ip6table_mangle ip_conntrack ip_tables ip6table_filter ip6_tables ipv6 nls_iso8859_1 nls_cp437 vfat fat nls_utf8 ntfs dm_mod reiserfs fan thermal processor sg ide_cd cdrom aic7xxx scsi_transport_spi piix sd_mod scsi_mod ide_disk ide_core
Sep 29 19:53:00 joachim kernel: CPU: 0
Sep 29 19:53:00 joachim kernel: EIP: 0060:[<e167d249>] Tainted: P U VLI
Sep 29 19:53:00 joachim kernel: EFLAGS: 00010082 (2.6.13-9-default)
Sep 29 19:53:00 joachim kernel: EIP is at sbp2util_find_command_for_SCpnt+0x19/0x50 [sbp2]
Sep 29 19:53:00 joachim kernel: eax: 00000000 ebx: df7f8380 ecx: 00000000 edx: d7f00908
Sep 29 19:53:00 joachim kernel: esi: 00000292 edi: cf7b5000 ebp: c5f1df48 esp: c5f1df2c
Sep 29 19:53:00 joachim kernel: ds: 007b es: 007b ss: 0068
Sep 29 19:53:00 joachim kernel: Process scsi_eh_5 (pid: 6958, threadinfo=c5f1c000 task=db886020)
Sep 29 19:53:00 joachim kernel: Stack: df7f8380 d7f00880 e167f499 e1680290 00002002 df7f8380 e10246fa 00000000
Sep 29 19:53:00 joachim kernel: 00000000 c5f1df50 c5f1df50 00050000 df7f8380 df7f8490 df7f8430 e1024aa8
Sep 29 19:53:00 joachim kernel: df7f83c8 00000001 df7f8380 c5f1dfa0 c5f1dfb4 c5f1c000 e1024b84 c5f1dfac
Sep 29 19:53:00 joachim kernel: Call Trace:
Sep 29 19:53:00 joachim kernel: [<e167f499>] sbp2scsi_abort+0x29/0x80 [sbp2]
Sep 29 19:53:00 joachim kernel: [<e10246fa>] scsi_send_eh_cmnd+0xba/0x150 [scsi_mod]
Sep 29 19:53:00 joachim kernel: [<e1024aa8>] scsi_eh_tur+0x88/0xf0 [scsi_mod]
Sep 29 19:53:00 joachim kernel: [<e1024b84>] scsi_eh_abort_cmds+0x74/0xf0 [scsi_mod]
Sep 29 19:53:00 joachim kernel: [<e10257b1>] scsi_unjam_host+0x81/0x1c0 [scsi_mod]
Sep 29 19:53:00 joachim kernel: [<c0119580>] default_wake_function+0x0/0x10
Sep 29 19:53:00 joachim kernel: [<e10259b3>] scsi_error_handler+0xc3/0x160 [scsi_mod]
Sep 29 19:53:00 joachim kernel: [<e10258f0>] scsi_error_handler+0x0/0x160 [scsi_mod]
Sep 29 19:53:00 joachim kernel: [<c01012f1>] kernel_thread_helper+0x5/0x14
Sep 29 19:53:00 joachim kernel: Code: c8 5b 5e c3 56 9d 31 c9 5b 89 c8 5e c3 90 8d 74 26 00 56 53 89 d3 9c 5e fa 8d 90 88 00 00 00 8b 80 88 00 00 00 39 c2 74 2b 89 c1 <8b> 00 0f 18 00 90 3b 99 04 01 00 00 74 14 89 c1 8b 00 0f 18 00
Created attachment 51171 [details] All messages from 10.0RC1 (far too much?)
Start: sdc, sdd on
switched sdc off
unplugged sdd
switched sdc on! nothing happened
switched sdd on! Now sdc is checked, mounted subfs, then sdh[15] are mounted.
sdd changed to sdh! (sde-f are the USB card reader)
Created attachment 51172 [details] output from dmesg
yes, changing kernel names is perfectly ok. I guess all goes well if they are not subfs mounted. move the subfs module away and reboot:
    mv -v /lib/modules/2.6.13*/kernel/fs/subfs/subfs.ko /tmp
I will look into the oops.
I have added a few ieee1394 patches to the kernel cvs, they are available in a kernel-default with tag 20050930112241 or newer. look in this directory tomorrow: ftp://ftp.suse.com/pub/projects/kernel/kotd/i386/HEAD/
Created attachment 51304 [details] tail of /var/log/messages
uname -r: 2.6.13.2-20050930150828-default
Boot with both disks on.
Note: sdc and sdd are connected by 6 pin on sdd, 4 pin on sdc.

switch off sdc (2nd drive in chain): /media/USB-DISK is gone, o.k.
switch on sdc: NO /media/USB-DISK, BUT see messages line 226:
Oct 2 00:43:53 joachim kernel: sbp2: probe of 0050770e501db1ae-0 failed with error -16
switch on/off, cable on/off: all the same result.
unplugged all disks: all related /media/... gone, o.k.
switched on both drives, plugged into CPU (disks are connected): all /media/.. are here again, really nice!!
unplugged sdd power, not the ieee cable: ALL gone! (see messages 00:57:59, line 345), not o.k.
unplugged the ieee cable on sdd too
plugged in the ieee cable on sdd too: /media/USB-DISK is here again.

Changing order, now sdc first (on), sdd second (off):
I was too fast pulling the plug out of sdd and putting it into sdc, so I removed it again, waited ~10 sec, plugged it in: /media/USB-DISK up again.
powered on sdd: all drives in /media
switched power off sdc: all drives gone, not o.k.
There is no signal in dmesg or messages when the ieee cables are unplugged and plugged. On WinXP this killed the explorer!
powered on sdc: sdd drives in /media, sdc failed, same error as at 00:43:53
unplugged + plugged in the ieee cable: all /media/.. are here again, really nice!!

So, first of all: this kernel is stable here! You can get what you need; just be patient and do not power-cycle drives in the running system. Normally we do not do this in the ways I've tested.
Connect all first, then power on all: works.
Power on all, then connect: works too.
The order of connect/power does not seem to cause problems.
The only problem I could find: switching off the power of a drive in the chain that is not at the end. It is comparable to WinXP. I tested it in Win and Linux, playing music from /dev/sdd5 and then cycling power on sdc.
Win: Explorer crashed, Winamp died.
Linux: XMMS did not find any song any longer.
~6000 lines in dmesg
~15000 lines in /var/log/messages
Created attachment 51305 [details] output from dmesg
I did not add the output from the test using xmms, which made the drive busy but inaccessible.
I think some (or all?) devices have to be unplugged from the bus to recognize the power cycle. This was mentioned at least in the documentation for the LMO drive I have. so everything is ok now?
I think so, as it works for me :-)
ok, thanks for testing.