Bug 129301

Summary: i2o and dpt_i2o drivers are conflicting
Product: [openSUSE] SUSE LINUX 10.0
Component: YaST2
Status: RESOLVED WONTFIX
Severity: Normal
Priority: P5 - None
Version: Final
Target Milestone: ---
Hardware: x86
OS: Other
Reporter: Glen Kaukola <glen>
Assignee: Greg Kroah-Hartman <gregkh>
QA Contact: Klaus Kämpf <kkaempf>
CC: admin, felix, pokeytemplar, simon.held, snwint
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: Output of hwinfo on ASUS A8V-E SE with Adaptec 2100S on plain SuSE 10

Description Glen Kaukola 2005-10-19 08:19:22 UTC
Suse 10 installs fine, but when it reboots to finish up the installation it
hangs.  I see the following:

I2O subsystem v1.288
i2o: max drivers = 8
i2o: Checking for PCI I2O controllers
ACPI: PCI Interrupt 0000:03:05.0[A] -> GSI 18 (level,low) -> IRQ 169
iop0: controller found (0000:03:05.0)
PCI: Unable to reserve mem region #1:100000@f1000000 for device 0000:03:05.0
iop0: device already claimed
ACPI: PCI interrupt for device 0000:03:05.0 disabled
dpti0: Trying to abort cmd=3288
Comment 1 Hubert Mantel 2005-10-19 15:17:48 UTC
Are you using the driver already at install time? And is this a SMP machine? Did you try the various ACPI related kernel parameters such as "acpi=off" or "pci=noacpi"? Did you try a "failsafe" boot?
Comment 2 Glen Kaukola 2005-10-19 18:15:59 UTC
I'm installing right onto my RAID, so yes I'm using the driver already at install time.  I didn't try the noacpi option, but I did try failsafe.  It didn't seem to make a difference.
Comment 3 Roger Gardner 2005-10-21 06:48:28 UTC
I am having the same problem. I have tried "acpi=off", "pci=noacpi" and "fix_hstcfg=1". I also noticed a couple of other problems.
1. The following message appears just before the messages for i2o:
  piix4_smbus 0000:00:0f.0: Illegal Interrupt configuration (or code out of date)!
2. While booting I saw the following message a hundred or more times:
  Fatal: Could not load /lib/modules/2.6.13-15-default/modules.dep: no such file or directory
  This I do not understand, because I am using the 2.6.13-15-smp kernel. So I made a link from 2.6.13-15-default to 2.6.13-15-smp. I do not see this message any more.
I hope someone can help us.
Comment 4 Ralf Prengel 2005-10-25 08:15:33 UTC
No help, but same problem here:
Primergy E200 with Adaptec onboard RAID controller.
Comment 5 Olaf Kirch 2005-11-15 11:08:05 UTC
Greg, can you have a look please?
Comment 6 Greg Kroah-Hartman 2005-11-15 21:58:48 UTC
Are you using the i2o device for your raid?

Can you attach the output of running 'hwinfo' as root?
Comment 7 Ralf Prengel 2005-11-16 09:00:59 UTC
(In reply to comment #6)
> Are you using the i2o device for your raid?
> 
> Can you attach the output of running 'hwinfo' as root?

No chance for me (Ralf Prengel) because the Raid is the boot-device.
It's not possible to boot a working system.
Comment 8 Glen Kaukola 2005-11-16 18:36:30 UTC
Yeah, I'm using my i2o device for RAID 0.

I'm not running suse though obviously, and Fedora doesn't seem to have hwinfo.  Would that command happen to be available during the install?

If you're just interested in what devices I have, my motherboard is a Supermicro P4DC6+.  The SCSI controller is an Adaptec AIC-7899W which is built into the motherboard.  And my RAID controller is an Adaptec 2005S Zero Channel Card.
Comment 9 Marcel Hilzinger 2005-11-29 09:08:43 UTC
Please read this:
http://whocares.de/archive/000940.php
Comment 10 Greg Kroah-Hartman 2005-12-01 00:45:09 UTC
Ah, thanks for the link.  This looks like an installer issue, reassigning.
Comment 11 Glen Kaukola 2005-12-01 00:55:15 UTC
Well at first it seemed like that link had a fix for me.  I went to install Suse once more last night.  After the first cd did its thing, but before it booted the system to install from the rest of the cds, I moved the i2o directory to i2o.bad just like the article suggested.  I then booted up my system normally and the Suse install finished up fine.  I thought I was golden.  But today, I go to turn on my pc and it just hangs again.  And once again I see the same errors about the i2o device.

I did update my system, but I swear it didn't install a new kernel as I was watching closely.  And I only see one module directory in /lib/modules/.  Perhaps I can get this thing working though if I switch from the dpt_i2o driver to the regular i2o driver, the one inside the directory I moved.  In fact I'm pretty sure I saw a way to do that right inside yast.  I'm going to give that a shot and see what happens.
Comment 12 Klaus Kämpf 2005-12-01 08:37:39 UTC
Hmm, the installer loads the modules claiming support for a device.

Workaround: Choose 'manual' installation and, when YaST asks whether to load a driver, say "yes" to the i2o one and "no" to the adaptec one (or vice versa)
(-> http://i2o.shadowconnect.com/faq.php#dpt_i2o)

Of course the real fix is to give priority to either the i2o or the adaptec. 
For this we need hwinfo however.

So please boot from CD1 and switch to console 2 as soon as YaST is starting. Run hwinfo there and copy its output to a USB stick (or mount a partition and copy it there).
Comment 13 Steffen Winterfeldt 2005-12-01 10:55:58 UTC
Renaming modules won't work. modprobe will find them anyway. Either
remove them or put them into /etc/hotplug/blacklist.

In any case, all reports agree that installation works fine but after a
_reboot_ things break. So that would be a kernel/hotplug bug.

Can't driver authors come up with pci id tables that make sense?
Comment 14 Olaf Kirch 2005-12-02 10:10:02 UTC
Greg, this does indeed seem to be a conflict of PCI IDs between i2o and
dpt_i2o. This should be fixed in mainline too I guess.
Comment 15 Xuan Hai Dang Le 2005-12-21 06:21:03 UTC
I think this is the right answer:
http://readlist.com/lists/suse.com/suse-linux-e/0/2279.html
Comment 16 Greg Kroah-Hartman 2005-12-23 21:15:48 UTC
As both drivers have overlapping device id tables (as they both seem to support
the same devices), this needs to be a modules blacklist issue, which will
solve the boot-time problem.

Back to the yast group...
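
The overlap described in comment 16 can be checked against modules.alias, the file depmod generates from each driver's device ID table. What follows is only a minimal sketch, not something taken from this report: the function name modules_claiming, the alias entries and the subsystem IDs in the device string are illustrative assumptions. Only the vendor/device pair 1044:a501 and the class 0e00 come from this bug (comment 23).

```shell
#!/bin/sh
# Print every module whose alias pattern matches a device's modalias string.
#   $1: path to a modules.alias-style file
#   $2: modalias string of the device
modules_claiming() {
    while read -r keyword pattern module; do
        # Skip comments and anything that is not an "alias" line.
        [ "$keyword" = "alias" ] || continue
        # An unquoted $pattern in a case label is matched as a glob.
        case "$2" in
            $pattern) echo "$module" ;;
        esac
    done < "$1"
}

# Illustrative alias file (NOT copied from a real system).
alias_file=$(mktemp)
cat > "$alias_file" <<'EOF'
# one driver matching by vendor/device, one matching by PCI class
alias pci:v00001044d0000A501sv*sd*bc*sc*i* dpt_i2o
alias pci:v*d*sv*sd*bc0Esc00i* i2o_core
alias pci:v00001234d00005678sv*sd*bc*sc*i* example_nic
EOF

# Subsystem IDs below are placeholders; vendor, device and class are from comment 23.
modules_claiming "$alias_file" "pci:v00001044d0000A501sv00000000sd00000000bc0Esc00i00"
# prints dpt_i2o and i2o_core, one per line

rm -f "$alias_file"
```

When two modules are printed for one device, nothing in the alias data says which should win, which is why the choice ends up in a blacklist.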
Comment 17 Urs Mueller 2005-12-24 20:42:05 UTC
Created attachment 61762 [details]
Output of hwinfo on ASUS A8V-E SE with Adaptec 2100S on plain SuSE 10
Comment 18 Urs Mueller 2005-12-24 20:46:27 UTC
As somebody asked for hwinfo output, I have attached mine.
I was able to do a plain install with the hints on this page and the linked sites.
I switched to the second console, removed dpt_i2o with "rmmod" and loaded i2o_core and i2o_block in its place.
Before the first reboot I added "i2o_core" and "i2o_block" to the initrd and added "dpt_i2o" to /etc/hotplug/blacklist.
I am now going to try the updates (incl kernel)...
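
The driver swap on the second console might look roughly like the sketch below. It is an assumption-laden illustration rather than a record of what was typed: the helper names run and switch_to_i2o and the DRY_RUN variable are invented here, and the script only prints the commands unless DRY_RUN=0 is set. The initrd and blacklist steps are left out because the files involved differ between releases.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root, on the install console) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Replace dpt_i2o with the generic I2O core and block drivers.
switch_to_i2o() {
    run rmmod dpt_i2o
    run modprobe i2o_core
    run modprobe i2o_block
}

switch_to_i2o
```

Keeping the default as a dry run makes it possible to review the sequence before unloading the driver that the installation target depends on.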
Comment 19 Steffen Winterfeldt 2006-01-09 16:27:43 UTC
Ok, dpt_i2o has two pci id entries. One is the one from this report, which
apparently does not work; the other is duplicated in i2o_core, so I'd
guess someone else already found it doesn't work either.

So it looks like dpt_i2o is good for nothing and we should not only blacklist
it but remove the driver entirely to reduce support load.

Christian, the blacklist is in your package, feel free to add dpt_i2o.
Comment 20 Felix Möller 2006-01-16 17:06:59 UTC
Hi, I have an Adaptec 2100S RAID-Controller. 
The installation starts up fine, but after the restart the system does not boot up anymore. It hangs with a dpti0 error.

I am using 10.1 alpha4.  

I tried removing the whole i2o directory as mentioned in http://whocares.de/archive/000940.php, but that did not help at all.  

I read through #114718, but that did not help.

Is there any way to fix this for 10.1? Would any info be helpful?
Comment 21 Christian Zoz 2006-01-16 21:31:40 UTC
To comment 10: Added dpt_i2o to blacklist.
Comment 22 Steffen Winterfeldt 2006-01-17 10:24:35 UTC
Why is it assigned to me? This bug is fixed.

Felix, you have to blacklist the module you don't want loaded. If you're
unsure which one you need, start the install and look which one was used
there (as apparently things worked at that point).
Comment 23 Felix Möller 2006-01-17 12:54:13 UTC
I do not even have /etc/hotplug/, so it seems the installer did not install the hotplug package. How can I blacklist it then? Maybe this is intended, as I just chose a minimal text-mode install.

During installation I have the following modules loaded:
ext3 130056 1 - Live 0xf8c76000
jbd 59936 1 ext3, Live 0xf8c3f000
3c59x 42792 0 - Live 0xf8e53000
mii 5632 1 3c59x, Live 0xf8e1f000
dpt_i2o 31644 1 - Live 0xf8e4a000
via82cxxx 9092 0 [permanent], Live 0xf8d69000
fan 4484 0 - Live 0xf887c000
thermal 13064 0 - Live 0xf8e1a000
processor 22336 1 thermal, Live 0xf8d4b000
usb_storage 70336 0 - Live 0xf8e62000
usbhid 45664 0 - Live 0xf8e3d000
uhci_hcd 31504 0 - Live 0xf8e25000
usbcore 119172 4 usb_storage,usbhid,uhci_hcd, Live 0xf8dbb000
ide_disk 16896 0 - Live 0xf8d63000
ide_cd 39300 0 - Live 0xf8db0000
ide_core 121248 4 via82cxxx,usb_storage,ide_disk,ide_cd, Live 0xf8dfb000
sg 35872 0 - Live 0xf8d6f000
sr_mod 16036 0 - Live 0xf883f000
sd_mod 18320 2 - Live 0xf8d52000
scsi_mod 130920 5 dpt_i2o,usb_storage,sg,sr_mod,sd_mod, Live 0xf8dda000
cdrom 37408 2 ide_cd,sr_mod, Live 0xf8d58000
cramfs 42740 0 - Live 0xf8850000
vfat 12800 0 - Live 0xf884b000
fat 49436 1 vfat, Live 0xf886e000
nfs 209640 1 - Live 0xf8d7b000
nfs_acl 3840 1 nfs, Live 0xf8828000
lockd 58248 2 nfs, Live 0xf885e000
sunrpc 141372 4 nfs,nfs_acl,lockd, Live 0xf8d1a000
nls_iso8859_1 4096 0 - Live 0xf8806000
nls_cp437 5760 0 - Live 0xf883c000
af_packet 21256 2 - Live 0xf8844000
nvram 8328 0 - Live 0xf8824000

My devices are:
00:0a.0 PCI bridge: Adaptec (formerly DPT) PCI Bridge (rev 02)
00:0a.1 I2O: Adaptec (formerly DPT) SmartRAID V Controller (rev 02)
-
00:0a.0 Class 0604: 1044:a500 (rev 02)
00:0a.1 Class 0e00: 1044:a501 (rev 02)

All my tests are done with openSuSE 10.1 alpha4.
Comment 24 Steffen Winterfeldt 2006-01-17 13:16:55 UTC
For 10.1, it's /etc/modprobe.d/blacklist.
It's working with dpt_i2o, so maybe you want to block i2o_core instead, then.
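
A minimal sketch of adding such an entry follows. It assumes the file uses the "blacklist <module>" line syntax of modprobe configuration files; the helper name blacklist_module is made up for this example, and the demonstration writes to a scratch file rather than to /etc/modprobe.d/blacklist, which would need root.

```shell
#!/bin/sh
# Append "blacklist MODULE" to FILE unless that exact line is already there.
#   $1: blacklist file, $2: module name
blacklist_module() {
    # -F fixed string, -x whole-line match, -q quiet, -s ignore a missing file
    if ! grep -qsxF "blacklist $2" "$1"; then
        echo "blacklist $2" >> "$1"
    fi
}

# Demonstration against a scratch file; on the installed system the target
# would be /etc/modprobe.d/blacklist (per comment 24).
scratch=$(mktemp)
blacklist_module "$scratch" i2o_core
blacklist_module "$scratch" i2o_core   # second call adds nothing
cat "$scratch"
rm -f "$scratch"
```

The check before appending keeps the file free of duplicate lines if the step is repeated after an update.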
Comment 25 Felix Möller 2006-01-18 12:35:54 UTC
Ok, it is working after I added i2o_core to the blacklist file. Is there anything I can provide to get this fixed for SuSE 10.1?  

As we do not want to block it in general, this probably has to be done by the installer, doesn't it?
Comment 26 Steffen Winterfeldt 2006-01-18 12:53:50 UTC
10.1 should be fine already. Hopefully.
Comment 27 pokey templar 2006-05-29 22:06:11 UTC
It is not working in 10.1 either.  The computer crashes randomly during the install of 10.1 if I use the CD/DVD, or after installation if I do an FTP install.
Comment 28 Felix Möller 2006-05-30 05:37:53 UTC
(In reply to comment #27)
> It is not working in 10.1 either.  Computer crashes randomly during install of
> 10.1 if using cd/DVD after installation if I do a FTP install

That is really interesting. Our system is crashing randomly too.  It installs fine. But it doesn't stay on longer than 2 hours or so.

The system hangs for no apparent reason while under light load.
Usually I am connected through ssh, and the connection just dies. On the local console the keyboard does not work properly; only garbage is displayed.

I have to reboot the whole system to get it working again.
Comment 29 Klaus Kämpf 2006-05-30 09:56:39 UTC
comment #27, #28: Are you booting with "safe settings"? Did you run a memory check on your system?
Random crashes are mostly caused by buggy hardware.
Comment 30 Glen Kaukola 2006-05-30 20:03:21 UTC
I've been messing with pretty much all the 10.1 betas and kept getting the same thing.  The thing would install but then once the install was done it would freeze up at random times.  I've yet to try 10.1 final but from the sound of it I'd get the same thing.  I'm pretty sure it's not a hardware problem.  I've run memtest86+ on my system numerous times in the past.  I've also had Mandriva 2006 going for quite a while, and before that various other distros running for years without problems.  Suse isn't the only distro though.  I also get the same strange behavior with Fedora Core 5.  Between Suse and Fedora, something has recently changed that's causing this.
Comment 31 Klaus Kämpf 2006-05-31 08:38:31 UTC
Received by private mail from 'pokey templar'
"Ran Memtest for 24 hours with no errors.  Have low-level formatted all of my
drives and recreated the RAID arrays.  Regressed to SuSE 9.3 and everything
works perfectly.  I had this problem before with the 8.x series of SuSE
where my RAID lost support (8.3?) and then reappeared with the 9.x series
(it was a HighPoint controller)."
Comment 32 Felix Möller 2006-06-07 00:59:22 UTC
NEEDINFO was set to me, but I really do not know what to provide.

What information might be useful to solve this issue?

Does anybody have a system with this controller running stable?

pokey templar, Glen Kaukola: what are the hardware specs of your systems? Dual CPU?
Comment 33 Lars Marowsky-Bree 2006-06-27 21:06:07 UTC
Jens, in bug #176735, we blacklisted i2o_block and i2o_scsi in support.conf and recommended that people use dpt_i2o. In this bug here, apparently the YaST2 group has done the exact opposite. I think we need a block device expert to figure this one out.
Comment 34 Achim Mildenberger 2006-07-10 11:07:26 UTC
I also encountered the severe stability problems (see comment #27) when using
shadowconnect's i2o_block drivers. Setup: SuSE 10.1, Adaptec 2010S.
Possible solution: There is a patch for kernel 2.6.16
(see http://i2o.shadowconnect.com/download.php and
http://i2o.shadowconnect.com/changes.php )
It seems that something involving huge TLBs in this driver does not conform to 2.6.16.
Comment 35 Greg Kroah-Hartman 2006-12-20 05:41:31 UTC
This should all be addressed in the 10.2 release.

If not, please reopen it with the needed information.