Bug 114800

Summary: Installation reports bios raid config on single ide drive (CSB6 chipset) then fails on reboot
Product: [openSUSE] SUSE LINUX 10.0
Reporter: Murlin Wenzel <mwenzel>
Component: Installation
Assignee: Carl-Daniel Hailfinger <kernel01>
Status: RESOLVED FIXED
QA Contact: Klaus Kämpf <kkaempf>
Severity: Normal
Priority: P5 - None
CC: kernel01, kstansel, meissner, snwint
Version: Beta 4
Target Milestone: ---
Hardware: i686
OS: SUSE Other
Whiteboard:
Found By: Third Party Developer/Partner
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: yast logs from initial install phase
Yast and Grub log files
output from hwinfo --block

Description Murlin Wenzel 2005-09-01 17:35:23 UTC
When trying to install Beta4 on an IBM HS20 blade, things proceed normally
until the initial probing of hardware/devices after accepting the license terms.
You will get an error that the BIOS is reporting a RAID config on /dev/hda (the
only drive).  If you accept the warning and continue, you can complete the
initial install, but the system will halt on reboot.  The last message you see
is a panic saying 'Tried to kill init' and that syncing has stopped.  I believe
this is related to the original warning/error about the RAID config.  There are
no partitions on the drive.  I have even installed other OSs on the drive.
SLES9-SP2 will install and run fine.
Comment 1 Lukas Ocilka 2005-09-02 06:45:15 UTC
Please attach the YaST logs:

http://www.opensuse.org/index.php/Bug_Reporting_FAQ#YaST

Thanks.
Comment 2 Klaus Kämpf 2005-09-02 08:10:48 UTC
Sounds like the raiddetect from Carl-Daniel.

Was this drive used before in a different system? Maybe as part of a RAID?
Comment 3 Carl-Daniel Hailfinger 2005-09-02 08:31:07 UTC
raiddetect is no longer in our distribution. It has been obsoleted by dmraid.
Comment 4 Murlin Wenzel 2005-09-02 15:56:28 UTC
Created attachment 48613 [details]
yast logs from initial install phase

I hope I got the correct logs.  These are from /var/log/YaST2 and taken right
before the reboot at the end of the initial file copy.

This drive has been in an array in the past, but has since been used standalone
with other OS platforms, including SLES9-SP2.
Comment 5 Thomas Fehr 2005-09-05 08:30:14 UTC
The disk is reported by hwinfo to be part of a BIOS-configured RAID, so
YaST2 displays this warning. Besides this warning I cannot see any problem
during installation: partitioning, formatting, mounting, and rpm installation
all work fine. No idea why rebooting after the first stage of installation failed.

Jiri, could you have a look at whether something went wrong during bootloader setup?
Stefan, could it be that the BIOS RAID setup causes strange commands to be executed
in the initrd which may make booting fail?
Comment 6 Steffen Winterfeldt 2005-09-05 08:41:09 UTC
The message sounds as if it didn't find an initrd. I don't see the raid
warning as a problem either.
Comment 7 Jiri Srain 2005-09-08 15:15:20 UTC
Hmm, the initrd seems to have been created correctly. Could you please
grab /boot/grub/menu.lst, /boot/grub/device.map and also /var/log/YaST2/* from
the installed (target) system? They might contain useful information.
Comment 8 Murlin Wenzel 2005-09-08 16:51:52 UTC
Created attachment 49243 [details]
Yast and Grub log files

I hope I got the correct files.  I had to boot off the cd as a rescue system. 
I was able at that point to mount the hard drive.
Comment 9 Murlin Wenzel 2005-09-20 15:14:58 UTC
Any new ideas on this?
Comment 10 Murlin Wenzel 2005-10-04 23:12:57 UTC
Any new ideas on this?  I've tried the latest build of SL 10 and it still has
the same problem.  Install succeeds, restart fails reporting /dev/hda2 no such
device.
Comment 11 Jiri Srain 2005-10-05 12:59:07 UTC
The GRUB configuration files were created properly and the loader was installed
without any problem.

So the only idea of what could be wrong is some module missing from the initrd.
According to the mkinitrd output, the modules

serverworks processor thermal fan reiserfs

have been added there. Are they sufficient (for SLES9)?
Comment 12 Thomas Fehr 2005-10-05 13:28:51 UTC
From the logs all looks ok, partitioning of /dev/hda went fine.
The modules in initrd look also ok.
Maybe it is necessary to disable the raid capability somehow in BIOS to make 
the system boot.
Comment 13 Murlin Wenzel 2005-10-05 17:01:17 UTC
The raid functionality is disabled in the bios.  I can only think of 2 
possibilities as to why the drive is being detected as part of a bios raid.

1.  The pci id has been matched to a raid controller
2.  The actual drive has some transient meta-data which could have been left 
over from a raid config.

I'm going to try booting from the cd again in 'rescue' mode and see if that 
will let me mount the device or at least get some other error/config info.
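[Editorial note, not from the report: possibility 1 above could in principle be checked from a rescue shell by looking at the PCI class code the controller advertises. Per the PCI class-code assignments, 0101 is a plain IDE controller and 0104 is a RAID controller. A minimal sketch; the helper name is hypothetical and the sample lines are illustrative:]

```shell
# pci_class: hypothetical helper that extracts the PCI class code from a
# line of `lspci -n` output.  Class 0101 = plain IDE controller,
# class 0104 = RAID controller, so 0104 here would match possibility 1.
pci_class() {
  # e.g. "00:0f.1 0101: 1166:0213 (rev 93)" -> "0101"
  echo "$1" | awk '{sub(":$", "", $2); print $2}'
}

# On the affected machine one might inspect the CSB6 controller with
# something like (1166 is the ServerWorks/Broadcom PCI vendor id):
#   lspci -n | grep '1166:'
```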
Comment 14 Murlin Wenzel 2005-10-10 20:45:30 UTC
I can boot to rescue mode and manually mount the storage devices, but they will
not come online during a normal boot.  I also just hit the same problem on an
HS40 blade which uses the same CSB6 ide chipset.
Comment 15 Thomas Fehr 2005-10-11 07:46:44 UTC
Steffen, are you aware of any additional modules loaded in the rescue system?
The "serverworks" module is loaded into the initrd, but apparently this alone
is not enough to make the disk accessible.

I would assume we may need "piix" and/or "ata_piix" in addition to the
"serverworks" module. Could you give it a try by adding first only "piix", and
if that is not enough, "piix" and "ata_piix", to the line with "INITRD_MODULES"
in /etc/sysconfig/kernel? You have to run /sbin/mkinitrd after your changes
to /etc/sysconfig/kernel to create a new initrd file with the new modules added.
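[Editorial note: the edit described in this comment can be sketched as follows. A minimal sketch assuming the stock SUSE INITRD_MODULES="..." line format; the helper name `add_initrd_module` is hypothetical:]

```shell
# add_initrd_module: append a module to the INITRD_MODULES line of a
# sysconfig-style file.  Hypothetical helper sketching the manual edit
# suggested above; it rewrites the file in place.
add_initrd_module() {
  file=$1 module=$2
  sed -i "s/^INITRD_MODULES=\"\([^\"]*\)\"/INITRD_MODULES=\"\1 $module\"/" "$file"
}

# On the affected system one would then run, as root:
#   add_initrd_module /etc/sysconfig/kernel piix
#   /sbin/mkinitrd    # rebuild the initrd with the new module list
```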
Comment 16 Steffen Winterfeldt 2005-10-11 08:37:12 UTC
Don't know. But an 'lsmod' from the rescue system would show. 
Comment 17 Murlin Wenzel 2005-10-11 15:59:29 UTC
I just booted the HS40 in 'rescue' mode and did an lsmod.

dm_mod
e1000
serverworks
fan
thermal
processor
usb_storage
usbhid
ohci_hcd
usbcore
ide_disk
ide_cd
ide_core
sg
sr_mod
sd_mod
scsi_mod
cdrom
cramfs
vfat
fat
nls_iso8859_1
nls_cp437
af_packet
nvram

After mounting /dev/hda2, the only difference in the module list was the
addition of reiserfs.
Comment 18 Murlin Wenzel 2005-10-25 14:42:52 UTC
After a lot of kicking and screaming, I managed to convince this system that it really only had 1 IDE drive.  I actually had to enable the RAID BIOS, install the system (no complaints about RAID drives), then disable the RAID BIOS.  It appears that even though the BIOS is disabled, the serverworks driver is finding some residual RAID info somewhere.  I finally had to format the drive and null out all the config info in the RAID BIOS.  At least it works now.
Comment 19 Jiri Srain 2005-11-11 09:21:46 UTC
Steffen, any idea how to proceed with this bug?
Comment 20 Murlin Wenzel 2005-11-11 15:40:26 UTC
I just got this same behavior to happen with SLES9 SP3 Beta4 on a system (x225) that doesn't even have any RAID support in the BIOS.  In the previous cases, there was a way to clear all BIOS RAID info, and then the systems would function OK.  This particular box has no BIOS support, so there is nothing I can do.  I'm still trying to figure out where the 'BIOS' RAID info is coming from.  This particular box is running and available for remote access if anyone has any other ideas; you can just contact me for the access info.  The x225 is ICH4 based.
Comment 21 Steffen Winterfeldt 2005-11-16 13:13:59 UTC
I still have no real idea what this report is about.

If you think that yast sees raid devices while there are none, run
'hwinfo --block --log=xxx' and attach the log.
Comment 22 Murlin Wenzel 2005-11-16 18:20:30 UTC
The problem is that during the beginning of installation, at the hardware probe, YaST puts up an error message/warning about detecting a disk, /dev/hda, that is reported as part of an array.  This particular system (x225) doesn't even support IDE RAID, so I have no idea where the message is coming from.
Comment 23 Murlin Wenzel 2005-11-16 18:21:44 UTC
Created attachment 57550 [details]
output from hwinfo --block

This is the hwinfo output from the x225.
Comment 24 Steffen Winterfeldt 2005-11-17 11:06:41 UTC
That's a SLES9 system, not 10.0 as the 'product' field says.

'raiddetect -s' (SLES9) and 'dmraid -rc' (10.0), respectively, report it as softraid.
Comment 25 Murlin Wenzel 2005-11-17 14:52:52 UTC
This was first discovered on SL 10.0, but I can duplicate the same behavior on SLES9 SP3.  It's apparently in code common to both OS versions.  We're just trying to figure out why a single device is getting detected as a softraid.
Comment 26 Carl-Daniel Hailfinger 2005-11-21 18:53:28 UTC
The disk in question was probably part of a BIOS raid before it was plugged into the box it is in now.
Comment 27 Murlin Wenzel 2005-11-21 18:59:37 UTC
It's the same drive that has been in the system since testing SLES9 GA.  This never showed up until SP3.  Even after installing and/or formatting, the system will still come up with the same warning about being part of a BIOS array when you try to re-install.
Comment 28 Murlin Wenzel 2006-01-13 15:56:53 UTC
I've done some digging on this and know at least part of what is going on.  For blades in particular, the IDE RAID config is a BIOS setup option that can only be enabled/disabled.  Once it's enabled, you can create arrays.  If you have previously defined an array and then disable the RAID option, you get no access to the RAID BIOS, but SLES9 SP3, SL 10.0, and SLES10 p4 will still find some residual info somewhere indicating that the devices are in a RAID configuration.  Either the class/sub-class in the PCI header is not getting set correctly between raid/non-raid, or the CSB6 driver/kernel code is looking at something else to get the RAID info.  If you enable the RAID option, clear all config info, then disable the RAID option, the errors go away.

I'm still trying to figure out why this happens on the system with no raid option at all.
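[Editorial note, an assumption rather than anything from the report: one way to probe the "residual info on the drive" theory is that many BIOS soft-RAID formats keep their metadata in the last sectors of the disk, where it survives repartitioning and reformatting. A rough sketch that checks whether the tail of a device or image is non-zero; the helper name and the 1024-byte window are assumptions:]

```shell
# tail_has_metadata: hypothetical check for leftover bytes at the end of
# a disk (or disk image), where BIOS soft-RAID metadata is commonly kept.
# Succeeds if any non-zero byte is found in the last 1024 bytes.
tail_has_metadata() {
  dev=$1
  size=$(wc -c < "$dev")   # for a real block device use: blockdev --getsize64
  nonzero=$(dd if="$dev" bs=1 skip=$((size - 1024)) count=1024 2>/dev/null |
            tr -d '\0' | wc -c)
  [ "$nonzero" -gt 0 ]
}

# dmraid's erase option (`dmraid -rE <device>`, destructive) may also be
# able to clear such leftover metadata; check the dmraid man page first.
```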
Comment 29 Marcus Meissner 2007-04-26 12:05:22 UTC
any news? I would like to close if not.
Comment 30 Thomas Fehr 2007-04-26 13:36:33 UTC
We have real dmraid support since 10.2, so this should not happen any more.
The warning mentioned simply does not exist any more.

Nevertheless, there may still be boot problems if RAID is activated in the
BIOS but not used by Linux, because the BIOS and Linux disk numbering do not match.
Comment 31 Marcus Meissner 2007-04-26 13:39:11 UTC
Let's close it then.