Bugzilla – Bug 118833
Install to Software Raid fails
Last modified: 2006-06-20 03:13:12 UTC
I used the disk editor during install to create 3 mirrored partitions:

  sda1+sdb1 = md0 = 128M, mount point /boot, FS ext2
  sda2+sdb2 = md1 = 256M, mount point <swap>
  sda3+sdb3 = md2 = 8G,   mount point /,     FS ReiserFS

After creating these and installing packages from CD1, the install reboots. While booting from the hard drive for the first time it says:

  md: raid1 personality registered as nr 3
  Waiting for device /dev/md1 to appear: ok
  no record for 'md1' in database
  Attempting manual resume
  Kernel panic - not syncing: IO error reading memory image
Please attach /var/log/YaST2 and the hwinfo output. See http://www.opensuse.org/Bug_Reporting_FAQ#YaST
A kernel panic suggests a kernel bug.
Full boot-up messages would be required. But this looks slightly more like a problem with swsuspend; trying to resume from a software raid1 doesn't appear to be working?
Created attachment 50949 [details] content of var/log subdirectory
Created attachment 50950 [details] output of hwsetup
(In reply to comment #1)
> Please attach /var/log/YaST2 and hwinfo.
> See http://www.opensuse.org/Bug_Reporting_FAQ#YaST
>
> Kernel panic suggests kernel bug

Attached. These come from booting a live CD and mounting /dev/sda3.
Hello!! Hello!! Anyone home?
My best guess as to the problem here is that the partitions haven't been marked as type FD - Linux raid autostart. If they have, then it would seem to be a YaST bug, probably not running 'raidautorun' in the right place in the 'init' script in the initrd.

What is happening is that the init script asks the kernel to check whether there is a 'resume' image on /dev/md1 to resume from; the kernel tries to read from /dev/md1, gets a read error, and so it panics. This is probably because /dev/md1 hasn't been assembled properly yet.

Arguably the kernel should not panic at this point but should just fail to resume. However, that wouldn't fix the root problem, which is that md1 isn't being assembled at this point.

So: please check and report whether the partitions that the mirrored pairs are made from are set to type FD or not.

Thanks,
NeilBrown
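For reference, one way to check the partition type from a rescue system is `sfdisk -l` (util-linux); the "fd" Id column is what md autodetection keys on. A minimal sketch, run here against a sample listing (the listing itself is fabricated for illustration; on the real machine you would pipe sfdisk's output directly):

```shell
#!/bin/sh
# Sample 'sfdisk -l /dev/sda' output -- fabricated for illustration;
# on the real machine you would run sfdisk itself.
listing='/dev/sda1         1        16    128488+  fd  Linux raid autodetect
/dev/sda2        17        48    257040   fd  Linux raid autodetect
/dev/sda3        49      1024   7841520   fd  Linux raid autodetect'

# Count partitions whose Id is fd; all three mirror halves must match,
# otherwise the kernel's raid autodetection will skip them at boot.
fd_count=$(echo "$listing" | grep -c ' fd  Linux raid autodetect')
echo "$fd_count"   # prints: 3
```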
From the logs I can see that all partitions are correctly created with partition id 0xFD. But is it at all possible to resume from a raid device? /boot/grub/menu.lst contains the kernel parameter "resume=/dev/md1"; please try with either "resume=/dev/sda2" or with no resume entry at all.
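For clarity, the suggested workaround in /boot/grub/menu.lst would look roughly like this (the title, kernel image path, and splash option are invented for illustration; root=/dev/md2 matches the reporter's layout, and only the resume= parameter is the point here):

```
title Linux
    kernel (hd0,0)/vmlinuz root=/dev/md2 resume=/dev/sda2 splash=silent
    initrd (hd0,0)/initrd
```

Pointing resume= at one underlying mirror half (/dev/sda2) sidesteps the need for the array to be assembled before the resume attempt.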
Resume from md 'should' work (though I've never tried it). From the logs I can see that everything you would expect to be needed has happened (raid1.ko has been loaded, /dev/md1 has been created), except for the actual assembly of the raid device. There might be some slightly interesting issues if the array were dirty and needed a resync, but that shouldn't stop it from working.

Presumably the 'init' script explicitly loads raid1.ko. Does it run "raidautorun" as well? Could it? Should it?
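The ordering question is the crux: resume can only work if the array has been assembled first. A toy sketch of the required initrd sequence (the function names are invented stand-ins, not SUSE's actual mkinitrd code; each just logs its name so the order is visible):

```shell
#!/bin/sh
# Invented stand-ins for the initrd init-script steps; each appends
# its name to $steps so the required order can be inspected.
steps=""
load_modules()    { steps="$steps insmod-raid1"; }
assemble_arrays() { steps="$steps raidautorun"; }   # e.g. raidautorun /dev/md1
attempt_resume()  { steps="$steps resume"; }        # reads from $resumedev

load_modules
assemble_arrays   # must happen before attempt_resume, or /dev/md1 is unreadable
attempt_resume
echo "$steps"
```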
If one of the modules raid0, raid1, raid5, linear, or multipath is loaded in the initrd, raidautorun should also be put into the initrd and executed after module loading. This seems to work in general; otherwise we could never use any md device as the root partition, which is quite common, and so far I have not heard of any problems with this.
I had a look into the mysteries of 'mkinitrd', and it only causes raidautorun to be run *after* resume has been attempted. So resuming from a raid array currently cannot work.

If we wanted to make it work with current technology, I would probably do something like this in the init script created by mkinitrd:

  case $resumedev in
   /dev/md* )
      realdev=$( mdadm -Es -cpartitions | while read a b c d
          do
            if [ "x$a" = "xARRAY" -a "x$b" = "x$resumedev" -a "x$c" = "xlevel=raid1" ]
            then
              IFS=,= read a b c
              echo $b
            fi
          done )
      if [ -n "$realdev" ]; then resumedev=$realdev; else resume_mode= ; fi
  esac

That should probably go at the start of the udev_discover_resume function.

If not this, then at least it should check whether resumedev looks like /dev/md*, and ignore it if it does.

Who is responsible for making changes to "mkinitrd"?

NeilBrown
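As a sanity check, the translation logic above can be exercised against sample 'mdadm -Es' output (the ARRAY lines below are fabricated for illustration; a real initrd would read them from `mdadm -Es -cpartitions`). It resolves /dev/md1 to the first underlying mirror half:

```shell
#!/bin/sh
# Fabricated 'mdadm -Es' output for two raid1 arrays.
resumedev=/dev/md1
mdadm_out='ARRAY /dev/md0 level=raid1 num-devices=2 UUID=aaaa
   devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=bbbb
   devices=/dev/sda2,/dev/sdb2'

# On the ARRAY line matching $resumedev, read the following
# "devices=..." line with IFS set to ",=" so the first device
# lands in $b.
realdev=$(echo "$mdadm_out" | while read a b c d
do
    if [ "x$a" = "xARRAY" -a "x$b" = "x$resumedev" -a "x$c" = "xlevel=raid1" ]
    then
        IFS=,= read a b c
        echo $b
    fi
done)
echo "$realdev"   # prints: /dev/sda2
```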
Hannes Reinecke is doing most of the changes in mkinitrd. I am pretty sure he will accept patches. I have added him to CC.
Hmm. Why do we have to check for 'raid1' partitions only? Shouldn't we consider all raid levels?
Where do we check only for raid1? The raid check I see in mkinitrd contains:

  if has_any_module raid0 raid1 raid5 linear multipath; then

Raid1 is the only raid level that can be used for the files in /boot, but that is nothing that should bother mkinitrd.
Hannes: This bug has been sitting around unloved for some months. Do you know whether the important issues have been resolved? Can we close it?
I doubt we'll fix this for 10.0. The change is a bit intrusive and would have to be tested thoroughly. And as no-one complained about this during beta testing, I don't think it's important enough. So we'll mark this as not supported.
Ok, I'll mark it as WONTFIX for now. If it comes back to haunt us, so be it.