Bug 429121 - soft RAID 1 with XFS for / cannot be booted by GRUB
Summary: soft RAID 1 with XFS for / cannot be booted by GRUB
Status: RESOLVED INVALID
Alias: None
Product: openSUSE 11.0
Classification: openSUSE
Component: Basesystem
Version: Final
Hardware: i586 Other
Importance: P5 - None, Major
Target Milestone: ---
Assignee: Neil Brown
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-23 11:47 UTC by Forgotten User Drfk9mafMw
Modified: 2008-11-21 02:42 UTC
1 user

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
first boot after installation (182.18 KB, image/png)
2008-10-09 16:13 UTC, Forgotten User Drfk9mafMw
Details
next shot (188.33 KB, image/jpeg)
2008-10-09 16:14 UTC, Forgotten User Drfk9mafMw
Details
final shot first boot trial (196.45 KB, image/jpeg)
2008-10-09 16:14 UTC, Forgotten User Drfk9mafMw
Details
second boot trial, first mention of md (249.35 KB, image/jpeg)
2008-10-09 16:16 UTC, Forgotten User Drfk9mafMw
Details
next shot, raid not clean error (261.28 KB, image/jpeg)
2008-10-09 16:17 UTC, Forgotten User Drfk9mafMw
Details
next shot (268.24 KB, image/jpeg)
2008-10-09 16:18 UTC, Forgotten User Drfk9mafMw
Details
final shot, after some seconds of no action (273.18 KB, image/jpeg)
2008-10-09 16:20 UTC, Forgotten User Drfk9mafMw
Details

Description Forgotten User Drfk9mafMw 2008-09-23 11:47:19 UTC
It appears to be impossible to use XFS on an IDE soft RAID 1 for the root filesystem. Partitioning, setup of /dev/md0, and installation all work fine, and the system is fully usable after the installation.

Yet, after the first real reboot the system is no longer bootable. The error message is something like "/dev/md0 cannot be assembled" or "/dev/md0 is invalid".

With identical partitioning and GRUB setup but ext3 for /, everything works as expected, so this very much looks like a GRUB issue?!

We have an SLES 10 system running fine with soft RAID 1 and XFS for /, so this is not a general limitation -- it should be possible to do the same with openSUSE, too!
Comment 1 Forgotten User Drfk9mafMw 2008-09-23 11:49:04 UTC
The only difference between the SLES 10 setup and the openSUSE 11.0 setup I am playing with is that the server uses SATA drives, as opposed to the IDE drives in the openSUSE desktop box...
Comment 2 Neil Brown 2008-10-09 03:20:51 UTC
If it is possible to get the exact error message, that would be very helpful.

Maybe with a digital camera?

Comment 3 Forgotten User Drfk9mafMw 2008-10-09 16:13:20 UTC
Created attachment 244671 [details]
first boot after installation
Comment 4 Forgotten User Drfk9mafMw 2008-10-09 16:14:23 UTC
Created attachment 244673 [details]
next shot
Comment 5 Forgotten User Drfk9mafMw 2008-10-09 16:14:58 UTC
Created attachment 244674 [details]
final shot first boot trial
Comment 6 Forgotten User Drfk9mafMw 2008-10-09 16:16:35 UTC
Created attachment 244676 [details]
second boot trial, first mention of md
Comment 7 Forgotten User Drfk9mafMw 2008-10-09 16:17:19 UTC
Created attachment 244678 [details]
next shot, raid not clean error
Comment 8 Forgotten User Drfk9mafMw 2008-10-09 16:18:00 UTC
Created attachment 244679 [details]
next shot
Comment 9 Forgotten User Drfk9mafMw 2008-10-09 16:20:00 UTC
Created attachment 244680 [details]
final shot, after some seconds of no action

Sorry for the bad quality of the shots, but the messages scroll by too quickly.

Also, with 11.1 Beta2 I can't install at all; the installation hangs while installing the grub RPM package... I will file a separate bug report for that.
Comment 10 Neil Brown 2008-11-07 05:08:04 UTC
Sorry for the long delay in replying.

This looks like a bug in mdadm which I thought had been fixed.
mdadm versions before about April 2008 could create arrays badly,
so that the 'bitmap' intersected with the data or metadata.
The kernel would detect this and report exactly the error that you
are seeing.
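The overlap condition described above can be pictured with a little sector arithmetic. This is a toy sketch only: the layout and all numbers below are invented for illustration, and real offsets would have to be read from `mdadm --examine` output on the member partitions (with 0.90 metadata, as common in 2008, the superblock lives at the end of the device rather than the start).

```shell
#!/bin/sh
# Toy illustration of the bitmap/data overlap check. All values are
# hypothetical; on a real array they come from `mdadm --examine`.
# Offsets are in 512-byte sectors from the start of the member device.
bitmap_offset=16      # hypothetical: bitmap placed just after the superblock
bitmap_sectors=2056   # hypothetical bitmap size
data_offset=2048      # hypothetical: where the array data begins

bitmap_end=$((bitmap_offset + bitmap_sectors))
if [ "$bitmap_end" -gt "$data_offset" ]; then
    # This is the misplacement the kernel detects and refuses to assemble.
    echo "bitmap (ends at sector $bitmap_end) intersects data (starts at $data_offset)"
else
    echo "layout ok"
fi
```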

If you created the array with the 11.0 install disk, then I am surprised,
as I thought 11.0 was released after the bug was fixed. However, I could
be wrong.

Looking back over your description, it could be that the install disk
had an older kernel which did not detect the overlap, but that the kernel
which was installed did.
In that case you might be able to:

  boot with the install disk and select the 'rescue' option.
  get a shell
  assemble the array if it isn't already assembled
  remove the bitmap with "mdadm --grow --bitmap=none /dev/md0"
  reboot

That should then boot successfully.
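The rescue steps above can be sketched as a small script. This is a sketch only: it assumes the install disk's 'rescue' shell, and the member devices /dev/sda1 and /dev/sdb1 are examples, not taken from this report. DRY_RUN defaults to 1, so running it as-is only prints the commands; set DRY_RUN=0 on the real machine.

```shell
#!/bin/sh
# Sketch of the rescue procedure, run from the install disk's 'rescue'
# shell. With DRY_RUN=1 (the default) the commands are only printed;
# /dev/sda1 and /dev/sdb1 are example member devices.
set -e
DRY_RUN=${DRY_RUN:-1}
MD=/dev/md0

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Assemble the array if the rescue system has not already done so.
run mdadm --assemble "$MD" /dev/sda1 /dev/sdb1

# Remove the badly placed internal bitmap so the kernel accepts the array.
run mdadm --grow --bitmap=none "$MD"

# Reboot into the installed system.
run reboot
```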

Once booted with all up-to-date software, you can add the bitmap back
with 
   mdadm --grow --bitmap=internal /dev/md0

Please let me know how you go, or whether you have given up and done something
else with the machine :-)
Comment 11 Forgotten User Drfk9mafMw 2008-11-07 10:47:16 UTC
I have given up on this; the machine is in production use with 11.0, soft RAID 1 with ext3. This works.

However, it has a free partition which I use for testing the upcoming 11.1. Here, the issue is even worse (ever since Beta1, and at least until Beta3; I haven't tried Beta4 so far):

bug #433980

I can't tell whether there is an mdadm bug involved, because the installation on an XFS partition fails no matter where I put grub.

Installing on ext3 with a softRAID configuration works, though.
Comment 12 Neil Brown 2008-11-21 02:42:58 UTC
Based on the last comment, I think this problem isn't occurring
in current releases, or it is hidden by some other bug that makes it impossible
to proceed.

So I'm closing this bug now.  Please reopen if you have new information.