Bug 588840

Summary: "mkinitrd" cannot create "initrd", when root is on partitioned MD device.
Product: [openSUSE] openSUSE 11.3 Reporter: Dennis Olsson <DOlsson>
Component: Basesystem Assignee: Michal Marek <mmarek>
Status: VERIFIED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None CC: Olaf.Zander
Version: RC 2   
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE 11.3   
Attachments: Illustration of problem with root partition on MD RAID device.
Requested log from "mkinitrd".

Description Dennis Olsson 2010-03-16 21:17:53 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.18) Gecko/2010020400 SUSE/3.0.18-0.1.2 Firefox/3.0.18

During an initial installation from DVD with the root partition on a partitioned MD device, "mkinitrd" fails while creating the "initrd" with the following error message, after which the boot loader cannot be installed:

Device /dev/md127p1 not handled

The message is issued from line #81 of "/lib/mkinitrd/setup/72-block.sh".

Reproducible: Always

Steps to Reproduce:
1. Boot into rescue system
2. Create a partitioned MD device, e.g. with "mdadm --create /dev/md/raid1 --name="raid1" --homehost="MyPC" --level="raid1" --metadata="1.0" --auto=part4 -v -n 2 /dev/sda /dev/sdb".
3. Wait until the MD device has synced (verify using "cat /proc/mdstat").
4. When the MD device has synced, stop it using "mdadm --stop /dev/md/raid1".
5. Reboot system
6. Install system using the "Installation" menu from the DVD.
7. In the partitioning step, select custom partitioning, select "md127" under "RAID", and create a root file system on "md127p1".
8. Accept partitioning, and continue with the installation as normal.
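
Regarding step 3, the check can be scripted instead of re-running "cat" by hand. This is only a rough sketch: it assumes the usual "/proc/mdstat" progress line containing "resync = ..." or "recovery = ...", and the function name "md_sync_in_progress" is made up for this example.

```shell
# Hypothetical helper: succeed while the given mdstat-format file still
# reports a resync or recovery (including a delayed or pending one).
# Defaults to /proc/mdstat when no file is given.
md_sync_in_progress() {
    grep -Eq '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

# Possible use in the rescue system (not executed here):
#   while md_sync_in_progress; do sleep 10; done
```

The file argument exists only so that the function can be tried against a saved copy of "/proc/mdstat".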
Actual Results:  
When the installation process reaches the point where the boot loader is being written, an error pop-up window appears with the message "Device /dev/md127p1 not handled", issued from "/lib/mkinitrd/setup/72-block.sh".

Expected Results:  
The installation process should finish writing the boot loader without any error message.
Comment 1 Dennis Olsson 2010-03-16 21:21:48 UTC
Errata:
The error message in "/lib/mkinitrd/setup/72-block.sh" is to be found in line #122 (line #81 was in openSUSE 11.2).
Comment 2 Michal Marek 2010-03-17 09:54:04 UTC
Is this different from bug 586837, i.e. are you not using LVM in this case? Can you please install mdadm from https://build.opensuse.org/package/show?package=mdadm&project=Base%3ASystem (or just copy the mkinitrd scripts to /lib/mkinitrd/scripts/{setup,boot}-md.sh) and run
bash -vx /sbin/mkinitrd &>log
and attach the log file?
Comment 3 Dennis Olsson 2010-03-18 19:54:01 UTC
Yes, it is a different bug -- during my testing for bug 586837, I forgot to define an LVM volume group and thus installed the root file system directly on the partition.

I will run the requested tests shortly -- I am a bit tied up at the moment... :-)
Comment 4 Olaf Zander 2010-04-12 23:49:41 UTC
Is this similar to bug 541684? 
(https://bugzilla.novell.com/show_bug.cgi?id=541684)
Comment 5 Dennis Olsson 2010-04-14 13:42:47 UTC
Hmm, I am not quite sure about that.

I know of the problem described in bug 541684 (see comment #9), but I cannot say whether it is related to the bug described here.
In the cases where I encounter bug 541684, I at least do not get the error message "Device /dev/md127p1 not handled" (or something similar), as I do with this bug.
Comment 6 Dennis Olsson 2010-04-14 13:43:29 UTC
Sorry -- forgot to remove "NEEDINFO" in previous comment.
Comment 7 Michal Marek 2010-04-15 10:32:53 UTC
Could you please retest with Milestone 5? setup-md.sh should handle /dev/md*p* there.
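
To make the "/dev/md*p*" case concrete, here is a minimal sketch of mapping a partition node such as "/dev/md127p1" back to its array. It is not the code from setup-md.sh or 72-block.sh; the function name "md_parent" is hypothetical, and the glob patterns are looser than a strict check of the name would be.

```shell
# Hypothetical helper, not taken from mkinitrd: print the MD array that a
# device node belongs to, or fail if the name is not an MD device name.
md_parent() {
    case $1 in
        /dev/md[0-9]*p[0-9]*)
            # Partition of a partitionable array: strip the trailing "pN".
            printf '%s\n' "${1%p[0-9]*}" ;;
        /dev/md[0-9]*)
            # Already a whole array.
            printf '%s\n' "$1" ;;
        *)
            return 1 ;;
    esac
}

md_parent /dev/md127p1    # prints /dev/md127
```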
Comment 8 Dennis Olsson 2010-05-16 18:29:49 UTC
Have tested with openSUSE 11.3 Milestone 6.

Same error (see attached screenshot).
Comment 9 Dennis Olsson 2010-05-16 18:31:03 UTC
Created attachment 362508
Illustration of problem with root partition on MD RAID device.
Comment 10 Dennis Olsson 2010-07-06 21:59:09 UTC
Have tested with openSUSE 11.3 RC2.

Seems to work now without any issues.
Comment 11 Dennis Olsson 2010-07-06 23:31:53 UTC
Well, seems I was slightly too fast to claim that it had been fixed.

The problem is that I have created the MD RAID device as "/dev/md/raid1" (resp. "/dev/md/MyPC:raid1"), but "mkinitrd" is *only* creating the device "/dev/md127" in the final installed system!

This naming discrepancy in "mkinitrd" causes problems later on, because "/dev/md/raid1" (resp. "/dev/md/MyPC:raid1") does not exist in the installed system.

The problem seems to be with the creation of the "/dev/md" devices:

# ll /dev/md
total 0
lrwxrwxrwx 1 root root  8 Jul  7 01:25 MyPC:raid1 -> ../md127
lrwxrwxrwx 1 root root 10 Jul  7 01:25 MyPC:raid1p1 -> ../md127p1
lrwxrwxrwx 1 root root 10 Jul  7 01:25 MyPC:raid1p2 -> ../md127p2
lrwxrwxrwx 1 root root 10 Jul  7 01:25 MyPC:raid1p3 -> ../md127p3

# ll /dev/md127*
brw-rw---- 1 root disk   9, 127 Jul  7 01:25 /dev/md127
brw-rw---- 1 root disk 259,   0 Jul  7 01:25 /dev/md127p1
brw-rw---- 1 root disk 259,   1 Jul  7 01:25 /dev/md127p2
brw-rw---- 1 root disk 259,   2 Jul  7 01:25 /dev/md127p3

Apparently "mkinitrd" follows the "/dev/md/MyPC:raid1*" symlinks instead of using the given name, and thus sets up only "/dev/md127*" rather than both "/dev/md/MyPC:raid1*" and "/dev/md127*"!!??
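
The effect of following the symlink can be reproduced without touching the real "/dev". The sketch below rebuilds the link from the listing in a temporary directory and canonicalises it with "readlink -f"; whether "mkinitrd" resolves names in exactly this way is an assumption.

```shell
# Build a throwaway directory that mimics the listing above, so nothing
# under the real /dev is touched.
tmp=$(mktemp -d)
mkdir "$tmp/md"
: > "$tmp/md127"                      # plain file standing in for the block node
ln -s ../md127 "$tmp/md/MyPC:raid1"   # same relative link as in /dev/md

# Canonicalising the named path yields the kernel name; the name
# "MyPC:raid1" no longer appears in the result.
resolved=$(readlink -f "$tmp/md/MyPC:raid1")
basename "$resolved"                  # prints md127

rm -rf "$tmp"
```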
Comment 12 Michal Marek 2010-07-07 09:40:05 UTC
Can you attach the mkinitrd log (http://en.opensuse.org/Bugs:mkinitrd#Logs)?
Comment 13 Dennis Olsson 2010-07-07 16:23:36 UTC
Created attachment 374317
Requested log from "mkinitrd".

$ tar -tjvf mkinitrd-log.tbz
-r--r--r-- root/root   1952229 2010-07-07 14:58 0duringInstall-mkinitrd.log
-r--r--r-- root/root       433 2010-07-07 14:59 0duringInstall-partitions
-r--r--r-- root/root   2196727 2010-07-07 16:36 1afterInstall-mkinitrd.log
-r--r--r-- root/root       216 2010-07-07 16:34 1afterInstall-partitions

The "0duringInstall" is a "bash -xv mkinitrd" run taken during the installation of the RPM packages.

The "1afterInstall" is a "bash -xv mkinitrd" run taken after the installation has finished.
Comment 14 Dennis Olsson 2010-07-07 16:28:56 UTC
The MD device was created like this:

# mdadm --create /dev/md/raid1 --name="raid1" --homehost="MyPC" --level="raid1" --metadata="1.0" --auto=part3 -v -n 2 /dev/sda /dev/sdb

# mdadm --examine --scan
ARRAY /dev/md/raid1 metadata=1.0 UUID=ade8dad3:cfd1ea1e:5cd4ba89:54ef4b14 name=MyPC:raid1

# mdadm --stop /dev/md/raid1
# mdadm -A --scan
mdadm: /dev/md/MyPC:raid1 has been started with 2 drives.
# ll /dev/md
total 0
lrwxrwxrwx 1 root root 8 Jul  7 14:29 MyPC:raid1 -> ../md127
# 

After rebooting into Installation, we have:
# ll /dev/md
total 0
lrwxrwxrwx 1 root root 8 2010-07-07 10:34 raid1 -> ../md127

In YaST *only* "/dev/md127" is being displayed -- But no "/dev/md/raid1"!!

It seems the problem with the MD device name comes from the fact that "/proc/partitions" is used to find the block devices in use. Because that file lists "/dev/md127*", the named path "/dev/md/raid1*" is ignored, except when looking for "container"s, where the device name is fetched correctly (mdadm --examine --brief --scan).
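
Going from the kernel name back to the named path is possible by scanning the symlinks. The following is only a sketch of that reverse lookup, not something "mkinitrd" is known to do; "md_aliases" is a hypothetical name.

```shell
# Hypothetical helper: list the entries of a symlink directory that resolve
# to the given node.
#   $1 = directory holding the symlinks (e.g. /dev/md)
#   $2 = node to look up (e.g. /dev/md127)
md_aliases() {
    target=$(readlink -f "$2") || return 1
    for link in "$1"/*; do
        [ -L "$link" ] || continue    # skip anything that is not a symlink
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

On the system described here, "md_aliases /dev/md /dev/md127" would be the call to try.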

After the first reboot during installation, I booted into "rescue" to see what had been written into "/etc/mdadm.conf":

# cat /mnt/etc/mdadm.conf
DEVICE containers partitions
ARRAY /dev/md/raid1 UUID=d3dae8ad:1eead1cf:89bad45c:144bef54

Interesting, especially the UUID, because running:

# mdadm --examine --scan
ARRAY /dev/md/raid1 metadata=1.0 UUID=ade8dad3:cfd1ea1e:5cd4ba89:54ef4b14 name=MyPC:raid1

gives a different (correct) result!
Notice the UUID -- for some strange reason, the UUID entered into the "/etc/mdadm.conf" file seems to be a newly generated UUID that has nothing to do with the UUID found on the MD device!  Where did that come from???
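
One thing that can be checked from the two values above: taken group by group, the UUID in "/etc/mdadm.conf" is the on-disk UUID with the four bytes of each 32-bit group in reverse order (ade8dad3 becomes d3dae8ad, and so on). That looks like a byte-order difference rather than a new UUID, but what causes it is not established here. The sketch below only verifies the relationship; "swap_uuid_words" is a made-up helper name.

```shell
# Reverse the byte order inside each colon-separated 32-bit group of an
# mdadm-style UUID (aabbccdd -> ddccbbaa).
swap_uuid_words() {
    printf '%s\n' "$1" |
        sed -E 's/([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})/\4\3\2\1/g'
}

on_disk=ade8dad3:cfd1ea1e:5cd4ba89:54ef4b14   # from "mdadm --examine --scan"
in_conf=d3dae8ad:1eead1cf:89bad45c:144bef54   # from /mnt/etc/mdadm.conf

if [ "$(swap_uuid_words "$on_disk")" = "$in_conf" ]; then
    echo "mdadm.conf UUID = on-disk UUID with each group byte-swapped"
fi
```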

On the other hand, the "/etc/mdadm.conf" file in "initrd" has the contents:

ARRAY /dev/md127 metadata=1.00 name=MyPC:raid1 UUID=ade8dad3:cfd1ea1e:5cd4ba89:54ef4b14

Notice that the UUID here is correct, but the MD device path has been shortened to "/dev/md127" instead of "/dev/md/raid1" (manually named) or "/dev/md/MyPC:raid1" (automagically named by "mdadm -A --scan").


I rebooted the system to allow the installation to finish and ended up with both "/dev/md/raid1*" and "/dev/md127*" existing, *but* after rebooting the system once more, _only_ "/dev/md127*" is created!!


During shutdown the error message:

mdadm: error opening /dev/md/raid1: No such file or directory

is printed just after the last "Sending all processes the KILL signal...".


Changing the "/etc/mdadm.conf" file in "initrd" to:

DEVICE containers partitions
ARRAY /dev/md/raid1 metadata=1.0 name=MyPC:raid1 UUID=ade8dad3:cfd1ea1e:5cd4ba89:54ef4b14

results in both "/dev/md/raid1*" and "/dev/md127*" nodes being created once the system is up, but during boot it also produces the following error message just after "Creating device nodes with udev":

mdadm: /dev/md127 not identified in config file.

but it does not result in any error message from "mdadm" later on during the shutdown process.

Assuming that this error message is due to the fact that the startup scripts in "initrd" are referring to "/dev/md127*" and not to "/dev/md/raid1*", I changed all occurrences of "/dev/md127*" to "/dev/md/raid1*" in the "initrd" and rebooted with this new "initrd".

Unfortunately, the result is that *no* MD devices at all were created in "/dev"!  In fact, the system started up using "/dev/sda*" instead!!   Sigh.  ;-)


BTW, there is a minor typo in the shutdown message of the MD devices:

/dev/md127p3 umounted		(<== BUG - should be:  unmounted ("n" missing))
Shutting down MD Raid

:-D
Comment 15 Dennis Olsson 2013-05-13 17:57:38 UTC
Under openSUSE 12.3, the creation of named MD RAID devices with partitions works without any issues, thus closing this bug.