|
Bugzilla – Full Text Bug Listing |
| Summary: | grub not installed when selected target only root on md RAID1 logical | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 13.1 | Reporter: | Felix Miata <mrmazda> |
| Component: | Installation | Assignee: | Steffen Winterfeldt <snwint> |
| Status: | RESOLVED FIXED | QA Contact: | Jiri Srain <jsrain> |
| Severity: | Enhancement | ||
| Priority: | P4 - Low | CC: | duwe, forgotten_eDStDj8Y1e, jreidinger, moby |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | x86 | ||
| OS: | openSUSE 13.1 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
save_y2logs output
/var/log/messages from 11.1 on host big31
output of 'dd if=/dev/sda9 of=sda9bs3.bin bs=512 count=1'
M2 y2logs from comment 0 system pre-kexec first "boot"
11.4M6 y2logs
y2logs from 13.1 host big31 |
||
I opened YaST2 bootloader in 11.1 and found only "custom boot partition" selected. The custom partition selected is illegible, as the YaST2 layout is broken for >96 DPI screens - the field is too short to display enough information about what is selected. I saved after switching selection from custom to root, redid dd on sda9 and sdb9, found Grub in them, but still get error 13 trying to chainload to the 11.1 RAID1 partition. /var/log/boot.msg for 11.1 always shows "Failed features: boot.md; Skipped features: boot.cycle".

OK, I look in logs what happen. Grub error 13(Invalid or unsupported executable format) looks like you have corrupted kernel images. I study logs and you have interesting disc setup ;) Why is set boot_custom is due to missing ability to parse dm raid from grub.conf, but it is not fatal (only confuse gui). failing md raid I don't know why...what subpackage of kernel you have -base, normale or -extra? please attach messages, this should include information why boot.md fail. Also try run grub install and after it please post what contain mbr of partition where grub write stage1 code. Thanks

Created attachment 266878 [details]
/var/log/messages from 11.1 on host big31

(In reply to comment #4)
> I study logs and you have interesting disc setup ;)
> Why is set boot_custom is due to missing ability to parse dm raid from

But I have md RAID. Is md RAID missing too?

> grub.conf, but it is not fatal (only confuse gui).

Confused me. What means "only confuse gui"?

> failing md raid I don't know why...what subpackage of kernel you have -base,
> normale or -extra?

RPM query shows pae, pae-base and pae-extra of .27-9.1 installed.

> please attach messages, this should include information why
> boot.md fail.

Since my previous comments boot.md no longer fails on either 11.0 or 11.1.
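The sector dumps discussed in this exchange can be checked mechanically rather than by eye. The following is a minimal sketch, assuming a dump file produced with dd as in the attachments; the function name `sector_state` is hypothetical and is not part of Grub, YaST, or DFSee. It only reports whether the first 512 bytes are all NUL, which is a weaker test than confirming valid Grub stage1 code.

```shell
# Hypothetical helper: prints "empty" if the first 512 bytes of FILE
# are all NUL bytes, otherwise prints "data".
sector_state() {
    # head -c keeps the first 512 bytes, tr -d '\000' deletes NUL bytes,
    # wc -c counts what is left; the last tr strips padding some wc add.
    remaining=$(head -c 512 "$1" | tr -d '\000' | wc -c | tr -d ' ')
    if [ "$remaining" -eq 0 ]; then
        echo empty
    else
        echo data
    fi
}
```

For example, after `dd if=/dev/sda9 of=sda9bs3.bin bs=512 count=1`, running `sector_state sda9bs3.bin` would print `empty` if the partition boot record holds only nulls.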
Created attachment 266879 [details]
output of 'dd if=/dev/sda9 of=sda9bs3.bin bs=512 count=1'

(In reply to comment #4)
> Also try run grub install and after it please post what contain mbr of
> partition where grub write stage1 code.

I checked to find as of 30 Dec 21:17 /etc/grub.conf contains 3 lines:

  setup --stage2=/boot/grub/stage2 --forcelba (hd0,8) (hd0,8)
  setup --stage2=/boot/grub/stage2 --forcelba (hd1,8) (hd0,8)
  quit

I then ran 'grub-install'. I see the PBR now contains Grub code. The behavior is changed. When I try to use the 10.2 Grub on sda1 to select the following stanzas:

  title chainload to /dev/hda9 (1 line)
   chainloader (hd0,8)+1
  title chainload2 to /dev/hda9 (2 lines)
   root (hd0,8)
   chainloader +1
  title chainload3 to /dev/hda9 (noverify 2 line)
   rootnoverify (hd0,8)
   chainloader +1

All return me to a (non-gfx) grub menu from sda1. If I drop to Grubs command line and enter 'chainloader (hd0,9)+1' or 'root (hd0,8)' 'chainloader +1' manually, I get error 13.

I tried deleting the HPFS sda2 and creating IBM Boot Manager in its place. Selecting the 11.0 md1 or 11.1 md2 partitions returns a "Selected partition is not formatted" message. That message is the usual result of a missing Grub. DFSee confirms the absence of Grub on (hd0,8), so I again did 'dd if=/dev/sda9 of=sda9bs4.bin bs=512 count=1' after rebooting, and the result contains only nulls.

So, what I hypothesize is happening is when grub "writes" during setup to a partition that is part of md RAID, it goes to some buffer that gets cleared before reboot instead of actually being written to disk.

OK, then you maybe hit parted bug, which zeroed some partition. Do you ran before reboot yast2 bootloader or yast2 disk or directly use parted?
and answers confused gui means that it doesn't check boot from root and check boot custom md and dm is handled almost same (sorry for confuse I often change these two raid types :)

I just notice that init shows error for boot.md, but it does not show up in /var/log/messages.

(In reply to comment #7)
> OK, then you maybe hit parted bug, which zeroed some partition. Do you ran
> before reboot yast2 bootloader or yast2 disk or directly use parted?

I did not do:

  yast2 bootloader
  yast2 disk
  parted

So I did just now:

1.# grub-install
2.# yast bootloader (made minor edits to default stanza)
3.# dfsee (confirmed presence of Grub on (hd0,8))
4.reboot
5.get error 13 trying to chainload (hd0,8)
6.reboot 11.1 from sda1 stanza (without chainloading)
7.# dfsee (confirmed Grub missing from (hd0,8) again)
8.checked /etc/grub.conf to see it is now 2 lines again:
   setup --stage2=/boot/grub/stage2 --forcelba (hd1,8) (hd0,8)
   quit
9.restored 3 line grub.conf
10.edited grub.conf to do hd0,7 & hd1,7
11.# grub-install (while md1/hd0,7&hd1,7 unmounted)
12.# dfsee (confirmed presence of Grub on (hd0,7))
13.reboot
14.get 11.0's Grub menu by chainloading (hd0,7)
15.boot 11.0 successfully

So, it appears that, as long as an md device is mounted, Grub can't be successfully written to its partitions via native Grub setup command.

OK, so what break your configuration is step 2, because also minor change in yast bootloader call parted (it is bug, which is fixed and maintenance for it is in stack). parted bug is tracked in bug 467576 and bootloader unnecessary call of parted in bug 461613

*** This bug has been marked as a duplicate of bug 461613 ***

Bug 461613 does not show having ever been an 11.0 bug, but this bug is identical in both 11.0 and 11.1. I'll be surprised if the fix for that bug fixes this bug, because as I indicated in comment 8, even native grub doesn't work, and AFAIK it never calls parted.
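The numbered steps above suggest that Grub's setup only persisted when the md device was unmounted (step 11). A guard that could be run before grub-install might look like the sketch below. It assumes mounts are listed in /proc/mounts format with the device in the first column; the function name `is_mounted` is hypothetical, and the device name that actually appears in /proc/mounts may differ from /dev/mdN on some systems, so that part needs checking locally.

```shell
# Hypothetical guard: is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit 0) if DEVICE appears in the first column of MOUNTS_FILE,
# which defaults to /proc/mounts; fails (exit 1) otherwise.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}
```

A caller could then write `is_mounted /dev/md1 && echo "md1 is mounted, unmount before writing Grub"`. Whether unmounting is really sufficient is the reporter's hypothesis at this point in the thread, not an established fact.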
OK, reopen it....I overlook that it happen also on 11.0
torsten - do you have any idea why error 13 shown when try to chainload?

Josef, it seems rather obvious to me. Grub is not actually writing to the disk, as the PBR is always empty upon boot, though it seems non-empty after running Grub's setup.

If it's RAID1, this is probably a duplicate of bnc#462578. RAID0 is unsupported. You might get it to work if you find a 100% corresponding BIOS device (-> fake RAID), but bnc#462578 would still apply then.

OK, I mark that this bug depends on bug 462578 and after fix, we should try, if problems gone.

I have run into the same problem. I built a system with 10.3 (64 bit) and then attempted to upgrade to 11.1. Grub was built to boot from a floppy -- no other options in the menu. So, I started from scratch with 11.1. Dual 500GB SATA drives and dual 64bit AMD CPU. Upon finishing, grub thinks it is supposed to boot from a floppy! So I changed to only have the /home be RAID1. Both drives were partitioned the same (sda/b1 same size, with sda1 mounted as /boot, sda/b2 6.1 GB swap, sda/b3 mounted as /). 11.1 finished and wanted to boot from floppy. So I ran the install again to do repairs, and then it correctly set the MBR and grub and the boot worked (with /home mirrored with software RAID1).

I did some tests with openSUSE 11.2 and with the first Milestone of openSUSE 11.3. There is not problem with configuration of GRUB. GRUB is written correctly. The problems with overwritten GRUB stage1 by parted during set boot flag are solved also other problems with parted were solved. IMHO all symptoms are fixed.

I don't see an obvious way to test whether this is "fixed" in M2. There is no apparent option to install Grub to / on /dev/md3. For boot loader location I see only:

1-MBR (where I always use standard boot code)
2-"Enable Redundancy for MD Array"
3-Custom Boot Partition

Option 2's help only speaks of MBR redundancy. MBR in my case is not to be touched.
If I try put /dev/md3 in Custom, I get a popup telling me "Selected custom bootloader partition /dev/md3 is not available any more. Set default boot loader location?"

IOW, booting from / (on /dev/md3 in this case) is missing from the list, and not available from bootloader configuration.

If I click on "Boot Loader Installation Details", all I see is a list of my two SATA devices under "Disk Order".

When I exit "booting", I get a message in red telling me "The boot device is on software RAID1. Select another bootloader location, e.g. Master Boot Record" whether or not I have there clicked to enable boot from / partition.

I proceeded after "enabling" boot from /. Upon inspection after first/kexec "boot" the PBR of /dev/md3 seems to remain empty. Both 'dd if=/dev/md3 of=md3pbr.bin count=1 bs=512' and 'dd if=/dev/sda10 of=sda10pbr.bin count=1 bs=512' produce files full of nulls.

Created attachment 343689 [details]
M2 y2logs from comment 0 system pre-kexec first "boot"

(In reply to comment #18)
> I don't see an obvious way to test whether this is "fixed" in M2. There is no
> apparent option to install Grub to / on /dev/md3. For boot loader location I
> see only:
>
> 1-MBR (where I always use standard boot code)
> 2-"Enable Redundancy for MD Array"
> 3-Custom Boot Partition
>
> Option 2's help only speaks of MBR redundancy. MBR in my case is not to be
> touched. If I try put /dev/md3 in Custom, I get a popup telling me "Selected
> custom bootloader partition /dev/md3 is not available any more. Set default
> boot loader location?"
>
> IOW, booting from / (on /dev/md3 in this case) is missing from the list, and
> not available from bootloader configuration.
>
> If I click on "Boot Loader Installation Details", all I see is a list of my two
> SATA devices under "Disk Order".
>
> When I exit "booting", I get a message in red telling me "The boot device is on
> software RAID1. Select another bootloader location, e.g. Master Boot Record"
> whether or not I have there clicked to enable boot from / partition.
>
> I proceeded after "enabling" boot from /. Upon inspection after first/kexec
> "boot" the PBR of /dev/md3 seems to remain empty. Both 'dd if=/dev/md3
> of=md3pbr.bin count=1 bs=512' and 'dd if=/dev/sda10 of=sda10pbr.bin count=1
> bs=512' produce files full of nulls.

Ah, OK. now it is clear. Problem is quite clear now. Missing option is because we don't support writing boot code into software raid mirrored partition, because this code different in a few bytes which cause that raid think that partition is not properly mirrored and always report that mirror is broken, so we allow only write to MBR ( and also allow redundancy, so if you remove one disc it should still work). So you should do two things:

1) install to MBR and everything should work
2) do non-mirrored boot partition, from which you boot (so separate /boot on one disc )

Or am I misunderstand problem?

I think we understand each other. I will not install Grub to MBR unless temporarily to test something. I do have non-md partition with Grub to use to boot, but do not mount as /boot because it is master bootloader I maintain and do not want touched by installation scripts.

This bug is that yast bootloader shows option "Enable Redundancy for MD Array" that does not work because it requires Grub to do what as you say Grub1 cannot do without causing apparent RAID degradation. Help for yast bootloader installation says that redundancy is only for MBR Grub, which supports your assertion that Grub cannot be written to md partition without causing degradation. So I think best fix for this must wait for either Grub 2 support or modification of md RAID driver so some Grub can work on md partition without degrading any array.
Until then, "Custom Boot Partition" option for "Boot Loader Location" must be allowed flexibility for expert/manual management of a master boot loader that will not be mounted as /boot, and that installation scripts will not touch. This might mean that selecting none of the "Boot Loader Location" options must be allowed. I cannot tell now running yast bootloader (on 11.2) while booted whether it is or not without disrupting other work in progress. I can tell that the select button on the "Custom Boot Partition" input line is ignored when I click it with mouse. 11.2 is newest I have installed on either of my only two md RAID systems. I could temporarily put Factory/11.4M? on md2 on #2/backup box to test possible fixes, but probably not too soon.

OK, I agree. I think that y2-bl part is beside future grub2 support (which is handled via openfate I think), mainly allow more expert setting.

Created attachment 411140 [details]
11.4M6 y2logs

I just did an M6 install to the comment 0 multiboot system with target / on md2. I unchecked "Boot from Master Boot Record", checked "Enable Redundancy for MD Array", checked "Custom Boot Partition", and left blank the space below custom boot partition. Both my HDs were corrupted with Grub code by the installation, giving me no option to boot something else, or not to boot at all. I had to CAD to escape from proceeding into 11.4, which I was not prepared yet at that time to do, and had to boot a CD to restore my MBRs.

The only reason I don't uncheck Grub from package selection, and uncheck everything except custom boot partition in boot loader settings, is I cannot otherwise expect the installer to provide me with a Grub stanza or whatever cmdline is appropriate for a Grub stanza for my real boot loader, or for a complete valid menu.lst to use with the real boot loader's configfile stanza.
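Earlier in this thread it was stated that boot code written into the members of a mirrored partition differs in a few bytes, which makes the RAID layer consider the mirror out of sync. A minimal sketch for comparing the first sector of two dumps, one per mirror half, follows. The function name `first_sectors_match` is hypothetical, and the dumps are assumed to have been taken with dd as elsewhere in this report; a match or mismatch here says nothing definitive about how the md driver itself judges the array.

```shell
# Hypothetical helper: first_sectors_match FILE_A FILE_B
# Succeeds (exit 0) if the first 512 bytes of both files are identical.
first_sectors_match() {
    # od turns each sector into a hex text dump so the shell can compare
    # the two as ordinary strings; -v prevents od from collapsing repeats.
    a=$(head -c 512 "$1" | od -An -v -tx1)
    b=$(head -c 512 "$2" | od -An -v -tx1)
    [ "$a" = "$b" ]
}
```

With dumps named, say, sda9pbr.bin and sdb9pbr.bin (illustrative file names), `first_sectors_match sda9pbr.bin sdb9pbr.bin || echo "boot sectors differ"` would flag the case described.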
(In reply to comment #23)
> Created an attachment (id=411140) [details]
> 11.4M6 y2logs
>
> I just did an M6 install to the comment 0 multiboot system with target / on
> md2. I unchecked "Boot from Master Boot Record", checked "Enable Redundancy for
> MD Array", checked "Custom Boot Partition", and left blank the space below
> custom boot partition. Both my HDs were corrupted with Grub code by the
> installation, giving me no option to boot something else, or not to boot at
> all. I had to CAD to escape from proceeding into 11.4, which I was not prepared
> yet at that time to do, and had to boot a CD to restore my MBRs.
>
> The only reason I don't uncheck Grub from package selection, and uncheck
> everything except custom boot partition in boot loader settings, is I cannot
> otherwise expect the installer to provide me with a Grub stanza or whatever
> cmdline is appropriate for a Grub stanza for my real boot loader, or for a
> complete valid menu.lst to use with the real boot loader's configfile stanza.

I check logs. Of course you can get grub command line, just go into console (alt+F2 e.g) and install bootloader manually. It is possible and only way for lilo bootloader.

I think it works meanwhile.

(In reply to comment #26)
> I think it works meanwhile.

Possibly it's OK with Grub2. I think likely not exactly using Grub Legacy. Maybe it's good as it can be WRT Grub Legacy though?

I was tired when doing a host big31 13.1 installation from dvd iso late last night, so have only fuzzy recollection of the details. Except for the ending dd section of comment 18, this installation went pretty much the same if not exactly as described there. My best recollection is that nothing improved over my 11.4 installation last reported here, or the 12.2 installation I never mentioned here. As Josef mentioned it's still unsupported to install Grub Legacy to RAID1 /, so still 13.1 has no option available to install to the / filesystem.
I proceeded in spite of the red warning that the bootloader target selected was not acceptable, and at bootloader installation step got a failure message as with prior installations. In spite of having deselected installation to MBR, this is the content of /etc/grub.conf:

  setup --stage2=/boot/grub/stage2 --force-lba (hd0) (hd0)
  quit

On the 11.4 system on which I am now writing this, which test host big31 is a virtual clone of, /etc/grub.conf contains:

  setup --stage2=/boot/grub/stage2 --force-lba (hd0,8) (hd0,8)
  setup --stage2=/boot/grub/stage2 --force-lba (hd1,8) (hd1,8)
  quit

The difference may represent a negative change since I opened this bug. It seems "Enable Redundancy for MD Array" selection needs to be absent or disallowed unless MBR is selected first. That might contribute to a usable /etc/grub.conf being created.

The menu.lst stanzas as written could not boot:

  title openSUSE
   root (hd0)
   kernel /boot/vmlinuz root=/dev/...
   initrd /boot/initrd ...

Yet, at first boot I needed only to enter Grub Legacy's edit mode and s/root (hd0)/root (hd0,8)/ (1/2 of md1) to boot into the newly installed 13.1 system on md1, which had had 11.0 installed originally, and 12.2 after 11.4 had been installed to md2 (hd0,9&hd1,9). Prior to installation I booted 11.4 and ran mkfs.ext4 on the target md1, which means the disk sectors to which Grub had previously been installed and functional were not disturbed. Thus it seems this might actually be fixable in the sense that YaST could do the whole job in a fashion that manual intervention on first boot is unnecessary.

-> reopen as 13.1 until such time as I can configure a system without Grub having previously been written to the target mdX, solely to try again with Factory and confirm the rewritten in Ruby YaST hasn't changed anything materially.

Created attachment 586139 [details]
y2logs from 13.1 host big31
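The manual edit described in the comment above, s/root (hd0)/root (hd0,8)/, can be sketched as a filter over menu.lst text. This is an illustration only: the function name `fix_menu_root` is hypothetical, the partition index is supplied by the caller, and nothing here reflects what YaST itself does when it writes menu.lst.

```shell
# Hypothetical filter: fix_menu_root PART_INDEX < menu.lst
# Rewrites lines consisting solely of "root (hd0)" to "root (hd0,PART_INDEX)",
# preserving leading indentation. Kernel lines containing "root=" are left
# alone because the pattern is anchored to the start and end of the line.
fix_menu_root() {
    sed "s/^\([[:space:]]*\)root (hd0)[[:space:]]*\$/\1root (hd0,$1)/"
}
```

Running `fix_menu_root 8 < /boot/grub/menu.lst` would print the corrected text to standard output without modifying the file, which leaves the choice of whether to overwrite menu.lst to the administrator.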
raid1 installs have been fixed in the sles11 sp2 or sp3 branch. Unfortunately the changes never made it into the openSUSE branch. :-( Now we have all the focus on grub2. And I can't say when I will have time to look into this.

well, old opensuse is out of support and for newer we support only grub2. So closing. |
Created attachment 262841 [details]
save_y2logs output

I fully partitioned 2 SATA HD on ICH7 prior to booting 11.0 and 11.1 installers using Grub with previously downloaded linuxes and initrds. I used Knoppix to install openSUSE 10.2 Grub and ChristmasTux message on ext2 sda1, and configured menu.lst to be able to boot a default kernel from either (hdX,7) or (hdX,8) using ROOT=LABEL=, or by chainloading to either (hdX,7) or (hdX,8).

During 11.0 boxed DVD installation expert partitioning I selected sda1 to mount on /disks/hda/boot and various non-native partitions to mount as sdX via default. I then set up sdX7-13 as md0-6, specified md1 (sdX8) for 11.0 /, and formatted all md devices ext3, including assigning labels. I specified md1 / as the only Grub installation location, and mounting Linux partitions by-label.

Installation proceeded normally into successful kexec "reboot". No succeeding boot via chainloading from sda1 to / has succeeded. When I try, I get Grub error 13. Booting is only possible so far via the preinstalled 10.2 Grub and menu.lst on sda1.

The day after 11.0 install to md1 I did a HTTP 11.1 install to md2 (sdX9). The results are exactly the same as with 11.0. In both 11.0 & 11.1, /etc/grub.conf (2 lines, 66 bytes for 11.1) and /etc/grub.conf.old (3 lines, 126 bytes for 11.1) exist. The 3 line files refer to both (hd0,X) and (hd1,X), while the 2 line files actually used by the installer only refer to (hd0,X).

I used dd to dump the first sectors of sda8, sda9, sdb8, sdb9 & md2 to files. All 5 512 byte files contain nothing but nulls, which to me seems to mean the 11.0 & 11.1 installers never actually succeeded in installing Grub to their / partitions, even though /var/log/YaST2/y2log_bootloader claim they did.

I have md3 reserved to install 11.2 Factory as soon as a working .28 kernel installer is available.
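This report switches between Linux names such as sdX8 and Grub Legacy names such as (hdX,7). A minimal sketch of the conversion follows. It assumes the BIOS drive order matches the sdX letter order (in practice /boot/grub/device.map decides this), and it handles only names of the form sdX followed by a partition number, for the first ten disks. `linux_to_grub` is a hypothetical helper name.

```shell
# Hypothetical helper: linux_to_grub sda9  ->  (hd0,8)
# Grub Legacy counts both disks and partitions from zero, so the drive
# letter a..j maps to 0..9 and the partition number is reduced by one.
linux_to_grub() {
    name=${1#/dev/}                          # accept sda9 or /dev/sda9
    letter=$(printf '%s' "$name" | cut -c3)  # third character is the drive letter
    part=${name#sd?}                         # whatever follows "sdX" is the partition
    disk=$(printf '%s' "$letter" | tr 'a-j' '0-9')
    echo "(hd${disk},$(($part - 1)))"
}
```

Under these assumptions `linux_to_grub sdb8` yields the Grub Legacy name for the second half of md1 in the layout described above.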