|
Bugzilla – Full Text Bug Listing |
| Summary: | system unbootable (attempt to access beyond end of device) | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 10.3 | Reporter: | Miquel A. Noguera <ibz> |
| Component: | Installation | Assignee: | Tejun Heo <teheo> |
| Status: | RESOLVED FIXED | QA Contact: | Jiri Srain <jsrain> |
| Severity: | Blocker | ||
| Priority: | P5 - None | CC: | asklein, coolo, forgotten_eTk6BCeiKJ, forgotten_xI2C5NvggO, michel.munnix, mike, vetter |
| Version: | RC 1 | Flags: | coolo: SHIP_STOPPER+ |
| Target Milestone: | --- | ||
| Hardware: | i686 | ||
| OS: | openSUSE 10.3 | ||
| Whiteboard: | |||
| Found By: | Beta-Customer | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
hwinfo --all
YaST log files
fdisk -l
tar -cvzf kmsg.tar.gz kmsg.log
libata-fix-set_max_sectors
tar -cvzf boot_msg.tar.gz boot.msg |
||
|
Description
Miquel A. Noguera
2007-09-16 09:34:06 UTC
Created attachment 172693 [details]
hwinfo --all
Created attachment 172694 [details]
YaST log files
What does fdisk -l /dev/sda say? Is this the first beta you have tried, or did you have problems before?

Don't mess around with priorities, please!

Created attachment 172695 [details]
fdisk -l
Disk /dev/sda: 80.0 GB, 80025280000 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x6f99b258
fdisk -l
Device Boot Start End Blocks Id System
/dev/sda1 * 1 3264 26218048+ 7 HPFS/NTFS
/dev/sda2 3265 5875 20972857+ 83 Linux
/dev/sda3 5876 8486 20972857+ 83 Linux
/dev/sda4 8487 9729 9984397+ f W95 Ext'd (LBA)
/dev/sda5 8487 9729 9984366 83 Linux
Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00017d2c
Device Boot Start End Blocks Id System
/dev/sdb1 1 523 4200966 82 Linux swap / Solaris
/dev/sdb2 524 9729 73947195 83 Linux
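
The fdisk figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original report: it only re-uses the byte counts, cylinder size, and end cylinders printed by fdisk, and the names `DISKS` and `check_disk` are made up for illustration.

```python
# Values copied from the fdisk -l output above.
SECTOR_BYTES = 512
CYLINDER_BYTES = 16065 * SECTOR_BYTES  # "Units = cylinders of 16065 * 512"

# Hypothetical container for the figures fdisk printed for each disk.
DISKS = {
    "sda": {"bytes": 80025280000, "last_partition_end": 9729},
    "sdb": {"bytes": 80026361856, "last_partition_end": 9729},
}


def check_disk(info):
    """Return the sector count and whether the last partition ends inside the disk."""
    sectors = info["bytes"] // SECTOR_BYTES
    end_of_last_partition = info["last_partition_end"] * CYLINDER_BYTES
    return {"sectors": sectors, "fits": end_of_last_partition <= info["bytes"]}


results = {name: check_disk(info) for name, info in DISKS.items()}
for name, result in results.items():
    print(name, result["sectors"], "sectors, layout fits:", result["fits"])
```

With these inputs both layouts fit inside the sizes fdisk reported, so the partition tables themselves look sane when the disks are detected at this size.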
I have tested all releases from alpha5 on 4 different PCs with no boot problems. Beta3-DVD is the first beta I installed on this problematic box. Installation on my laptop (from the same media) has been successful.

Since the problematic box has a sis5513 chipset, maybe this bug is related to 308384. BTW, I'm running another Beta3 installation on a different box with the same mobo and it boots fine (I don't remember which media I used to install).
https://bugzilla.novell.com/show_bug.cgi?id=308384

OK, it doesn't sound too severe then, even though we may still have a driver issue.

Can you please post kernel dmesg of the failing boot? You can either use netconsole or serial console. Thanks.

Not familiar with netconsole/serial console but with keyboard ;-) This is a manually copied version:
preping 03-storage.sh
running 03-storage.sh
preping 04-udev.sh
preping 04-udev.sh
Creating device nodes with udev
preping 05-blogd.sh
running 05-blogd.sh
preping 11-block.sh
running 11-block.sh
preping 11-usb.sh
running 11-usb.sh
preping 21-devinit_done.sh
running 21-devinit_done.sh
preping 81-kdump.sh
running 81-kdump.sh
preping 82-resume-userspace.sh
running 82-resume-userspace.sh
Trying manual resume from /dev/sdb1
Invoking userspace resume from /dev/sdb1
resume: could not stat configuration file
resume: libcrypt version: 1.2.4
preping 83-resume.kernel.sh
running 83-resume.kernel.sh
Trying manual resume from /dev/sdb1
preping 84-mount.sh
running 84-mount.sh
Waiting for device /dev/sda5 to appear..........Could not find /dev/sda5
Want me to fall back /dev/sda5 ?
Waiting for device /dev/sda5 to appear..........not found -- exiting to /bin/sh
sh: no job control in this shell

Is the log from booting the installation media or from after installation? If that happens while booting from installation media, please switch to the command console (ctrl-alt-f9), plug in a USB memory stick, mount it to /mnt and run the following commands.
# cp /var/log/boot.msg /mnt
# hwinfo --all > /mnt/hwinfo.log
And post the resulting files here. Also, digital cameras are good enough and much less painful when you can't record the kernel log remotely.

I had a look at the y2log files and could not see any problem there. Parted correctly detected partition sizes. Formatting and mounting of sda5 and sda2 succeeded without problems. /etc/fstab also looks fine:
/dev/disk/by-id/scsi-SATA_WDC_WD800JB-00CWD-WCA8E6665404-part5 / ext3 acl,user_xattr 1 1
/dev/disk/by-id/scsi-SATA_WDC_WD800JB-00CWD-WCA8E6665404-part2 /home ext3 acl,user_xattr 1 2
/dev/disk/by-id/scsi-SATA_WDC_WD800JB-00CWD-WCA8E6665404-part1 /windows/C ntfs-3g users,gid=users,fmask=133,dmask=022,locale=es_ES.UTF-8 0 0
/dev/disk/by-id/scsi-SATA_WDC_WD800BB-00FWD-WMAJD2057107-part1 swap swap defaults 0 0
proc /proc proc defaults 0 0
sysfs /sys sysfs noauto 0 0
debugfs /sys/kernel/debug debugfs noauto 0 0
usbfs /proc/bus/usb usbfs noauto 0 0
devpts /dev/pts devpts mode=0620,gid=5 0 0
/dev/fd0 /media/floppy auto noauto,user,sync 0 0
So this should be either an initrd or kernel issue.

I have downgraded to 2.6.22.3-7-bigsmp (from the Beta2 DVD) and the system works again. With the installation media, the system boots fine too.

On beta3, HPA is unlocked by default. That could be causing problems. Miquel, can you please post the boot log from the installation media? Also, if you enter the partitioning menu during installation, can you see all the partitions okay?

From what I can see in the y2log files, partitions are detected fine. The last partition on each disk ends on the last disk cylinder, as one would expect in a standard setup.

Thanks, Thomas. It's not really a driver problem either then. The installation media and installed system use the same kernel. Only the initrd is different. If the kernel can detect the device fine when booted from installation media, it should do fine from the installed system too. I'd really like to see the failing boot log.
Miquel, can you please remove "splash=silent" from the kernel boot parameters and take a picture of the screen during the failing boot?

As I already said, it could also be an initrd issue. This should be decidable by looking into /proc/partitions from the emergency shell after the failed boot. If the partitions are present in /proc/partitions, it could be a udev issue or a broken initrd. It could also be that different drivers are loaded in the initrd than were loaded during installation. y2logmkinitrd contains the following:
Kernel image: /boot/vmlinuz-2.6.22.5-16-bigsmp
Initrd image: /boot/initrd-2.6.22.5-16-bigsmp
Root device: /dev/disk/by-id/scsi-SATA_WDC_WD800JB-00CWD-WCA8E6665404-part5 (/dev/sda5) (mounted on / as ext3)
Kernel Modules: processor thermal scsi_mod libata pata_sis fan jbd mbcache ext3 edd sd_mod usbcore ohci-hcd uhci-hcd ehci-hcd ff-memless hid usbhid
Features: block usb resume.userspace resume.kernel
Bootsplash: SuSE (1280x1024)
17880 blocks
ERROR: Bootloader::Library::SetLoaderType: Initializing for unknown bootloader
ERROR: Bootloader::Core::ListFiles: Running generic function, it should never be called
ERROR: Bootloader::Core::ParseLines: Running generic function, it should never be called
No idea if the lines starting with ERROR (that are normally not there) have anything to do with this problem.

Thanks again, Thomas. Miquel, from the emergency shell, please run
# cat /proc/partitions
# ls /sys/bus/pci/drivers/
# dmesg
and report the result. Thanks.

with 2.6.22.3-7 (from media Beta2-dvd, manually installed)
* Everything ok
with 2.6.22.5-10 (from media Beta3-dvd, manually installed)
* Everything ok (but a problem accessing the DVD drive after the media check screen)
with 2.6.22.5-21 (Factory update)
* Kernel panic
with 2.6.22.5-16 (picked automatically from online repositories during installation)
* cat /proc/partitions
major minor #blocks name
8 0 2653272 sda
8 1 26218048 sda1
8 2 20972857 sda2
8 3 20972857 sda3
8 4 1 sda4
8 16 78150744 sdb
8 17 4200966 sdb1
8 18 73947195 sdb2
* ls /sys/bus/pci/drivers
ehci_hcd imsttfb ohci_hcd pata_sis pcieport-driver serial
* dmesg
sh: dmesg: command not found
* cat /var/log/boot.msg -> as described in comment #12
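
The /proc/partitions listing reported for 2.6.22.5-16 can be checked mechanically. The sketch below is an illustration only: it takes the listing exactly as pasted above and flags any disk whose partitions add up to more blocks than the disk itself reports. The function name `find_oversized` and the `sdX` naming pattern are assumptions made for this example.

```python
import re
from collections import defaultdict

# Listing copied from the report above (kernel 2.6.22.5-16).
PROC_PARTITIONS = """\
major minor #blocks name
8 0 2653272 sda
8 1 26218048 sda1
8 2 20972857 sda2
8 3 20972857 sda3
8 4 1 sda4
8 16 78150744 sdb
8 17 4200966 sdb1
8 18 73947195 sdb2
"""


def find_oversized(text):
    """Map disk name -> (disk blocks, summed partition blocks) where the sum exceeds the disk."""
    disk_blocks = {}
    partition_blocks = defaultdict(int)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not "major minor blocks name".
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        blocks, name = int(fields[2]), fields[3]
        match = re.fullmatch(r"(sd[a-z]+)(\d+)", name)  # assumes sdX-style names only
        if match:
            partition_blocks[match.group(1)] += blocks
        else:
            disk_blocks[name] = blocks
    return {
        disk: (disk_blocks[disk], total)
        for disk, total in partition_blocks.items()
        if disk in disk_blocks and total > disk_blocks[disk]
    }


oversized = find_oversized(PROC_PARTITIONS)
for disk, (disk_size, partitions_size) in oversized.items():
    print(f"{disk}: disk reports {disk_size} blocks, partitions sum to {partitions_size}")
```

Only sda is flagged: the kernel reports it as far smaller than its own first partition, and sda5 does not appear at all, which matches the "Could not find /dev/sda5" message from the failing boot.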
I see, the kernel is updated during installation. Yeah, this looks like a driver problem. In the emergency shell, please run...
# (while read line; do echo $line; sleep .1; done) < /proc/kmsg &
This will give you slowly scrolling kernel boot messages. You can also create a directory (/mnt), mount a partition there (probably /dev/sdb2) and redirect the output to a file, but due to the lack of job control and because /proc/kmsg is emptied once read, it can be a bit tricky. While scrolling, you can pause the messages by pressing the "Pause/Break" key and resume by pressing it again. If copying doesn't work, just pick up a digital camera, take shots of the logs and post them here. Thanks.

Created attachment 173402 [details]
tar -cvzf kmsg.tar.gz kmsg.log
(while read line; do echo $line >> kmsg.log ; done) < /proc/kmsg &
I have two boxes with the same motherboard, but with different BIOS versions and different hard disks. On one of them, no kernel > 2.6.22.5-10 works. On the other one, however, 2.6.22.5-25 works again. On the problematic machine kernel 2.6.22.5-10 works well, but zypper wants to remove it on each update :-( I have worked around the problem by recompiling this kernel on my machine and installing it with make install.

*** Bug 326887 has been marked as a duplicate of this bug. ***

This bug is also present on RC1 and we can't release SL103 with this bug. Bumping up to BLOCKER.

Do you have a fix?

Not yet. Still looking into the problem.

Okay, got it. Will soon post a patched kernel for testing.

Please test the following kernel and report the result.
http://htj.dyndns.org/kernel-default-2.6.22.5-HPA_debug.i586.rpm
Thanks.

Created attachment 173823 [details]
libata-fix-set_max_sectors
This is the patch to fix the problem (included in the debug kernel).
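
The patch is attached rather than quoted, so its content is not visible in this thread. The sketch below is only an arithmetic observation on the numbers reported earlier, under one loudly stated assumption: that sda's native (HPA-unlocked) size equals the size the kernel reports for sdb, the other 80 GB WD disk in this machine. Whether this is what the patch actually addresses is not confirmed here.

```python
SECTORS_PER_BLOCK = 2  # /proc/partitions counts 1 KiB blocks; sectors are 512 bytes

# Sizes from /proc/partitions under 2.6.22.5-16, as reported above.
sda_reported_blocks = 2653272
sdb_reported_blocks = 78150744

sda_reported_sectors = sda_reported_blocks * SECTORS_PER_BLOCK

# ASSUMPTION (not stated in the report): sda's native size matches sdb's size.
assumed_native_sectors = sdb_reported_blocks * SECTORS_PER_BLOCK

# Keep only the low 24 bits of the assumed native sector count.
low_24_bits = assumed_native_sectors & 0xFFFFFF

print("assumed native sectors:", hex(assumed_native_sectors))
print("sda as reported:       ", hex(sda_reported_sectors))
print("low 24 bits of native: ", hex(low_24_bits))
print("match:", low_24_bits == sda_reported_sectors)
```

Under that assumption the bogus sda size is exactly the native sector count with everything above bit 23 dropped, which would be consistent with a sector count being truncated while the HPA is unlocked. Treat this as a hypothesis that fits the numbers, not as a description of the fix.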
kernel-default-2.6.22.5-HPA_debug.i586.rpm boots fine in my problematic box :-) Great job. Thanks.

Thanks for testing. I'll forward the patch to mainline and commit it to kernel CVS after getting ACK from internal patch review. Final SL103 will install and run fine on your machine. I will close the bug till the patch is committed. Thanks.

Miquel, can you please post /var/log/boot.msg from the debug kernel?

Created attachment 173833 [details]
tar -cvzf boot_msg.tar.gz boot.msg
Patch committed. Resolving as FIXED. Thanks.

*** Bug 327518 has been marked as a duplicate of this bug. ***
*** Bug 327612 has been marked as a duplicate of this bug. ***
*** Bug 327513 has been marked as a duplicate of this bug. ***
*** Bug 327846 has been marked as a duplicate of this bug. ***
*** Bug 329878 has been marked as a duplicate of this bug. ***

The link to http://htj.dyndns.org/kernel-default-2.6.22.5-HPA_debug.i586.rpm is producing an "object not found" message. I'm experiencing the same problem with openSUSE 10.3 on a Compaq Armada m700 (coincidentally, an old Novell laptop that was refurbished/resold). I'm not sure about regulations on putting kernel fixes on the Novell server, but it should probably be available for a while. I tried searching for the file on Google, but apparently there were no mirrors. Other than this one problem, openSUSE would have been the absolute easiest OS install I've ever done.

If you haven't installed yet, the ISOs in the following directory will help.
http://htj.dyndns.org/export/kiso/
If you have already installed, the fix will be delivered to you when you update the kernel. |