Bug 603161

Summary: 11.3 / FACTORY ISOs don't boot in FACTORY's kvm
Product: [openSUSE] openSUSE 11.3 Reporter: Stefan Seyfried <seife>
Component: Basesystem Assignee: Bruce Rogers <brogers>
Status: VERIFIED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P2 - High CC: agraf, novellbmw, snwint
Version: Factory   
Target Milestone: ---   
Hardware: x86-64   
OS: Other   
Whiteboard: maint:released:11.3:36169 maint:released:sle11-sp1:36090 maint:released:11.2:36170
Found By: Community User Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Deadline: 2010-08-30   
Attachments: seabios patch that fixes this issue

Description Stefan Seyfried 2010-05-06 08:26:44 UTC
I'm trying to boot 11.3M6 in a KVM on a FACTORY installed machine.
(thinkpad x200s, core2 duo L9400, vmx capable, kvm-0.12.2-3.13.x86_64)

Unfortunately, I never get past the first second of graphical boot menu, then qemu-kvm starts spinning with 100% CPU load and nothing happens anymore.

I'm starting qemu like this:

qemu-kvm -no-quit -m 1024 -hda opensuse-11_3m6.img \
    -monitor stdio \
    -net nic,model=e1000 -net user,hostname=opensuse-11_3m6 \
    -soundhw ac97 -cdrom ../openSUSE-NET-Build0577-i586.iso

I tried -no-kvm and -nographic, which both did not help anything.
It seems to stop ~1 second into the boot menu: if I hit "space" early enough, I see the boot selection screen (but cannot select anything); if I don't, it stays at the "welcome" screen forever.

11.2 ISOs work fine, as do Fedora12 and debian 5.04

It also does work with kvm-78.0.10.6-0.1.1 on an openSUSE 11.1 host.

I'm not sure who's to blame: either the graphical boot code is buggy (but then, why does it work on older kvm?) or kvm is broken. It is obvious, though, that nobody ever tried using KVM on FACTORY ;)
Comment 1 Stefan Seyfried 2010-05-06 11:18:35 UTC
Using kvm-0.12.3 from BS project Virtualization does not change the behaviour.
Comment 2 Stefan Seyfried 2010-05-07 09:29:50 UTC
I have built kvm-0.11.0 (from openSUSE 11.2) for FACTORY, and when using that, I can boot the 11.3m6 iso without problems.
Comment 3 Steffen Winterfeldt 2010-05-10 13:27:47 UTC
Stefan, does http://www.suse.de/~snwint/bnc_599478/test_01.iso work for you?
Comment 4 Stefan Seyfried 2010-05-10 13:38:17 UTC
No, does not help.
Comment 5 Alexander Graf 2010-05-11 14:10:42 UTC
Does it work with current qemu-kvm.git? How about -no-kvm?
Comment 6 Stefan Seyfried 2010-05-11 15:01:07 UTC
-no-kvm does not help - see first comment ;)

Do you have a URL for qemu-kvm.git? I'll have to build it in OBS to test it...
Comment 7 Stefan Seyfried 2010-05-11 16:03:41 UTC
still happens with
QEMU 0.12.50 monitor - type 'help' for more information
(built with SUSE patches forward ported)

Next test without SUSE patches...
Comment 8 Stefan Seyfried 2010-05-11 16:15:31 UTC
qemu-kvm.git build without SUSE patches also fails the same way.

(to be honest: without patches %patch06 onwards, %patch01 - %patch05 still applied ;)

built version was kvm-88-4629-gabdc56b
Comment 9 Alexander Graf 2010-05-11 20:40:00 UTC
kvm-88 is ancient :). Please try git://git.savannah.nongnu.org/qemu.git and use -enable-kvm as parameter. To configure only i386 as target, use ./configure --target-list=i386-softmmu (or x86_64-softmmu for x86_64).
Comment 10 Stefan Seyfried 2010-05-12 07:49:18 UTC
does the bug not show on your system?
Comment 11 Bernhard Wiedemann 2010-05-14 13:18:57 UTC
I am also seeing this problem on MS6 and later isos.

qemu-0.12.3 fails
kvm-0.12.3 fails
vanilla qemu-git fails
qemu-0.11.0 works
qemu-0.12.3 (and every variant) with openSUSE-KDE-LiveCD-Build0556-i686.iso works

This is not SUSE-specific and not KVM-specific.

How to reproduce:
qemu-kvm -cdrom openSUSE-NET-Build0577-i586.iso
or
qemu-system-x86_64 -cdrom openSUSE-NET-Build0577-i586.iso

Other Information:
I tried qemu -s and gdb with "target remote localhost:1234", but as I do not have much experience with this setup, I did not get far. It looks as if it remains in a busy loop.
During the working animation part, registers were

eax            0x0      0
ecx            0xf000f  983055
edx            0x282b   10283
ebx            0x190101 1638657
esp            0x1d2b8  0x1d2b8
ebp            0x3      0x3
esi            0x191eb1 1646257
edi            0x191eb2 1646258
eip            0x77f5   0x77f5
eflags         0x46     [ PF ZF ]
cs             0x18     24
ss             0x8      8
ds             0x20     32
es             0x8      8
fs             0x8      8
gs             0x8      8

but after it hung, it was

eax            0x11     17
ecx            0xf000f  983055
edx            0x2837   10295
ebx            0x190101 1638657
esp            0x1fc4   0x1fc4
ebp            0x3      0x3
esi            0x191f0d 1646349
edi            0x191f0e 1646350
eip            0x7838   0x7838
eflags         0x6      [ PF ]
cs             0x1114   4372
ss             0x1b2e   6958
ds             0x1114   4372
es             0x296    662
fs             0xffff   65535
gs             0xffff   65535
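
A possible reading of the two dumps, offered as a sketch rather than a diagnosis: the segment registers in the first dump (0x18, 0x8, 0x20) look like protected-mode selectors, while those in the second (0x1114, 0x1b2e, 0x296) look like real-mode segment values. Assuming real-mode addressing (linear = segment * 16 + offset) for the hung state and a flat, zero-based stack segment for the running state (both are assumptions, not something gdb reported), the Python snippet below lists which registers stayed the same and computes the linear addresses. The helper names are made up for illustration.

```python
# Register values copied from the two gdb dumps above.
REGS_RUNNING = {
    "eax": 0x0, "ecx": 0xf000f, "edx": 0x282b, "ebx": 0x190101,
    "esp": 0x1d2b8, "ebp": 0x3, "esi": 0x191eb1, "edi": 0x191eb2,
    "eip": 0x77f5, "eflags": 0x46,
    "cs": 0x18, "ss": 0x8, "ds": 0x20, "es": 0x8, "fs": 0x8, "gs": 0x8,
}
REGS_HUNG = {
    "eax": 0x11, "ecx": 0xf000f, "edx": 0x2837, "ebx": 0x190101,
    "esp": 0x1fc4, "ebp": 0x3, "esi": 0x191f0d, "edi": 0x191f0e,
    "eip": 0x7838, "eflags": 0x6,
    "cs": 0x1114, "ss": 0x1b2e, "ds": 0x1114, "es": 0x296,
    "fs": 0xffff, "gs": 0xffff,
}


def changed_registers(before, after):
    """Return {name: (old, new)} for every register whose value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if before[name] != after[name]
    }


def real_mode_linear(segment, offset):
    """Linear address under the ASSUMPTION of real-mode addressing."""
    return (segment << 4) + offset


changed = changed_registers(REGS_RUNNING, REGS_HUNG)
unchanged = sorted(set(REGS_RUNNING) - set(changed))

# Assumption: the hung guest is in real mode, so cs:eip and ss:esp are
# segment:offset pairs.
hung_code = real_mode_linear(REGS_HUNG["cs"], REGS_HUNG["eip"])
hung_stack = real_mode_linear(REGS_HUNG["ss"], REGS_HUNG["esp"])

print("unchanged registers:", unchanged)
print("hung cs:eip linear address:", hex(hung_code))
print("hung ss:esp linear address:", hex(hung_stack))
# Assumption: the running dump uses a flat stack segment with base 0,
# so its esp can be compared directly with the linear address above.
print("stack moved down by:", REGS_RUNNING["esp"] - hung_stack, "bytes")
```

Under those assumptions the hung CPU is executing at linear address 0x18978, and its stack pointer sits at 0x1d2a4, 20 bytes below the 0x1d2b8 seen while the animation was still running. That would fit a switch to real mode from the boot code that never returns, but the dumps alone do not prove it.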
Comment 12 Bruce Rogers 2010-05-14 15:13:41 UTC
Sorry I haven't been on board with this one - I can reproduce it and will start debugging. Thanks for all your efforts to date.
Comment 13 Bruce Rogers 2010-05-17 16:27:32 UTC
What I'm seeing is that the breakage started in M6 for the *-NET-*.iso images.

I'm also suspecting there may be some incompatibility with qemu/qemu-kvm and the current gfxboot.c32 module of isolinux - that is the main thing that changed from M5 to M6, as far as what would be affecting qemu/qemu-kvm booting this iso. (It previously was gfxboot.com, so there was a switch from 16 to 32 bit, and possibly other changes.)

Steffen W., would you care to comment and look into this a bit from the perspective of the isolinux components?
Comment 14 Steffen Winterfeldt 2010-05-17 16:59:37 UTC
It is true that now a com32 module is used. But that shouldn't change much
except for the memory layout. I've no idea what could cause the difference.
Comment 15 Bruce Rogers 2010-05-17 17:08:47 UTC
(In reply to comment #13)
> What I'm seeing is that the breakage started in M6 for the *-NET-*.iso images.
> 
> I'm also suspecting there may be some incompatibility with qemu/qemu-kvm and
> the current gfxboot.c32 module of isolinux - that is the main thing that
> changed from M5 to M6, as far as what would be affecting qemu/qemu-kvm booting
> this iso. (It previously was gfxboot.com, so switch from 16 to 32 bit, and
> possibly other changes.
> 
> Steffen W., would you care to comment and look into this a bit from the
> perspective of the isolinux components?

Correction - breakage started in milestone 5, not milestone 6.
Comment 16 Bernhard Wiedemann 2010-05-17 17:45:16 UTC
I just remastered openSUSE-KDE-LiveCD-i686-Build0584-Media.iso and found that adding only
boot/i386/loader/gfxboot.com from Build0500
made it work again with qemu-0.12.3. The bug is in qemu, as older qemu versions (and physical PCs) still work with gfxboot.c32.
It would probably not be hard to bisect this regression as the bug is so easy and fast to trigger.
Comment 17 Bernhard Wiedemann 2010-05-17 19:09:38 UTC
bisected it:
fd646122418ecefcde228d43821d07da79dd99bb is first bad commit
commit fd646122418ecefcde228d43821d07da79dd99bb
Author: Anthony Liguori <aliguori@us.ibm.com>
Date:   Fri Oct 30 09:06:09 2009 -0500

    Switch pc bios from pc-bios to seabios
    
    SeaBIOS is a port of pc-bios to GCC.  Besides using a more modern tool chain,
    SeaBIOS introduces a number of new features including PMM support, better
    BEV and BCV support, and better PnP support.
    
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

:100644 100644 71494ea488dcb4838a6f2629e10fb923fe0f8929 dd4745e78f7a5f1712d8b58b38e1481b949ab6d3 M      .gitmodules
:040000 040000 29a512d943f026f7b1b829f4ebbd6e4944b6df46 6850a7825af29cb77885b81fef44814460be2ad7 M      pc-bios
:040000 040000 fec11bf3423cb0a669325347d8be9687fec43ba5 038fa59b6df0a99a86b02e050006d219be1d8350 M      roms
Comment 18 Bruce Rogers 2010-05-17 19:13:06 UTC
You beat me to it! ;-)
Well, I know there were some known issues with this switch, but I think that they were all assumed to have fairly low exposure.  I'll try to debug to see exactly where we're hanging up.
Comment 19 Bruce Rogers 2010-05-17 19:25:04 UTC
FYI: I bisected to the same commit.
Comment 20 Bruce Rogers 2010-05-17 22:48:44 UTC
I hadn't checked that this was in fact still broken in the latest incarnation of upstream qemu. It is in fact fixed - by the latest seabios rom.

I've attached the seabios patch which implemented the fix.

I would hope we could update to the latest qemu, kvm, and virt-utils in the next day or two, including seabios. In that event we'd get a fix along with a lot of other fixes.

If not, this fix could be applied separately.
Comment 21 Bruce Rogers 2010-05-17 22:50:06 UTC
Created attachment 362793 [details]
seabios patch that fixes this issue
Comment 22 Alexander Graf 2010-05-17 23:00:40 UTC
This is in 0.12.4?
Comment 23 Bruce Rogers 2010-05-18 00:12:14 UTC
This fix came in after 0.12.4.
Comment 24 Alexander Graf 2010-05-18 09:14:23 UTC
Then I don't understand comment 20. I definitely veto going anywhere non-0.12. Also, keep in mind that 0.12.4 has a known bug with migration - IIUC only for 0.12.2 -> 0.12.4, but you never know. So I'd wait until 0.12.4.1 is out in this case.

That means we should only cherry-pick known good commits for the time being.
Comment 27 Bernhard Wiedemann 2010-06-10 21:07:07 UTC
We do not want to leave this open in 11.3

meanwhile, one simple workaround is to replace bios.bin with the version from 0.12.4 or HEAD
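
The workaround above could be scripted roughly as follows. This is a sketch only: the function name and both example paths are hypothetical, since the location of the installed bios.bin depends on how the package lays out its files. It keeps a backup so the original BIOS image can be restored later.

```python
import shutil
from pathlib import Path


def replace_bios(installed: Path, replacement: Path) -> Path:
    """Overwrite `installed` with `replacement`, keeping a one-time backup.

    Returns the path of the backup copy. An existing backup is never
    overwritten, so running this twice still preserves the original file.
    """
    backup = installed.with_name(installed.name + ".orig")
    if not backup.exists():
        shutil.copy2(installed, backup)
    shutil.copy2(replacement, installed)
    return backup


# Example call (both paths are placeholders, adjust to the local setup):
# replace_bios(Path("/path/to/installed/bios.bin"),
#              Path("/path/to/newer/bios.bin"))
```

Restoring the original is then a matter of copying the ".orig" file back over bios.bin.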
Comment 28 Bruce Rogers 2010-06-10 21:10:14 UTC
I should be able to get this updated shortly.
Comment 29 Bruce Rogers 2010-06-15 20:05:09 UTC
Fixed kvm checked into devel project, next stop: Factory
Comment 30 Bruce Rogers 2010-06-15 20:16:19 UTC
submitted to Factory (sr 41515)
Comment 32 Bruce Rogers 2010-07-30 15:45:49 UTC
Fix is also applied to SLES 11 SP1 kvm, for inclusion in first maintenance update.
Comment 33 Swamp Workflow Management 2010-08-02 14:32:04 UTC
The SWAMPID for this issue is 34941.
This issue was rated as low.
Please submit fixed packages until 2010-08-30.
Also create a patchinfo file using this link:
https://swamp.suse.de/webswamp/wf/34941
Comment 34 Swamp Workflow Management 2010-10-13 11:31:20 UTC
Update released for: kvm, kvm-debuginfo, kvm-debugsource
Products:
openSUSE 11.3 (debug, i586, x86_64)
Comment 35 Swamp Workflow Management 2010-10-13 14:15:52 UTC
Update released for: kvm, kvm-debuginfo, kvm-debugsource, kvm-kmp-trace, kvm-kmp-vmi
Products:
SLE-DEBUGINFO 11-SP1 (i386, x86_64)
SLE-DESKTOP 11-SP1 (i386, x86_64)
SLE-SERVER 11-SP1 (i386, x86_64)
Comment 36 Swamp Workflow Management 2010-10-28 13:39:21 UTC
Update released for: kvm, kvm-debuginfo, kvm-debugsource
Products:
openSUSE 11.2 (i586, x86_64)
Comment 37 Bruce Rogers 2010-11-01 16:54:41 UTC
Closing as Fixed.
Comment 38 Bernhard Wiedemann 2016-04-15 11:42:20 UTC
This is an autogenerated message for OBS integration:
This bug (603161) was mentioned in
https://build.opensuse.org/request/show/41515 Factory / kvm