Bug 390837 - network installation with vncviewer command crashes installation process
Summary: network installation with vncviewer command crashes installation process
Status: RESOLVED DUPLICATE of bug 389386
Duplicates: 404688
Alias: None
Product: openSUSE 11.0
Classification: openSUSE
Component: Installation
Version: Beta 3
Hardware: i586 openSUSE 11.0
Priority: P5 - None, Severity: Major
Target Milestone: ---
Assignee: Stefan Dirsch
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-15 15:10 UTC by Yi Xu
Modified: 2008-07-02 06:25 UTC

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
the script for setting up grub (22.59 KB, text/plain)
2008-05-15 15:12 UTC, Yi Xu
PNG from console 4 (516.34 KB, image/jpeg)
2008-05-16 10:00 UTC, Yi Xu
crash log (10.38 KB, text/plain)
2008-05-16 13:34 UTC, Yi Xu

Description Yi Xu 2008-05-15 15:10:10 UTC
I am doing a remote installation, using a script that writes a section into grub with the SLP installation source. I boot the host into the installation source and start "vncviewer $hostname:1" on my workstation; a VNC window pops up for less than half a second and then disappears.
On the installation host, the monitor shows a red line: "an error occurred during installation".
I tried again with different resolutions: 1024x768, 800x600, Normal. That didn't help; the installation process still crashes.
Nevertheless, connecting from a browser to http://$hostip:5801 works fine, and I could finish the installation.
The script will be attached. It is reliable and has been in use for years.
Comment 1 Yi Xu 2008-05-15 15:12:45 UTC
Created attachment 215636 [details]
the script for setting up grub
Comment 2 Andreas Jaeger 2008-05-15 15:14:56 UTC
This is not a blocker.
Comment 3 Steffen Winterfeldt 2008-05-15 15:49:52 UTC
a) does it work without vnc?
b) what's the error message (on console 3)?
Comment 4 Yi Xu 2008-05-15 16:24:06 UTC
a) yes
b) didn't notice. Will try again.
Comment 5 Yi Xu 2008-05-15 16:27:17 UTC
(In reply to comment #4 from Yi Xu)
> a) yes

No. not always.
Just saw a test host which is stalled at "sending DHCP request to eth0", and then red screen: "Could not find openSUSE repository, activate manual set up".

> b) didn't notice. Will try again.
> 

Comment 6 Martin Mrazik 2008-05-15 16:46:15 UTC
Yi: I don't understand why this was re-assigned to bnc-screening team.
Steffen: reassigning back to you
Comment 7 Yi Xu 2008-05-16 08:37:42 UTC
(In reply to comment #6 from Martin Mrazik)
> Yi: I don't understand why this was re-assigned to bnc-screening team.
> Steffen: reassigning back to you
> 

Oh, I clicked "reassign to default assignee". It seems that doesn't work as expected when someone has already taken the bug.
Comment 8 Steffen Winterfeldt 2008-05-16 09:06:32 UTC
Yi, I'll need the log messages from console 3. Or boot with
linuxrc.debug=1 linuxrc.log=/foo and attach the log.

Right now I'd say it's a setup error on your side.
Comment 9 Yi Xu 2008-05-16 09:51:36 UTC
linuxrc.debug=1 linuxrc.log=/tmp/linuxrc.log didn't write any log.
Comment 10 Yi Xu 2008-05-16 10:00:18 UTC
Created attachment 215897 [details]
PNG from console 4
Comment 11 Steffen Winterfeldt 2008-05-16 12:00:53 UTC
To comment 9: That can't be. Either you made a spelling error or didn't pass
the option.
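
One way to rule out a spelling error is to compare the options against the kernel command line that the booted installer actually received. The following is only a minimal sketch: the helper names `has_boot_option` and `check_cmdline` are made up for illustration, and it assumes a shell is available on the installation host and that `/proc/cmdline` is readable there.

```shell
# Return success if the given boot option appears as a whole word
# in the given kernel command line string.
has_boot_option() {
    cmdline=$1
    option=$2
    case " $cmdline " in
        *" $option "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print one line per expected option, saying whether it was passed.
check_cmdline() {
    cmdline=$1
    shift
    for option in "$@"; do
        if has_boot_option "$cmdline" "$option"; then
            echo "present: $option"
        else
            echo "MISSING: $option"
        fi
    done
}

# Example on the installation host (option values taken from comment 9):
#   check_cmdline "$(cat /proc/cmdline)" linuxrc.debug=1 linuxrc.log=/tmp/linuxrc.log
```

A "MISSING" line would point to a typo or to the boot loader entry not passing the option at all.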
Comment 12 Yi Xu 2008-05-16 12:20:32 UTC
You can ssh neme2.suse.de (or colfax.suse.de) to check boot menu.
Comment 13 Yi Xu 2008-05-16 12:42:09 UTC
I have installed the OS on this host numerous times with the same script and the same method (vncviewer), and only this time did it crash.
Comment 14 Steffen Winterfeldt 2008-05-16 13:01:09 UTC
Sorry, Yi, I will not look at your boot config. What matters are
the things going on during installation. Make at least a screenshot of
console 3.
Comment 15 Yi Xu 2008-05-16 13:15:55 UTC
There was no message on console 3, just some information on console 4.
Comment 16 Yi Xu 2008-05-16 13:33:06 UTC
However, after I fixed the network configuration on colfax, starting vncviewer crashed the installation, but this time the log file was kept. It will be attached.
Comment 17 Yi Xu 2008-05-16 13:34:18 UTC
Created attachment 215973 [details]
crash log
Comment 18 Steffen Winterfeldt 2008-05-16 14:05:56 UTC
*** glibc detected *** /usr/bin/Xvnc: free(): invalid next size (normal): 0x0000000000b0bfb0 ***

(backtrace is in log)
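
For anyone checking their own logs for the same abort, a small sketch that lists the matching lines with their line numbers. The function name `find_glibc_aborts` and the file name in the usage comment are hypothetical; only the marker text is taken from the message above.

```shell
# Print every line of the given log file that carries the glibc
# abort marker, prefixed with its line number.
find_glibc_aborts() {
    # -F treats the pattern as a fixed string, so the asterisks are literal.
    grep -n -F -- '*** glibc detected ***' "$1"
}

# Example (crash.log is a placeholder for the saved installation log):
#   find_glibc_aborts crash.log
```

The backtrace that glibc prints normally follows directly after the reported line, so the line number is a starting point for reading it.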
Comment 19 Stefan Dirsch 2008-05-16 14:31:08 UTC
This is a duplicate of a bugreport I needed to close as WORKSFORME since I couldn't reproduce it. This behaviour won't change when trying to reproduce again.
Comment 20 Steffen Winterfeldt 2008-06-30 16:40:16 UTC
*** Bug 404688 has been marked as a duplicate of this bug. ***
Comment 21 Andrew Joakimsen 2008-06-30 16:46:18 UTC
I reported bug 404688. In my case using the java VNC viewer located at ipaddress:5801 *DOES NOT* work.

Stefan: What version of openSUSE did you attempt to reproduce this bug on, and using what hardware? What are the exact parameters you passed to the bootloader?
Comment 22 Andreas Jaeger 2008-07-01 07:43:55 UTC
Undoing last change.
Comment 23 Stefan Dirsch 2008-07-01 09:11:21 UTC
Andreas mixed up Steffen with Stefan. Reassigning to Steffen.
Comment 24 Steffen Winterfeldt 2008-07-01 09:36:20 UTC
Sorry Stefan, but AFAIK it is Xvnc that crashes (see comment 18). No idea
what I could do about it.
Comment 25 Stefan Dirsch 2008-07-01 09:40:48 UTC
See my comment #19.
Comment 26 Andrew Joakimsen 2008-07-01 10:04:49 UTC
Has been verified by myself and:

Yi Xu <yxu@novell.com>
Brian K. White <brian@aljex.com>

You have not provided any details as to how you determined that this "works for you" (what hardware, disc, bootloader options, etc) nor any further details of the mysterious bug you allegedly closed as "WORKSFORME."
Comment 27 Andrew Joakimsen 2008-07-01 10:13:41 UTC
Also what OS and VNC viewer are you connecting from?
Comment 28 Stefan Dirsch 2008-07-01 10:39:41 UTC
I tried any combination of

Server: 11.0-i386/11.0-x86_64
Client: 11.0-i386/11.0-x86_64/10.3-i386/10.3-x86_64 (tightvnc)

No, I don't have available for testing here.

BTW, the bug has been filed against Beta3. Not sure if it has been tested with the final version as well.
Comment 29 Andrew Joakimsen 2008-07-01 14:12:45 UTC
I reported bug # 404688 against 11.0 final but it was rejected. The issue persists in the final release.
Comment 30 Stefan Dirsch 2008-07-01 14:19:16 UTC
Not sure what's the use of reopening the bugreport when it's clear that I can't reproduce it. Since I can't reproduce it I can't fix it either. ==> WONTFIX
Comment 31 white brian 2008-07-01 18:23:21 UTC
I don't even use the vnc install option myself so I'm not highly motivated to fix it or see that it gets fixed either, but I can confirm it's definitely broken in the final release 11.0 when trying to install onto 2 different Dell PowerEdge 1550. These are dual p3 with 1G registered ecc ram.

I have not tried the 11.0 vnc installer on other machines.
I have tried the vnc installer from opensuse 10.3 on the same machines and it worked fine.

I have installed 10.3 and 11.0 using everything but vnc (serial console, ssh, and regular direct gui console) and those install methods all worked fine and the resulting installed os shows no problems.

If I can find any way to spare any of my copious free time, I'll try to find a machine that doesn't produce the same results.

There is one suspect thing. The machines display no problems running linux/freebsd/dos/freedos, but they do cause memtest86+ 2.01 to display an error that a tiny bit of ram at the top end of the physical address space is bad. It doesn't matter how much ram is installed or what physical chips are installed or in what order, the error is always the last few K at the top end.
And, both machines do the same thing, and I've read other people describing the same thing for years on these particular machines and other similar ones from Dell. So, that appears to be some kind of bios shadowing.

My install kernel, initrd, and oss repository all came from this script, run and re-run several times well after the final release of 11.0.

[code]
SYNC="rsync -a --del --delete-excluded --exclude ppc --exclude ppc64 --exclude src --exclude SRPMS $@"

# 11.0
$SYNC rsync://mirrors.kernel.org/mirrors/opensuse/distribution/11.0/repo/oss /opt/SUSE/11.0
$SYNC rsync://mirrors.kernel.org/mirrors/opensuse/distribution/11.0/repo/non-oss /opt/SUSE/11.0
$SYNC rsync://rsync.opensuse.org/opensuse-updates/11.0/* /opt/SUSE/11.0/update
$SYNC rsync://ftp5.gwdg.de/pub/linux/misc/packman/suse/11.0/* /opt/SUSE/11.0/packman
[/code]


The clients booted the kernel and initrd via pxe.
The exact pxelinux config stanza was:

LABEL opensuse
KERNEL linux
APPEND initrd=initrd showopts install=http://host/SUSE/11.0/oss vnc=1 vncpassword=foo


More details about my test, including vc4 error messages are here:
http://lists.opensuse.org/opensuse/2008-06/msg02365.html
It's really part of this thread but I was posting from a new machine without the prior posts to reply to:
http://lists.opensuse.org/opensuse/2008-06/msg01931.html

And at least someone else has the same problem:
http://forums.opensuse.org/install-boot-login/386983-remote-vnc-controlled-installation-fails-opensuse-11-0-a.html

I'd say there is a problem that doesn't affect every machine, and appears in a feature that not many people use, and so the reports will be a bit rare, but I think it's hard to say the problem doesn't exist. But perhaps it's just hard for me to say that because I am staring at 2 machines showing the problem.

Remember, the same feature in the 10.3 installer does not fail on the same hardware, 2 different copies of the same hardware.

Comment 32 Andrew Joakimsen 2008-07-02 02:24:06 UTC
*** Bug 339482 has been marked as a duplicate of this bug. ***
Comment 33 white brian 2008-07-02 02:47:58 UTC
How in the world do you arrive at the conclusion that that bug is the same as this one, just because they both have "installer" and "vnc" in the text?

From what I can see they are not even remotely related.

I didn't notice any problem with nics being recognized out of expected order and given unexpected device names. The nics worked fine and the vnc client actually does connect to the vnc server and negotiates the password and displays a screen, merely somewhat scrambled but not beyond all recognition or anything. Doesn't that kind of put us past any issues with the nic? Bridging was not enabled in the bios either, this bios doesn't even have that ability, so apps are either talking to the correct nic or they're not talking at all.

But forget all that. The other bug happens _after install_, after the reboot and into the 2nd stage of install, which is really just configuration, no longer installation.

This bug happens _before install_.

This is stupid.

Comment 34 Andrew Joakimsen 2008-07-02 02:52:26 UTC
That was an error. Sorry, I have to dig through Bugzilla again to find the mystery bug Mr. Dirsch referenced in #19. It must have been a typo while marking the duplicate... hang tight.

Comment 35 Andrew Joakimsen 2008-07-02 02:59:28 UTC
*** Bug 389386 has been marked as a duplicate of this bug. ***
Comment 36 Andrew Joakimsen 2008-07-02 03:04:03 UTC
Bug # 339482 *IS NOT* a duplicate.

Bug # 389386 *IS* an ***UNRESOLVED*** duplicate.
Comment 37 Stefan Dirsch 2008-07-02 06:25:35 UTC

*** This bug has been marked as a duplicate of bug 389386 ***