|
Bugzilla – Full Text Bug Listing |
| Summary: | Unable to install openSUSE 10.3 beta2 in X mode on Xen server | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 10.3 | Reporter: | Joe Harmon <jharmon> |
| Component: | X.Org | Assignee: | Stefan Dirsch <sndirsch> |
| Status: | VERIFIED FIXED | QA Contact: | E-mail List <xorg-maintainer-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | aschnell, eich, ms, sndirsch |
| Version: | Beta 2 | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Bug Depends on: | 290073 | ||
| Bug Blocks: | |||
| Attachments: |
memory information
scren shot of virt-manager yast logs xorg.0.log xorg.conf Xorg.0.log blubber-xen.patch Factory Install, fails gui install Installed 10.3 domU and no video fbdevhw.diff |
||
|
Description
Joe Harmon
2007-06-19 14:16:29 UTC
Please check that the memory you reserved for the Xen DomU was set correctly (cat /proc/meminfo or just top on the DomU) and attach the y2logs. http://en.opensuse.org/Bugs/YaST Thanks!
Created attachment 149467 [details]
memory information
It should have plenty of memory. I told xen to give it 768 meg. See xen.png screen shot as well.
Created attachment 149468 [details]
scren shot of virt-manager
Created attachment 149469 [details]
yast logs
Marking back as assigned. Marcus, could you please check whether it's a problem with starting X? The y2start.log file said:
|-- TestX: XOpenDisplay failed
|-- UI_args: Running in fullscreen mode
|-- X-Server couldn't be started, falling back to ncurses
I assume the standard fbdev driver from 10.3 doesn't work with your
current Xen framebuffer. The X11 log file /var/log/Xorg.0.log
at this first stage of the installation can tell you more.
Anyway, this one is related to the YaST2 startup scripts, which I'm
no longer responsible for. Assigning to the new maintainer.
As Marcus said, please provide the file /var/log/Xorg.0.log so we can see why the X server does not start.
Created attachment 149889 [details]
xorg.0.log
Thanks. Stefan, can you say what the problem with the X-Server is?
(==) Depth 24 pixmap format is 32 bpp
(EE) FBDEV(0): FBIOPUT_VSCREENINFO succeeded but modified mode
(EE) FBDEV(0): mode initialization failed
Looks like a broken kernel framebuffer. What does the xorg.conf during installation look like?
Joe, can you also attach the xorg.conf used during installation? The file can also be found in the installed system as /etc/X11/xorg.conf.install.
Created attachment 149912 [details]
xorg.conf
> #@DefaultDepth@
This looks strange to me. IIRC this should be
DefaultDepth 16
instead.
Did that ever work? @DefaultDepth@ is only replaced if /sys/class/graphics/fb0/name comes from a native nvidia, radeon or r128 kernel fbdev driver. See the comments in the xorg.conf template. I don't know if isax or xmigrate.pl is ever called in the inst-sys.
I don't know. I found the discussion in the bug report mentioned in xorg.conf pretty confusing in the end. :-(
I think the best would be to have access to this machine to figure out if this is a configuration issue at all.
/proc/fb and /sys/class/graphics/fb0/name contain xen
fbset reports 800x600 depth 32
Add 'start_shell' to the install kernel cmdline and we can work it out.
Created attachment 149928 [details]
Xorg.0.log
I can force DefaultDepth 16 by faking /sys/class/graphics/fb0/name
But the kernel fbdev driver can not change the depth apparently.
can you check a 10.2 or sles10 install, if it really can switch from depth 32bit to something else?
I remember the same from PS3. Pre-alpha4 inst-sys (and running system) was unable to handle depth 24 as shown in comment #11. In alpha4 it magically worked again. I have not tried it again since then.
(In reply to comment #20 from Olaf Hering)
> can you check a 10.2 or sles10 install, if it really can switch from depth
> 32bit to something else?
How do I do that during an install?
I am so confused. This has nothing to do with the host machine. It has everything to do with the guest machine. The guest machine is not installed yet. I start the install and receive the error. It is really easy to duplicate and I have given you an identical machine for duplicating.
Joe, just check /etc/X11/xorg.conf and /var/log/Xorg.0.log on a 10.2 inst-sys. Let yast start, switch to console 2 and check the Depth values like this:
grep DefaultDepth /etc/X11/xorg.conf
grep -E '(FBDEV|Depth)' /var/log/Xorg.0.log
On this same test box I have started an installation of SLES 10 SP1 x86_64 as well as openSUSE 10.2 x86_64. For some reason the openSUSE 10.2 installation will not allow me to switch TTYs, so I am unable to get the information on that machine. However, I am able to switch on the SLES 10 SP1 build and I get the following results.
(1) grep DefaultDepth /etc/X11/xorg.conf
no results came back
(2) grep -E '(FBDEV|Depth)' /var/log/Xorg.0.log
(II) FBDEV: driver for framebuffer: fbdev, afb
(II) FBDEV(0): using default device
(==) FBDEV(0): Depth 24, (==) framebuffer bpp 32
(==) FBDEV(0): RGB weight 888
(==) FBDEV(0): Default visual is TrueColor
(==) FBDEV(0): Using gamma correction (1.0, 1.0, 1.0)
(II) FBDEV(0): hardware: xen (video memory: 1875kB)
(**) FBDEV(0): Option "ShadowFB" "off"
(II) FBDEV(0): checking modes against framebuffer device...
(II) FBDEV(0): mode "default" not found
(II) FBDEV(0): checking modes against monitor...
(--) FBDEV(0): Virtual size is 800x600 (pitch 800)
(**) FBDEV(0): Built-in mode "current": 28000.0 MHz, 35000.0 kHz, 58333.3 Hz
(II) FBDEV(0): Modeline "current" 28000.00 800 800 800 800 600 600 600 600 -hsync -vsync -csync
(==) FBDEV(0): DPI set to (75, 75)
(--) Depth 24 pixmap format is 32 bpp
(EE) FBDEV(0): FBIOBLANK: Invalid argument
(==) FBDEV(0): Backing store disabled
(EE) FBDEV(0): FBIOBLANK: Invalid argument
ok, this means the kernel runs in 32bit, X can cope with it. Can you install a 10.3 and enable ssh access to the installed system? I think it's a kernel bug, as Stefan mentioned above.
So are you saying that even though I am installing all of my servers as 64 bit, they are running in 32 bit mode?
Olaf, please investigate. When you can't get any further, reassign back to me. I can understand that Joe is confused when we both let him investigate different confusing things.
Created attachment 150011 [details]
blubber-xen.patch
This is about color depth, not about cpu types.
test patch for FBIOPUT_VSCREENINFO
rpms can be found in
http://boettger.suse.de/inst/olh/bug285523/
(In reply to comment #31 from Olaf Hering)
> This is about color depth, not about cpu types.
Yeah, I don't know why I thought otherwise, I think that I was just caught up in too many things yesterday.
> test patch for FBIOPUT_VSCREENINFO
> rpms can be found in
> http://boettger.suse.de/inst/olh/bug285523/
How do I enable this patch so that it can be used during the install?
You can't. Just do a ncurses install, X will also fail in the running system. Upgrade to this kernel, and it will print some blurb to dmesg.
Looks like we have a bigger problem than this issue, in that openSUSE 10.3 won't even fully install on Xen. I will try the alpha5plus.
nm, I just found out from Andreas that they didn't build an x86_64 version of alpha5plus because it was so broken.
can you install an alpha5 on that system?
(In reply to comment #36 from Olaf Hering)
> can you install an alpha5 on that system?
No, same issue with alpha5.
(In reply to comment #37 from Joe Harmon)
> No, same issue with alpha5.
In fact I just put in https://bugzilla.novell.com/show_bug.cgi?id=290073 for not being able to install because of an initrd issue.
can you install a 10.2? that would help as well with the ioctl debugging.
(In reply to comment #39 from Olaf Hering)
> can you install a 10.2? that would help as well with the ioctl debugging.
Okay, it is installed on the box.
how do I get the graphical console with the opensuse102 image?
(In reply to comment #41 from Olaf Hering)
> how do I get the graphical console with the opensuse102 image?
After you start the VM you have to go to View, Serial Console. I have opened one up on that machine. However, X is not running, so I hope that you are able to get what you need from the serial console.
There is no framebuffer in that kernel. Maybe it is a feature of a later kernel version.
After installing my test kernel, the system hangs at mounting the root filesystem. Is alpha6 installable already?
This bug is a duplicate of bug #296679 or the other way around. That bug has been assigned to the same guy that fixed it for sles.
Reassigned to Xen responsible person.
Should this just be closed out/marked as a duplicate, since it duplicates bug #296679 and that bug is assigned to the people that fixed the same problem in sles?
IMO no.... If you want to close one as a duplicate, then 296679 should be a duplicate of this one, because it was opened after this one and this bug has all of the debugging information in it.
It still doesn't work for me on openSUSE 10.3 alpha 7 x86_64.
*** Bug 296679 has been marked as a duplicate of this bug. ***
This appears to be fixed on 32 bit domUs. 64 is currently broken, supposedly because of the switch to squashfs.
Sorry I was confused, ignore comment #58. This is still broken for 32 bit. As of factory.
This is the error in my Xorg log file during the install:
(EE) FBDEV(0): FBIOPUT_VSCREENINFO succeeded but modified mode
(EE) FBDEV(0): mode initialization failed
Fatal server error: AddScreen/ScreenInit failed for driver 0
I'm going to assume that 64 is still broken too
Created attachment 158336 [details]
Factory Install, fails gui install
Created attachment 158542 [details]
Installed 10.3 domU and no video
I finally got openSUSE 10.3 Factory installed as a domU. There is no graphical interface. I've attached the SaX.log file. fbset is:
mode "800x600"
geometry 800 600 800 600 32
timings 0 0 0 0 0 0 0
rgba 8/16,8/8,8/0,0/0
endmode
Still not working in openSUSE 10.3 beta 2
Bumping up severity as we cannot run X at all.
Pat, would you please make it a priority to get to the bottom of this bug. If it is a Xen issue and you can fix it, please do; if it is a bug outside of Xen, articulate the problem and reassign so it can get fixed. Thanks.
As an install workaround you can do a Remote SSH Installation by adding "ssh=1" to the boot parameters. In virt-manager when setting up the VM, at the Operating System Installation panel add "ssh=1" to the Additional Arguments text field.
> (EE) FBDEV(0): FBIOPUT_VSCREENINFO succeeded but modified mode
> (EE) FBDEV(0): mode initialization failed
hw/xfree86/fbdevhw/fbdevhw.c:fbdevHWSetMode()
    if (!fbdev_modes_equal(&set_var, &req_var)) {
        if (!check)
            xf86DrvMsg(pScrn->scrnIndex, X_ERROR,
                       "FBIOPUT_VSCREENINFO succeeded but modified "
                       "mode\n");
#if DEBUG
        print_fbdev_mode("returned", &set_var);
#endif
        return FALSE;
    }
src/fbdev.c:FBDevScreenInit()
    [...]
    if (!fbdevHWModeInit(pScrn, pScrn->currentMode)) {
        xf86DrvMsg(scrnIndex,X_ERROR,"mode initialization failed\n");
        return FALSE;
    }
fbdev_modes_equal(...) in fbdevHWSetMode() fails because pixclock is different.
(gdb) p set_var->pixclock
$6 = 0
(gdb) p req_var->pixclock
$7 = 35
Therefore fbdevHWModeInit() fails, and therefore you end up with a failure in FBDevScreenInit. mode->Clock is 28000000 (28 MHz sounds reasonable to me). Therefore req_var->pixclock gets set to 35 in xfree2fbdev_timing()
xfree2fbdev_timing()
[...]
var->pixclock = mode->Clock ? 1000000000/mode->Clock : 0;
set_var gets initialized with the same values, but gets changed afterwards
via the ioctl.
    if (0 != ioctl(fPtr->fd, FBIOPUT_VSCREENINFO, (void*)(&set_var))) {
        xf86DrvMsg(pScrn->scrnIndex, X_ERROR,
                   "FBIOPUT_VSCREENINFO: %s\n", strerror(errno));
        return FALSE;
    }
(gdb) p set_var->pixclock
$13 = 0
Broken ioctl?
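To make the arithmetic in this comment concrete, here is a minimal sketch in C. It is not the X.Org source: struct mode_var and both function names are hypothetical stand-ins, and the only piece taken from the thread is the conversion expression quoted from xfree2fbdev_timing(). It shows how a Clock of 28000000 turns into a pixclock of 35, and why a field-by-field comparison then rejects a mode whose pixclock came back as 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the fbdev mode structure. */
struct mode_var {
    uint32_t xres;
    uint32_t yres;
    uint32_t bits_per_pixel;
    uint32_t pixclock;
};

/* Same expression as the line quoted from xfree2fbdev_timing(). */
uint32_t clock_to_pixclock(int clock)
{
    return clock ? (uint32_t)(1000000000 / clock) : 0;
}

/* Strict comparison: every field, including pixclock, has to match. */
bool modes_equal_strict(const struct mode_var *set, const struct mode_var *req)
{
    return set->xres == req->xres &&
           set->yres == req->yres &&
           set->bits_per_pixel == req->bits_per_pixel &&
           set->pixclock == req->pixclock;
}
```

With req.pixclock computed from 28000000 and set.pixclock left at 0 after the ioctl, modes_equal_strict() returns false even though resolution and depth agree, which corresponds to the "succeeded but modified mode" error above.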
This seems to be something deep in the kernel, probably fb_set_var(). I'm not familiar with debugging kernel code. Reassigning to kernel guys.
I mean kernel Xen guys, of course. :-)
> This seems to be something deep in the kernel, probably fb_set_var().
I should have looked at Olaf's patch before. But since we came to the same conclusion independently, it shouldn't be that wrong. :-)
Looking at fb_set_var(), it is clear that pixclock cannot be set to anything (and really this field, by its name, is meaningless for a virtual frame buffer): if info->fbops->fb_check_var is not set (as is the case for xenfb), the user-mode supplied var is simply overwritten with the current settings. I can't see anything wrong in the behavior of Xen here - Stefan, why do you think that fb_set_var() must actually do *any* change to the current settings, if the low-level driver knows the settings aren't meaningful (currently the assumption in XFree86 appears to be that changes to the requested mode settings are done by the kernel only if the low-level driver considers the settings wrong)?
set_var->pixclock is set to 0 via the ioctl. I tried to explain this in my comments #68-70. I'm not sure if it's fb_set_var. I didn't debug the kernel. Did you?
That is what I said - whatever user mode provides as input will be overwritten with the current settings kernel mode knows of if the low level driver doesn't specify an fb_check_var handler. Since the pixclock field (as being meaningless) never gets set by xenfb (and the generic code cannot possibly set it to any meaningful value), you see it coming back as zero. My question was why you think this isn't correct.
fbdev_modes_equal(&set_var, &req_var) thinks that each struct member (including pixclock) needs to be the same. So the difference between a Xen and a non-Xen kernel is that no fb_check_var handler is defined for Xen and so the pixclock field is set to a different value of 0 (by default).
So
* either we should no longer compare pixclock in fbdev_modes_equal() (or ignore a different value if it's 0 - assuming a Xen kernel in this case), because it's meaningless anyway (I'm not sure if it really is)
* or make sure to return back the same pixclock value when using the Xen kernel - which is not possible if I understand you correctly because no fb_check_var handler is defined
I'm still wondering why this problem pops up now. I'm pretty sure I can rule out any changes in X.Org in this area. So something must have been changed for openSUSE 10.3 in the (Xen) kernel.
10.2 doesn't have a xenfb kernel component, but SLE10's fb_set_var() definitely behaves the same. I can only assume X deals with the situation differently, and am therefore tempted to assign this back... Please clarify how SLE10's X handles this condition. Regardless of the result, I think that the first of the two options you presented is the way to go.
> Please clarify how SLE10's X handles this condition.
I'll double check this.
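For readers following the fb_set_var() discussion, the behaviour described above can be modelled in a few lines of user-space C. This is a simplified sketch, not kernel code: struct var_info, struct fb_model and model_set_var() are hypothetical names, and the real function does considerably more. The only behaviour modelled is the one stated in the thread: when the driver provides no check handler, the caller's structure is overwritten with the current settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the mode description passed in by user space. */
struct var_info {
    uint32_t xres;
    uint32_t yres;
    uint32_t bits_per_pixel;
    uint32_t pixclock;
};

/* Hypothetical model of a framebuffer device and its optional handler. */
struct fb_model {
    struct var_info current;                      /* settings the driver knows */
    int (*check_var)(struct var_info *requested); /* NULL for a xenfb-like driver */
};

/* Returns 0 on success, mirroring an ioctl that "succeeds". */
int model_set_var(struct fb_model *fb, struct var_info *requested)
{
    if (fb->check_var == NULL) {
        /* No handler: the request is replaced by the current settings. */
        *requested = fb->current;
        return 0;
    }
    int err = fb->check_var(requested);
    if (err != 0)
        return err;
    fb->current = *requested;
    return 0;
}
```

If current.pixclock was never set and is therefore 0, a request carrying pixclock 35 comes back with pixclock 0 while the call still reports success, which is the situation the gdb output earlier in the thread shows.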
Just for the record.
non-Xen kernel
--------------
# fbset
mode "800x600-75"
# D: 48.001 MHz, H: 46.876 kHz, V: 75.121 Hz
geometry 800 600 800 600 32
timings 20833 96 32 16 4 96 4
rgba 8/16,8/8,8/0,8/24
endmode
(--) FBDEV(0): Virtual size is 800x600 (pitch 800)
(**) FBDEV(0): Built-in mode "current": 48.0 MHz, 46.9 kHz, 75.1 Hz
(II) FBDEV(0): Modeline "current" 48.00 800 832 928 1024 600 604 608 624 -hsync -vsync -csync
(==) FBDEV(0): DPI set to (75, 75)
Xen kernel
----------
# fbset
mode "800x600"
geometry 800 600 800 600 32
timings 0 0 0 0 0 0 0
rgba 8/16,8/8,8/0,0/0
endmode
(--) FBDEV(0): Virtual size is 800x600 (pitch 800)
(**) FBDEV(0): Built-in mode "current": 28000.0 MHz, 35000.0 kHz, 58333.3 Hz
(II) FBDEV(0): Modeline "current" 28000.00 800 800 800 800 600 600 600 600 -hsync -vsync -csync
(==) FBDEV(0): DPI set to (75, 75)
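The numbers in the two listings above can be checked by hand. The sketch below assumes the usual fbset field order for the timings line (pixclock, left, right, upper, lower, hslen, vslen) and that pixclock is a pixel period in picoseconds; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for one fbset "geometry" plus "timings" line. */
struct fb_timings {
    uint32_t xres, yres;
    uint32_t pixclock;     /* picoseconds per pixel, 0 if not reported */
    uint32_t left, right;  /* horizontal margins */
    uint32_t upper, lower; /* vertical margins */
    uint32_t hslen, vslen; /* sync lengths */
};

/* Dot clock in Hz, or 0 when the kernel reports no pixel clock. */
double dot_clock_hz(const struct fb_timings *t)
{
    return t->pixclock ? 1e12 / (double)t->pixclock : 0.0;
}

/* Horizontal frequency: dot clock divided by the total line length. */
double hsync_hz(const struct fb_timings *t)
{
    uint32_t htotal = t->xres + t->left + t->right + t->hslen;
    return dot_clock_hz(t) / (double)htotal;
}

/* Vertical refresh: horizontal frequency divided by the total line count. */
double vsync_hz(const struct fb_timings *t)
{
    uint32_t vtotal = t->yres + t->upper + t->lower + t->vslen;
    return hsync_hz(t) / (double)vtotal;
}
```

For the non-Xen timings (20833 96 32 16 4 96 4) this gives roughly 48.0 MHz, 46.9 kHz and 75.1 Hz, in line with the fbset comment. For the Xen case every timing field is 0, so nothing can be derived from the kernel; the 28000.0 MHz, 35000.0 kHz and 58333.3 Hz in the log are consistent with a clock value of 28000000 divided by 800 and then by 600.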
> > Please clarify how SLE10's X handles this condition.
> I'll double check this.
I was proven wrong and apologize. A lot of things changed in fbdevhw.c between SLES10 (X.Org 6.9) and openSUSE 10.3 (xorg-server 1.3.0.0). Code was reorganized, new checks have been added (among these is fbdev_modes_equal).
commit f6815cb68b0f6698497348fc6e4214dacef33b95
Author: Michel Dänzer <michel@tungstengraphics.com>
Date: Sat Dec 30 10:18:28 2006 +0100
fbdevhw: Consolidate modeset ioctl calling, report failure if it modifies mode.
The fbdev API allows the driver to 'accept' modes it doesn't really support by modifying it to the nearest supported mode. Without this check, e.g. vesafb would appear to accept all modes, even though it actually can't set any modes other than the bootup mode at all.
==> X.Org bug
Created attachment 161573 [details]
fbdevhw.diff
Patch. At least Xserver starts now.
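The attached fbdevhw.diff is not reproduced in this listing, so the following is only a guess at the general shape of such a fix, based on the option discussed in this thread of ignoring a returned pixclock of 0. struct mode_var and modes_equal_relaxed() are hypothetical names, not the identifiers used in xorg-server.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the fbdev mode structure. */
struct mode_var {
    uint32_t xres;
    uint32_t yres;
    uint32_t bits_per_pixel;
    uint32_t pixclock;
};

/*
 * Relaxed comparison: geometry and depth must match, but a returned
 * pixclock of 0 is read as "the kernel reports no valid pixel clock"
 * and the requested value is then accepted as is.
 */
bool modes_equal_relaxed(const struct mode_var *set, const struct mode_var *req)
{
    if (set->xres != req->xres ||
        set->yres != req->yres ||
        set->bits_per_pixel != req->bits_per_pixel)
        return false;
    return set->pixclock == 0 || set->pixclock == req->pixclock;
}
```

A driver that really changes the pixel clock to a different non-zero value is still rejected, so the intent of the original check for drivers such as vesafb is kept.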
Pat. I updated /usr/lib/xorg/modules/linux/libfbdevhw.so with the above patch applied. Could you check if the Xserver works now?
This still sounds strange to me.
(**) FBDEV(0): Built-in mode "current": 28000.0 MHz, 35000.0 kHz, 58333.3 Hz
(II) FBDEV(0): Modeline "current" 28000.00 800 800 800 800 600 600 600 600
-hsync -vsync -csync
> (**) FBDEV(0): Built-in mode "current": 28000.0 MHz, 35000.0 kHz, 58333.3 Hz
> (II) FBDEV(0): Modeline "current" 28000.00 800 800 800 800 600 600 600 600
> -hsync -vsync -csync
I wonder whether this is a typo in fbdevhw.c:fbdev2xfree_timing()
[...]
mode->Clock = var->pixclock ? 1000000000/var->pixclock : 28000000;
[...]
It looks like it needs to be set in kHz and not in Hz. This code hasn't changed, so this would mean it was broken from the beginning ...
Would be good to see a SLES10 Xen-kernel fbdev Xserver logfile to verify this.
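A small sketch of the suspected unit mix-up, assuming (as the comment above suggests) that the mode clock is meant to be in kHz. The function name and the FALLBACK_CLOCK_KHZ constant are hypothetical; only the conversion expression mirrors the line quoted from fbdev2xfree_timing().

```c
#include <stdint.h>

/* Assumed fallback: 28 MHz written in kHz rather than in Hz. */
enum { FALLBACK_CLOCK_KHZ = 28000 };

/*
 * pixclock is a pixel period in picoseconds, so 1e9 / pixclock yields kHz.
 * When the kernel reports 0, fall back to a default that is also in kHz.
 */
int pixclock_to_clock_khz(uint32_t pixclock_ps)
{
    return pixclock_ps ? (int)(1000000000u / pixclock_ps) : FALLBACK_CLOCK_KHZ;
}
```

With the non-Xen pixclock of 20833 this yields 48000 kHz, matching the 48.0 MHz in the log. With the quoted fallback of 28000000 the same field would read as 28000000 kHz, which is the 28000.0 MHz the Xen log prints.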
Pat, see comment #83.
machine as mentioned in comment #67.
Xserver is now working. I tested it from runlevel 3 startx and runlevel 5 gdm login.
This is definitely great news! Thanks for testing. xorg-x11-server package with patch submitted for STABLE. Unfortunately it's too late for Beta3. :-(
The proposed fix is fine. If the kernel doesn't report a valid pixel clock we accept any (i.e. the default one chosen by the server). The 'default' setting mentioned in comment #84 is definitely broken, which shouldn't matter. However, if a default setting is chosen here, the validation code should also take this into account. Otherwise the 'default' setting will cause this mode to be culled.
Ok. So instead of checking for set->pixclock == 0 we could also check for the default value req->pixclock == 35000 (after fixing the typo in fbdev2xfree_timing).
Again. http://download.opensuse.org/repositories/xorg73/openSUSE_Factory/ Why is it that difficult to download a file ???
It will be fixed with Beta4. Bugs are closed as fixed once the fix has been submitted. This does not necessarily mean that the fix is available at the same time in FACTORY. There can be a delay of days or even weeks before it gets checked in, built and finally pushed to FACTORY. It seemed you wanted to test it ASAP, so I mentioned the buildservice, which I use to provide the current X.Org packages as well. Usually fixes there are available at nearly the same time as I close the bug report. I added the RPM changelog, since this is the best method to verify that the fix is really included with the package. Hope this helps.
Thanks for the answer. I really appreciate it. I've more or less been coming up to speed on factory. I've been a user for some time, but never on the development side. It's been a little hard on the bugzilla side as many have thought that I know exactly what is wanted/needed. There is a different methodology here than I've seen at previous jobs.
As far as the product that I work on, it ends up directly in factory and not a build service. So, again thanks for explaining that.
This is great! Marking as verified |