Bugzilla – Full Text Bug Listing
| Summary: | /etc/init.d/kbd: setfont breaks first Xserver start | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.0 | Reporter: | Stefan Dirsch <sndirsch> |
| Component: | Kernel | Assignee: | Juergen Weigert <jw> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | aj, coolo, cybernerd94, dmueller, eich, jeffm, nvbugs, rodw, roger.larsson, sreeves, sshaw, werner |
| Version: | Alpha 2 | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | kbd-1.12-132.i586.rpm, kbd-1.12-132.x86_64.rpm, patch 001, patch 002, patch 003, new patch for 002 | ||
Description
Stefan Dirsch
2007-08-21 00:35:19 UTC
*** Bug 299623 has been marked as a duplicate of this bug. ***
*** Bug 297441 has been marked as a duplicate of this bug. ***
The problem only occurs when the VGA text console is active, which is a much less frequently exercised code path, as most people today use vesafb. Apparently setfont performs some hardware access despite the fact that the console is in KD_GRAPHICS mode. The described fix is really a workaround that avoids the real problem. This issue should be looked at. Please reassign it to me with RESOLVED LATER when done.
*** Bug 298799 has been marked as a duplicate of this bug. ***
Reassigning to Coolo. earlyxdm is part of the preload package.
Dirk, can you please take care of this bug?
Hmm. I've looked a bit at the setfont code and the kernel code. I believe that dropping the kbd patch "be-nice-to-kdm" for setfont and replacing it with a return -1 in line 269 would fix the issue.
One part I don't understand in the kernel is that the PIO_FONT calls don't always work on the vty that was passed in, but most of the time on the current foreground vty. However, sometimes they operate on the one that was passed in, and I guess a check is missing here whether this is the one in the foreground, so that the graphics card is not trashed otherwise. To me, it's a kernel bug.
This was already noticed in 1998: http://www.ussg.iu.edu/hypermail/linux/kernel/9808.0/0279.html so not calling the "compatibility crap" from setfont might be a good idea.
Compatibility crap removed from setfont. No more fighting against xdm with delay loops. We are faster now, so earlykbd is again the way to go. Submitted to stable. Stefan, please test and close.
Created attachment 158818 [details]
kbd-1.12-132.i586.rpm
Created attachment 158819 [details]
kbd-1.12-132.x86_64.rpm
Magnus, please test again with the framebuffer disabled and with the updated kbd RPM installed. Also add kbd to Required-Start of earlyxdm.
That's not gonna fix it. I was just able to reproduce this issue on my ATI-based laptop, and given some patience, I even figured out the kernel code responsible for it.
Clearing needinfo. Awaiting further information as per comment #14.
Created attachment 158945 [details]
patch 001
Created attachment 158947 [details]
patch 002
Created attachment 158948 [details]
patch 003
I'm not really sure if patch 2 and 3 are correct. On fbdev consoles the fonts can be set per console and the font data is not kept in hardware. So there is no reason to restrict setting fonts to be set only on the visible console.
In fact the patch may break our setfont stuff as it will only set the correct font on the visible console.
That the font data is kept in hardware (and only there) and can only be set for all consoles at once is a property of the VGA console.
Thus this problem should be fixed there, which means the VGA console should bail out if someone tries to write a font while the console is in KD_GRAPHICS mode.
According to the other comments here I assume (but have not checked) that only the console that is opened by the Xserver is actually put into KD_GRAPHICS.
Therefore we need to check all consoles in vgacon to see if one of them is in this mode, and bail out if this is the case.
This tiny (completely untested) patch should do the job:
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -1032,6 +1032,11 @@ static int vgacon_do_font_op(struct vgas
 	unsigned short video_port_status = vga_video_port_reg + 6;
 	int font_select = 0x00, beg, i;
 	char *charmap;
+
+	for (i = 0; i < MAX_NR_CONSOLES; i++) {
+		if (vc_cons[i].d->vc_mode == KD_GRAPHICS)
+			return -EINVAL;
+	}
 
 	if (vga_video_type != VIDEO_TYPE_EGAM) {
 		charmap = (char *) VGA_MAP_MEM(colourmap, 0);
This also means that we need to make rcxdm run *after* setfonts() has run.
(The only way to save/restore fonts on vga console is to read them out to user space and restore them from there).
Created attachment 159682 [details]
new patch for 002
I agree, patch 3 is not really correct. I've updated patch 2 to use vga_is_gfx instead, which should fix the bug.
Dirk, to clarify, your latest attachment is the fix?
The latest attachment is the minimally necessary fix. Patch 001 is useful as well, though on Factory we patched the setfont utility to no longer activate this code path. Note that there is another duplicate of this bug tracking the very same bug for SLE10.
The second patch prevents setfont from accessing VGA registers on the card when the console is in graphics mode (KD_GRAPHICS), as we assume that someone else (i.e. the Xserver) is in charge of the hardware. In that case accessing the VGA registers may (at best) have no effect (not even the desired one) or (at worst) interfere with settings the graphics driver has made.
I'm going to revert the workaround now that we have a fix.
So dear kernel maintainers: please take care. For the worried: we decided not to rush it into beta3 (if beta3 is delayed, we'll put it in; if not, not) but to make an online update. Also to test kernel updates.
Patch has been added to our kernel CVS for STABLE.
*** Bug 308769 has been marked as a duplicate of this bug. ***
http://en.opensuse.org/Bugs:Most_Annoying_Bugs_10.3_dev
* setfont breaks first Xserver start (Bug #302010) -> online update pending
No longer pending - it's released. Thanks.
Scott, please test if the kernel update really fixes the problem.
*** Bug 309015 has been marked as a duplicate of this bug. ***
I have applied the kernel update. It seemed to be fixed. But: have you tried switching to init 3 and then back to init 5? The graphical output gets distorted again then.
Don't hijack bugs. init 5 -> init 3 -> init 5 is hardly the first X server start.
I'm not sure, Coolo. It could be the same issue. Ladislav, is the issue still reproducible after adding kbd as Required-Start to earlyxdm or enabling the kernel framebuffer (I'm assuming it is disabled)? Is the graphical output fixed after pressing Ctrl-Alt-BS (starts a new Xserver)?
(In reply to comment #35 from Stefan Dirsch)
> Ladislav, is the issue still reproducible after adding kbd as Required start to earlyxdm
After that, the X server is broken right after booting.
> or enabling the kernel framebuffer (I'm assuming it is disabled)?
How can I enable framebuffer on old PCI Matrox card? Maybe someone should try to reproduce this on a different machine with a more recent graphics card.
> Is the graphical output fixed after pressing Ctrl-Alt-BS (starts a new Xserver)?
Yes.
> How can I enable framebuffer on old PCI Matrox card?
Generic kernel framebuffer! Add vga=0x317 to each "kernel ..." line in /boot/grub/menu.lst. Even an old Matrox PCI card supports generic kernel framebuffer.
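For anyone who wants to script the menu.lst change suggested above, here is a minimal sketch. It assumes the GRUB legacy layout where each boot entry has a line starting with "kernel"; `add_vga_mode` is a made-up helper name, not part of any SUSE tool. Entries that already carry a vga= option are left alone. Please try it on a copy of the file first.

```shell
#!/bin/sh
# Sketch only: append a vga= mode to every "kernel ..." line of a GRUB legacy
# menu file that does not already have one.
# add_vga_mode is a hypothetical helper name used for illustration.
add_vga_mode() {
    menu=$1            # e.g. a copy of /boot/grub/menu.lst
    mode=${2:-0x317}   # mode suggested in the comment above
    # For lines starting with "kernel": skip those that already contain
    # " vga=", otherwise replace trailing whitespace with " vga=<mode>".
    sed -e '/^[[:space:]]*kernel[[:space:]]/{/[[:space:]]vga=/!s/[[:space:]]*$/ vga='"$mode"'/;}' \
        "$menu" > "$menu.new" || return 1
    # Write back through the original file so its permissions are kept.
    cat "$menu.new" > "$menu" && rm -f "$menu.new"
}

# Usage (commented out on purpose):
#   cp /boot/grub/menu.lst /tmp/menu.lst && add_vga_mode /tmp/menu.lst
```

A reboot is still needed afterwards for the framebuffer console to become active.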
I think this problem has nothing to do with this particular bug. Please open a new bug report.
After reboot everything is OK, also with framebuffer. I can't reproduce it even on my workstation. I think it's fixed. Closing. Sorry for the noise.
*** Bug 298795 has been marked as a duplicate of this bug. ***
> also with framebuffer
The issue does not occur with framebuffer. If it went away by enabling the framebuffer it is rather likely that the issue still exists. :-(
(In reply to comment #41 from Stefan Dirsch)
> > also with framebuffer
> The issue does not occur with framebuffer. If it went away by enabling the
> framebuffer it is rather likely that the issue still exists. :-(
It stopped occurring at all. It's probably a hardware problem. I have more strange problems on that machine, so I think this needs proper testing on more computers with various hardware configurations.
OK. So it's more of a coincidence.
I've been having issues with X on my 10.3rc1 64-bit machine and all bugs I find matching my issues refer to this bug. Was this supposed to have been fixed in rc1 or is this still a work in progress?
*** Bug 331624 has been marked as a duplicate of this bug. ***
Given the number of bugs we got for this issue since releasing 10.3, I no longer think this bug can be considered fixed. :-(
Workaround to fix race condition between kbd and earlyxdm
---------------------------------------------------------
1) add kbd to "Should-Start:" line of /etc/init.d/earlyxdm
2) run insserv
3) reboot the machine
If this still doesn't help, add 'sleep <seconds>' right before
'exec /etc/init.d/xdm ${1+"$@"}' in /etc/init.d/earlyxdm. Reboot.
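Step 1) of the workaround can be done by hand in an editor; a minimal sketch of doing it from a script follows. It assumes the LSB header of the init script already contains a "# Should-Start:" line (if there is none, nothing is changed), and `add_should_start` is a made-up helper name. Steps 2) and 3) stay as described above.

```shell
#!/bin/sh
# Sketch only: append a service name to the "# Should-Start:" line of an
# init script's LSB header, unless it is already listed there.
# add_should_start is a hypothetical helper name used for illustration.
add_should_start() {
    script=$1     # e.g. /etc/init.d/earlyxdm
    service=$2    # e.g. kbd
    # Nothing to do if the service is already on the Should-Start line.
    if grep -Eq "^#[[:space:]]*Should-Start:.*[[:space:]]$service([[:space:]]|\$)" "$script"; then
        return 0
    fi
    sed -e "/^#[[:space:]]*Should-Start:/s/[[:space:]]*\$/ $service/" \
        "$script" > "$script.new" || return 1
    # Write back through the original file so the executable bit is kept.
    cat "$script.new" > "$script" && rm -f "$script.new"
}

# Usage (commented out on purpose):
#   add_should_start /etc/init.d/earlyxdm kbd
#   insserv
```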
IMHO we should make 1) the default for 11.0 and even provide an updated preload RPM for 10.3. I think that making sure that our customers get a proper keyboard after booting is more important than gaining maybe one second in the boot process.
*** Bug 309484 has been marked as a duplicate of this bug. ***
*** Bug 331528 has been marked as a duplicate of this bug. ***
Stephan, is there a way to find out from the system configuration if the problem will hit? I did not read all comments in all duplicates, but it seems it's framebuffer specific and we can disable earlyxdm only on those for 10.3.
(In reply to comment #50 from Stephan Kulow)
> Stephan, is there a way to find out from the system configuration if the
> problem will hit? I did not read all comments in all duplicates, but it seems
> it's frame buffer specific and we can disable earlyxdm only on those for 10.3
I'm not sure if it only happens with enabled framebuffer. I'm not aware of any confirmation from the affected persons in this bug report that the issue has been fixed with the kernel patch when the framebuffer is disabled. Magnus, Stephen (Shaw), Ladislav, Scott, Julian? I'm not aware of any way to find out from the system configuration if the problem will hit. I'm only aware of a way to prevent this issue. ;-)
In the case of bug #331528, the discriminating factor appears to be whether KBD_TTY in /etc/sysconfig/keyboard contains "tty1...tty6" or "tty1...tty20". The latter is problematic, while the former isn't. I can't speak for the other duplicates, but from the similarity of the reports, I think it is very likely that the same applies.
Stefan: if you add kbd to earlyxdm you pretty much erase the difference between earlyxdm and xdm, because kbd is started very late in the process (which is also the reason for the trouble :).
How can tty20 end up in KBD_TTY?
From init.d/kbd:
if test -z "$KBD_TTY"; then
    # >=tty7 left out intentionaly
    KBD_TTY="tty1 tty2 tty3 tty4 tty5 tty6"
fi
So I guess the fix is to filter out tty7 even if it's in sysconfig. JW, that would be your work around then :)
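A minimal sketch of what such a filter could look like, assuming the list is a space-separated string of names like "tty1 tty2 ..." as in /etc/sysconfig/keyboard. `filter_kbd_tty` is a made-up name and this is not the code that went into the kbd package; it only illustrates filtering by device name rather than by whether the device is writable.

```shell
#!/bin/sh
# Sketch only: keep the entries tty1..tty<max> of a KBD_TTY-style list and
# drop everything else (higher ttys, non-numeric names).
# filter_kbd_tty is a hypothetical helper name used for illustration.
filter_kbd_tty() {
    max=${2:-6}    # highest console number that may get a font
    kept=""
    for tty in $1; do
        case $tty in
            tty[0-9]|tty[0-9][0-9]) n=${tty#tty} ;;   # numeric console name
            *) continue ;;                            # anything else is skipped
        esac
        [ "$n" -ge 1 ] && [ "$n" -le "$max" ] || continue
        kept="${kept:+$kept }$tty"
    done
    printf '%s\n' "$kept"
}

# Usage (commented out on purpose):
#   . /etc/sysconfig/keyboard
#   KBD_TTY=$(filter_kbd_tty "$KBD_TTY")
```

With the default limit of 6, a sysconfig value listing tty1 to tty20 would be reduced to tty1 to tty6 before setfont is called.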
Regarding bug #309484: I did the kernel update with YOU, I don't have a framebuffer, and I still can't see the special Czech characters displayed correctly. Instead "?" I see strange font. But no other problems occur.
Well, I just don't believe that this issue only occurs on systems with tty7 and higher in KBD_TTY. Ladislav, what does your KBD_TTY line look like?
file /etc/init.d/kbd; from line 85:
# Calculate KBD_TTY array only once
# Caution: Keep in sync with earlykbd.init
#
if test -z "$KBD_TTY"; then
    # >=tty7 left out intentionaly
    KBD_TTY="tty1 tty2 tty3 tty4 tty5 tty6"
fi
newkbd=""
for tty in $KBD_TTY; do
    test -w /dev/$tty || continue
    test -c /dev/$tty || continue
    > /dev/$tty &> /dev/null || continue
    newkbd="${newkbd:+$newkbd }/dev/$tty"
done
KBD_TTY="$newkbd"
unset newkbd
So checking for tty7 wouldn't help at all.
Regarding comment #54: ttys 7-20 ended up in /etc/sysconfig/kbd because SUSE put them there at some point. Just as with the other reporter who had the extra ttys, this system has been upgraded many times, so I've no idea at what point that was. They stay in KBD_TTY because nothing in /etc/init.d/kbd removes them!
Regarding comments #58-59: KBD_TTY is only set in /etc/init.d/kbd if it is empty or unset at that point, after sourcing /etc/sysconfig/keyboard. In other words, it just provides a default. Certainly, in the case that it's tty{1..20} the conditional is dead code and the filter does nothing. If setfont should really not be called on ttys > 6, then /etc/init.d/kbd should actually filter the list based on device name, not just on whether it is a writable character device.
If I understand the comments above, Mr. Michnovic no longer has any problems logging in, but only with the display of Czech characters; this sounds like a very different bug from the setfont/kdm login bugs that have been marked as duplicates of this one. Whatever the case, my own setfont problems were strictly limited to the extra ttys being setfont'd after earlyxdm, and this was the case for at least some other submitters.
OK. Let's begin by making sure setfont is no longer used for tty7 and higher.
Mike is another victim.
magellan:/etc/sysconfig/keyboard
KBD_TTY="tty1 tty2 tty3 tty4 tty5 tty6 tty7 tty8 tty9 tty10 tty12 tty13 tty14 tty15 tty16 tty17 tty18 tty19 tty20"
Mike, could you remove anything higher than tty6? This should help in the future.
Thank you, I've removed the entries higher than tty6 now.
SUSE LINUX 10.1 / SLES10 was already limited to the range tty1-tty6. So this changed somewhere between SLES9 (SUSE LINUX 9.1) and SLES10 from the range tty1-tty20 to the range tty1-tty6. So only users who have been updating since at least SuSE Linux 10.0 or SLES9 can/will be affected by this issue.
... -> 10.0 -> 10.1 -> 10.2 -> 10.3 -> ...
... -> SLES9 -> SLES10 -> SLES11 -> ...
I'm mostly concerned about SLES customers here.
This one is much easier. :-) Thanks, Jürgen!
lsof /dev/tty[0-9]* | grep X | sed -e 's@.* @@'
With this /usr/bin/lsof should become /bin/lsof ..
and lsof can be used within other programs with e.g.
lsof -nF cpn -- /dev/tty[0-9]*
which list with identifiers the process command name, the
process ID, and the file name.
Another option would be to use fuser
fuser -n file /dev/tty[0-9] /dev/tty[0-9][0-9] 2>&1
but note that the PIDs are on stdout whereas the names will
be printed on stderr (therefore I've used the redirect).
And the PID has to be checked with ps or by some other means.
Last but not least the normal ps command could be used:
. /etc/sysconfig/keyboard
ps -o tty=,comm= t ${KBD_TTY// /,}
tty4 mingetty
tty5 mingetty
tty6 mingetty
tty2 mingetty
tty3 mingetty
tty7 X
tty1 mingetty
With the tty list written out it looks like
ps -o tty=,comm= t tty1,tty2,tty3,tty4,tty5,tty6,tty7
tty4 mingetty
tty5 mingetty
tty6 mingetty
tty2 mingetty
tty3 mingetty
tty7 X
tty1 mingetty
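A minimal sketch of how the ps output above could be reduced to the ttys currently held by an X server. `ttys_held_by_x` is a made-up name, and matching the command names "X" and "Xorg" is an assumption about how the server shows up in the process list. The result is only a snapshot taken at the moment ps runs.

```shell
#!/bin/sh
# Sketch only: read "tty command" pairs (as printed by
# "ps -o tty=,comm= t <list>") on stdin and print the ttys whose
# command is an X server.
# ttys_held_by_x is a hypothetical helper name used for illustration;
# the command names "X" and "Xorg" are assumptions.
ttys_held_by_x() {
    awk '$2 == "X" || $2 == "Xorg" { print $1 }'
}

# Usage (commented out on purpose):
#   . /etc/sysconfig/keyboard
#   ps -o tty=,comm= t "$(echo $KBD_TTY | tr ' ' ',')" | ttys_held_by_x
```

For the listing shown above this would print tty7.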
Well, thinking about it again: it's still a race. So checking for X using some tty won't help us in the end. :-( Better to just not use any tty > 6.
*** Bug 360373 has been marked as a duplicate of this bug. ***
(In reply to comment #68 from Stefan Dirsch)
> Well, thinking about it again. It's still a race. So checking for X using
> some tty won't help us in the end. :-( Better just not using any tty > 6.
Fixed for STABLE/Factory and openSUSE 11.0 Beta1.