|
Bugzilla – Full Text Bug Listing |
| Summary: | vmmouse_detect seg faults at vmmouse_proto.c:62 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.3 | Reporter: | Leonardo Chiquitto <lchiquitto> |
| Component: | X.Org | Assignee: | Stefan Dirsch <sndirsch> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <xorg-maintainer-bugs> |
| Severity: | Normal | ||
| Priority: | P3 - Medium | CC: | agraf, aj, michel |
| Version: | Factory | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 11.3 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
vmmouse_detect.core
suggested patch |
||
Created attachment 361512 [details]
vmmouse_detect.core
But the vmmouse driver is running? I don't think it is (this is a physical machine and I couldn't find references to it in the boot or Xorg logs). Anyway, how can I check whether this driver is active?

Look for vmmouse in /var/log/Xorg.0.log (in the guest system).

This is a physical machine with no guests running. Also, there are no references to vmmouse in /var/log/Xorg*. It seems udev [1] is running vmmouse_detect on every boot when it detects an i8042 AUX port.

[1] /lib/udev/rules.d/69-xorg-vmmouse.rules

The segfault is expected and handled accordingly.
xf86-input-vmmouse/tools/vmmouse_detect.c:
[...]
void
segvCB(int sig)
{
#if defined HAVE_XORG_SERVER_1_1_0
   exit(1);
#endif
}

int
main(void)
{
   /*
    * If the vmmouse test is not run in a VMware virtual machine, it
    * will segfault instead of successfully accessing the port.
    */
   signal(SIGSEGV, segvCB);
[...]
*** Bug 610418 has been marked as a duplicate of this bug. ***

It should either not segfault or it should not be run by default. Currently this is just broken: you get core dumps or segfaults at every system start.

AFAICS if we didn't run it, the vmmouse driver wouldn't be used, since no vmmouse device would be available (see /lib/udev/rules.d/69-xorg-vmmouse.rules). If this is the way to go, please let me know.

Is there a different way to figure out if we're in a VMware virtual machine?

I don't know. I guess there is a reason why VMware implemented the detection that way. Michel (@VMware), any hints would be appreciated here.

Is Philip Langdale, the original author of vmmouse_detect, still working at VMware? Could we contact him directly?

Created attachment 366092 [details]
suggested patch
Although I can't explain why, this patch resolves the problem for me.
(In reply to comment #11)
> Created an attachment (id=366092) [details]
> suggested patch
>
> Although I can't explain why, this patch resolves the problem for me.

Subject: Not calling iopl() is triggering an undesired SEGV
References: bnc#604966

Reverts the following upstream commit:

commit bcdec3d0cd4434770cd841c33c030e0d7203881f
Author: Philip Langdale <philipl@fido2.homeip.net>
Date: Thu Oct 23 23:35:28 2008 -0700

    Remove call to iopl(). It's not portable and isn't necessary.

diff --git a/tools/vmmouse_detect.c b/tools/vmmouse_detect.c
index e5f14a3..0dd4827 100644
--- a/tools/vmmouse_detect.c
+++ b/tools/vmmouse_detect.c
@@ -47,6 +47,11 @@ main(void)
    signal(SIGSEGV, segvCB);
 #if defined __i386__ || defined __x86_64__
+   /*
+    * To access i/o ports above 0x3ff, we need to be in iopl(3).
+    */
+
+   iopl(3);
    if (VMMouseClient_Enable()) {
       VMMouseClient_Disable();
       return 0;

And the upstream commit was:

commit bcdec3d0cd4434770cd841c33c030e0d7203881f
Author: Philip Langdale <philipl@fido2.homeip.net>
Date: Thu Oct 23 23:35:28 2008 -0700

    Remove call to iopl(). It's not portable and isn't necessary.

diff --git a/tools/vmmouse_detect.c b/tools/vmmouse_detect.c
index e5f14a3..0dd4827 100644
--- a/tools/vmmouse_detect.c
+++ b/tools/vmmouse_detect.c
@@ -47,11 +47,6 @@ main(void)
    signal(SIGSEGV, segvCB);
 #if defined __i386__ || defined __x86_64__
-   /*
-    * To access i/o ports above 0x3ff, we need to be in iopl(3).
-    */
-
-   iopl(3);
    if (VMMouseClient_Enable()) {
       VMMouseClient_Disable();
       return 0;

Test packages available in:
http://download.opensuse.org/repositories/home:/leonardocf:/branches:/X11:/XOrg/openSUSE_Factory/

Leonardo, if you fix the Sig11 on a non-VMware virtual machine, the program behaves the same as in a VMware virtual machine. But the idea was to segfault and then exit(1) instead of exit(0). So now the vmmouse ID_INPUT tag is always added to udev, which results in using the vmmouse driver on all systems. This does not make sense to me.
Or I did not understand the basics here ...

Philip doesn't have an account here but can be reached at plangdale at vmware dot com. Here's his initial response:
> Anyway, the issue here was slightly convoluted. The standard detection
> mechanism we use has always been to do this port-poke, and if it's
> not a VM, you get a segfault - and you do need iopl() set to allow that
> to work. When I published the vmmouse source, I had the iopl() call in
> there and that was an issue for non-Linux operating systems, but I also
> observed that it wasn't really necessary because the X server did iopl()
> itself - not really a surprise. Later on, I added vmmouse_detect to allow
> the HAL/udev based device detection to work, and those are standalone and
> so what the X server does is irrelevant.
>
> Now, why isn't this an issue for anyone else? It does the detection
> correctly on Ubuntu and other distros (and no one cares about the segfault
> in the failure case). I guess they have funny core handling going on?
>
> Anyway, it seems the right fix is to add the iopl() call in, perhaps only
> in vmmouse_detect as it's still irrelevant in the X server. It also doesn't
> matter in the X server now as the driver isn't loaded unless the device is
> detected.
>
> It also needs to be properly guarded for LINUX. Who knows what's up on BSD
> or similar.
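Philip's point about guarding the call for Linux could look roughly like the following. This is a sketch under assumptions, not the patch that was submitted: `HAVE_LINUX_IOPL` and `raise_io_privilege` are made-up names, and the guard assumes a libc (such as glibc) that declares `iopl()` in `<sys/io.h>` on x86 Linux.

```c
/* Only Linux on x86 gets the iopl() path; everything else compiles it out. */
#if defined(__linux__) && (defined(__i386__) || defined(__x86_64__))
#include <sys/io.h>          /* declares iopl() on Linux/x86 */
#define HAVE_LINUX_IOPL 1    /* hypothetical guard macro for this sketch */
#endif

/*
 * Try to raise the I/O privilege level before touching high ports.
 * Returns  0 if iopl(3) succeeded,
 *         -1 if iopl(3) failed (errno is set by the call),
 *          1 if this platform has no iopl() path in this sketch.
 */
int raise_io_privilege(void)
{
#ifdef HAVE_LINUX_IOPL
    return iopl(3) == 0 ? 0 : -1;
#else
    return 1;
#endif
}
```

Raising the I/O privilege level normally requires root (CAP_SYS_RAWIO), so an unprivileged caller should expect -1 here; a detector built this way would treat that like "not detected" and exit non-zero.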
Thanks a lot for the input, Michel. Greetings to Philip. I believe that helped a lot. No need to contact him directly. Also he should have read permissions to this bug report.

Personally I don't care about a segfault being used to detect a VMware virtual machine. Apparently Andreas Jaeger does. I don't know why. What I still don't understand: if the iopl(3) call avoids the segfault, wouldn't it be wrong to re-add this call?

Stefan, I do care because during booting of the system, you see the segfault. And that should not happen...

It turned out that the issue here is not the segfault itself but the error message

udevd-work[14069]: '/usr/bin/vmmouse_detect' unexpected exit with status 0x000b

Could be that the udev rule is wrong. Perhaps we should just change the udev rule.
I tried the following - it gave no error but added the vmmouse:
ACTION=="add|change", ENV{ID_INPUT_MOUSE}=="?*", ATTRS{description}=="i8042 AUX port", RUN{fail_event_on_error}="/usr/bin/vmmouse_detect", ENV{ID_INPUT.tags}="vmmouse"
Kay, could you help us, please?
See bug #610418 for the message I received during boot
Originally it had:
PROGRAM="/usr/bin/vmmouse_detect"
I also wonder why udev reports error 0x000b - while we had exit 1.

Udev reports the raw status bytes, and 0x0b is a segfault. The binary seems broken when it segfaults. And it even tries to fiddle around with signal handlers to catch a segfault. I have really no idea what this thing tries to do here. :)
Right, PROGRAM= should only be used if symlinks need to be named. Otherwise RUN+= should be used, because it runs after all event/device naming processing is done. But note, that you want +=, or you reset all possibly added earlier RUN keys. (It might not make a difference here, now that we have devtmpfs in 11.3, and the kernel has already created the node for us -- this wasn't the case before devtmpfs, and the device was never accessible with PROGRAM=)
ATTRS{} should always have a corresponding match on the subsystem the attribute is coming from, like SUBSYSTEMS=="serio", otherwise we will inefficiently check all parents in the whole sysfs path for this attribute. (This is more about efficiency, and should not make any other difference.)
For ACTION= we usually use !="remove", because stuff should run on any possible event. (That's more a cosmetic thing.)
And can we please check DMI data or anything else that tells us that we run in vmware? Or is that useful for other things too? It's usually bad behavior to unconditionally fork binaries to probe something, which is not needed on any common system. It's like we would just load all kernel modules from /lib/modules just to check if they can find a device to drive. :)
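As an illustration of the DMI idea, here is a minimal sketch that reads a DMI field from sysfs and looks for a VMware vendor string. It is not part of vmmouse_detect; the helper names are hypothetical, and it assumes that `/sys/class/dmi/id/sys_vendor` exists and that VMware guests report a vendor containing "VMware". A vendor match like this only identifies VMware itself, so it would say nothing about other hypervisors that emulate the same port.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if a DMI vendor string looks like VMware, else 0. */
int vendor_is_vmware(const char *vendor)
{
    return vendor != NULL && strstr(vendor, "VMware") != NULL;
}

/*
 * Read the first line of a sysfs DMI file and test it.
 * Returns 1 (looks like VMware), 0 (something else),
 * or -1 if the file cannot be opened or read.
 */
int dmi_says_vmware(const char *path)
{
    char line[128];
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return vendor_is_vmware(line);
}
```

A caller would pass `"/sys/class/dmi/id/sys_vendor"` as the path and fall back to some other check when the result is -1, for example on systems without DMI data.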
> Leonardo, if you fix the Sig11 on a non-VMware virtual machine, the program
> behaves the same as in a VMware virtual machine. But the idea was to segfault
> and then exit(1) instead of exit(0).
I tested this only in a physical machine and the detection seems to work without the SEGV:
# /usr/bin/vmmouse_detect
# echo $?
1
(In reply to comment #22)
> > Leonardo, if you fix the Sig11 on a non-VMware virtual machine, the program
> > behaves the same as in a VMware virtual machine. But the idea was to
> > segfault and then exit(1) instead of exit(0).
>
> I tested this only in a physical machine and the detection seems to work
> without the SEGV:
>
> # /usr/bin/vmmouse_detect
> # echo $?
> 1

You don't see the segfault because there is a signal handler for Sig11. In that signal handler there is the exit(1) call. This is done intentionally.

Guys, I have no idea how to proceed here. Any suggestions from VMware on how to check differently for a VMware virtual machine? If not, I suggest closing the bug again, this time as WONTFIX. Apparently udev cannot properly handle programs which segfault intentionally.

Udev handles programs just fine, as long as they don't try to play games that, by the looks of it, they don't play right. First, this prober must not run on any normal box; it's a waste of time and resources. Things like this must not be done: people try hard to get rid of crap to boot fast, and new stuff like this keeps coming back all the time. And to prevent this warning, it must not return an exit status that says it segfaulted, but one that says it exited with a usual exit code.

Can someone please check whether this tells us if it's a vmware guest?

grep . /sys/class/dmi/id/*

This is a udev trace while vmmouse_detect is called, and it does not look like it exits as expected:
[pid 16701] execve("/usr/bin/vmmouse_detect", ["/usr/bin/vmmouse_detect"], [/* 9 vars */]) = 0
...
[pid 16701] munmap(0x7f639faa3000, 97960) = 0
[pid 16701] rt_sigaction(SIGSEGV, {0x400590, [SEGV], SA_RESTORER|SA_RESTART, 0x7f639f570a60}, {SIG_DFL, [], 0}, 8) = 0
[pid 16701] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 16701 detached
[pid 16700] read(4, "", 1023) = 0
[pid 16700] close(4) = 0
[pid 16700] wait4(16701, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV}], 0, NULL) = 16701
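The `wait4` line in the trace shows what udev saw: the child was terminated by SIGSEGV instead of exiting with a code, and SIGSEGV is signal 11 (0x0b) on Linux. A small sketch of how a parent tells the two cases apart with the standard wait macros follows; `child_status` and `describe_status` are hypothetical helper names, and the fault is simulated with `raise(SIGSEGV)`.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that either dies from SIGSEGV (default action) or exits
 * with `code`. Returns the raw wait status, or -1 if fork/wait failed.
 */
int child_status(int die_by_segv, int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (die_by_segv) {
            struct rlimit no_core = { 0, 0 };
            setrlimit(RLIMIT_CORE, &no_core);  /* avoid leaving a core file */
            signal(SIGSEGV, SIG_DFL);          /* no handler: the signal kills us */
            raise(SIGSEGV);
        }
        _exit(code);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Write a human-readable description of a wait status into buf. */
void describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "other status 0x%04x", (unsigned)status);
}
```

A process whose handler runs and calls exit(1) shows up in the first branch; a process that is actually killed by the signal, as in the trace, shows up in the second, which is the case udev reported as an unexpected exit.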
In my opinion, this bug is really about getting rid of the core files that are being generated on every boot. To do that, we can either put the iopl() call back OR redesign the way we detect a VM guest. I agree that the long-term solution should be to not depend on some binary return code to detect that, but if we don't have the time to do this for 11.3, I suggest just applying the iopl() patch.

I did some tests here and I'd like to share the results:

physical# strace -e trace=iopl,exit_group vmmouse_detect.orig
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
exit_group(1)

physical# strace -e trace=iopl,exit_group vmmouse_detect.iopl
iopl(0x3) = 0
exit_group(1) = ?

kvm-guest# strace -e trace=iopl,exit_group vmmouse_detect.orig
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
exit_group(1) = ?

kvm-guest# strace -e trace=iopl,exit_group vmmouse_detect.iopl
iopl(0x3) = 0
exit_group(0) = ?

vmware-guest# strace -e trace=iopl,exit_group vmmouse_detect.iopl
iopl(0x3) = 0
exit_group(0) = ?

As you can see, vmmouse_detect without the iopl() call also doesn't work on KVM guests. This problem was fixed on Debian with the same patch:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=525039

Ok. I've just (re)added the iopl(3) patch. I verified that vmmouse_detect still exits with return code 1 on a physical machine. So that change shouldn't hurt.
41093 State:new By:sndirsch When:2010-06-04T12:32:44
submit: X11:XOrg/xorg-x11-driver-input -> openSUSE:Factory
Descr: - xf86-input-vmmouse-iopl.diff (reverse applied)
* readd iopl(3) call (bnc #604966)
(In reply to comment #26)
> Can someone please try if that doesn't tell if it's a vmware guest?
> grep . /sys/class/dmi/id/*

Leonardo, any chance to provide this information? I'm asking you, since according to comment #28 you have/had access to a vmware guest.

I'm afraid I'm losing track of exactly what information you'd like to get from me. I think it would be good to focus on the immediate problem(s) here and discuss any further possible improvements upstream.
Another update from Philip:
> I have no objection to putting the iopl() call back in, as long as it's
> properly guarded for Linux (obviously).
>
> Second, Kay's grumbling about using dmi. I actually had a dmi test in there
> originally but I removed it because the qemu/kvm people got sulky.
>
> http://cgit.freedesktop.org/xorg/driver/xf86-input-vmmouse/commit/?id=b29b45a25b3b2db58f81e727d787c337bbd87637
I don't know how to continue here either, but thanks a lot for the additional input from Philip. Very much appreciated.

(In reply to comment #31)
> > Second, Kay's grumbling about using dmi. I actually had a dmi test in there
> > originally but I removed it because the qemu/kvm people got sulky.
> >
> > http://cgit.freedesktop.org/xorg/driver/xf86-input-vmmouse/commit/?id=b29b45a25b3b2db58f81e727d787c337bbd87637

Indeed, I remember that qemu/kvm want to make use of vmmouse as well. Maybe agraf has an idea how to detect a vmware/qemu virtual machine in a different way, i.e. by avoiding a segfault.

FWIW probing the port is the only reliable method to probe for the vmport. So we _have_ to take the segfault. Btw, if intercepting segfaults breaks, the Boehm garbage collector would break too, no?

(In reply to comment #34)
> FWIW probing the port is the only reliable method to probe for the vmport. So
> we _have_ to take the segfault. Btw if intercepting segfaults breaks the boehm
> garbage collector would break too, no?

Thanks for the input, Alex. I couldn't understand the second sentence though. Could you rephrase it? AFAICS this is becoming a WONTFIX after all.

Well, from what I understood, the fundamental issue here is that handling a segfault from within the program doesn't work. Apart from this code relying on it, there is also a garbage collector for C called Boehm. If I remember correctly, that one also relies on handling segfaults itself, so it would break if that doesn't work.

> AFAICS this is becoming a WONTFIX after all.
Why WONTFIX? For me this is already FIXED (request #41093 was accepted).
Maybe it's fixed for you, but likely not for Andreas Jaeger. Or did this message vanish suddenly (duplicate Bug #610418)?

"During bootup I see the following on the console and later in the log file:
May 31 16:11:24 x61s-aj udevd-work[14069]: '/usr/bin/vmmouse_detect' unexpected exit with status 0x000b"

Maybe he was not using the fixed package when he reported that bug?

I don't see it with the current package anymore, so let's mark it as fixed. Thanks! |
I'm seeing these core files around (probably one generated on each boot).

Core was generated by `/usr/bin/vmmouse_detect'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000004007d8 in VMMouseProtoInOut (cmd=0x0) at vmmouse_proto.c:62
62 __asm__ __volatile__(
(gdb) list
57 VMMouseProtoInOut(VMMouseProtoCmd *cmd) // IN/OUT
58 {
59 #ifdef __x86_64__
60 uint64_t dummy;
61
62 __asm__ __volatile__(
63 "pushq %%rax" "\n\t"
64 "movq 40(%%rax), %%rdi" "\n\t"
65 "movq 32(%%rax), %%rsi" "\n\t"
66 "movq 24(%%rax), %%rdx" "\n\t"
(gdb) bt
#0 0x00000000004007d8 in VMMouseProtoInOut (cmd=0x0) at vmmouse_proto.c:62
#1 VMMouseProto_SendCmd (cmd=0x0) at vmmouse_proto.c:146
#2 0x0000000000000000 in ?? ()