Bug 604966

Summary: vmmouse_detect seg faults at vmmouse_proto.c:62
Product: [openSUSE] openSUSE 11.3 Reporter: Leonardo Chiquitto <lchiquitto>
Component: X.Org Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED QA Contact: E-mail List <xorg-maintainer-bugs>
Severity: Normal    
Priority: P3 - Medium CC: agraf, aj, michel
Version: Factory   
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE 11.3   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: vmmouse_detect.core
suggested patch

Description Leonardo Chiquitto 2010-05-11 21:56:37 UTC
I'm seeing these core files around (probably one generated on each boot).

Core was generated by `/usr/bin/vmmouse_detect'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004007d8 in VMMouseProtoInOut (cmd=0x0) at vmmouse_proto.c:62
62	   __asm__ __volatile__(

(gdb) list
57	VMMouseProtoInOut(VMMouseProtoCmd *cmd) // IN/OUT
58	{
59	#ifdef __x86_64__
60	   uint64_t dummy;
61	
62	   __asm__ __volatile__(
63	        "pushq %%rax"           "\n\t"
64	        "movq 40(%%rax), %%rdi" "\n\t"
65	        "movq 32(%%rax), %%rsi" "\n\t"
66	        "movq 24(%%rax), %%rdx" "\n\t"

(gdb) bt
#0  0x00000000004007d8 in VMMouseProtoInOut (cmd=0x0) at vmmouse_proto.c:62
#1  VMMouseProto_SendCmd (cmd=0x0) at vmmouse_proto.c:146
#2  0x0000000000000000 in ?? ()
Comment 1 Leonardo Chiquitto 2010-05-11 21:57:58 UTC
Created attachment 361512 [details]
vmmouse_detect.core
Comment 3 Stefan Dirsch 2010-05-12 07:54:30 UTC
But the vmmouse driver is running?
Comment 4 Leonardo Chiquitto 2010-05-12 14:12:55 UTC
I don't think it is (this is a physical machine and I couldn't find any references in the boot or Xorg logs). Anyway, how can I check if this driver is active?
Comment 5 Stefan Dirsch 2010-05-12 14:31:41 UTC
Look for vmmouse in /var/log/Xorg.0.log (in the guest system).
Comment 6 Leonardo Chiquitto 2010-05-12 14:44:00 UTC
This is a physical machine with no guests running. Also, there are no references to vmmouse in /var/log/Xorg*.

It seems udev [1] is running vmmouse_detect on every boot when it detects an i8042 AUX port.

[1] /lib/udev/rules.d/69-xorg-vmmouse.rules
Comment 7 Stefan Dirsch 2010-05-13 08:24:59 UTC
The segfault is expected and handled accordingly.

xf86-input-vmmouse/tools/vmmouse_detect.c:

[...]
void
segvCB(int sig)
{
#if defined HAVE_XORG_SERVER_1_1_0
   exit(1);
#endif
}


int
main(void)
{
   /*
    * If the vmmouse test is not run in a VMware virtual machine, it
    * will segfault instead of successfully accessing the port.
    */
   signal(SIGSEGV, segvCB);
[...]
Comment 8 Stefan Dirsch 2010-05-31 15:25:38 UTC
*** Bug 610418 has been marked as a duplicate of this bug. ***
Comment 9 Andreas Jaeger 2010-06-01 07:52:47 UTC
It should either not segfault or it should not be run by default.

Currently this is just broken: you get core dumps or segfaults at every system start.
Comment 10 Stefan Dirsch 2010-06-01 08:44:41 UTC
AFAICS if we didn't run it, the vmmouse driver wouldn't be used, since no vmmouse device would be available (see /lib/udev/rules.d/69-xorg-vmmouse.rules). If this is the way to go, please let me know.

Is there a different way to figure out if we're in a VMware virtual machine? I don't know. I guess there is a reason why VMware implemented the detection that way.

Michel (@VMware), any hints would be appreciated here. Is Philip Langdale, the original author of vmmouse_detect, still working at VMware? Could we contact him directly?
Comment 11 Leonardo Chiquitto 2010-06-01 12:10:19 UTC
Created attachment 366092 [details]
suggested patch

Although I can't explain why, this patch resolves the problem for me.
Comment 12 Stefan Dirsch 2010-06-01 12:16:42 UTC
(In reply to comment #11)
> Created an attachment (id=366092) [details]
> suggested patch
> 
> Although I can't explain why, this patch resolves the problem for me.

Subject: Not calling iopl() is triggering an undesired SEGV
References: bnc#604966

  Reverts the following upstream commit:

  commit bcdec3d0cd4434770cd841c33c030e0d7203881f
  Author: Philip Langdale <philipl@fido2.homeip.net>
  Date:   Thu Oct 23 23:35:28 2008 -0700

    Remove call to iopl(). It's not portable and isn't necessary.

diff --git a/tools/vmmouse_detect.c b/tools/vmmouse_detect.c
index e5f14a3..0dd4827 100644
--- a/tools/vmmouse_detect.c
+++ b/tools/vmmouse_detect.c
@@ -47,6 +47,11 @@ main(void)
    signal(SIGSEGV, segvCB);
 
 #if defined __i386__ || defined __x86_64__ 
+   /*
+    * To access i/o ports above 0x3ff, we need to be in iopl(3).
+    */
+
+   iopl(3);
    if (VMMouseClient_Enable()) {
       VMMouseClient_Disable();
       return 0;

And the upstream commit was:

commit bcdec3d0cd4434770cd841c33c030e0d7203881f
Author: Philip Langdale <philipl@fido2.homeip.net>
Date:   Thu Oct 23 23:35:28 2008 -0700

    Remove call to iopl(). It's not portable and isn't necessary.

diff --git a/tools/vmmouse_detect.c b/tools/vmmouse_detect.c
index e5f14a3..0dd4827 100644
--- a/tools/vmmouse_detect.c
+++ b/tools/vmmouse_detect.c
@@ -47,11 +47,6 @@ main(void)
    signal(SIGSEGV, segvCB);
 
 #if defined __i386__ || defined __x86_64__ 
-   /*
-    * To access i/o ports above 0x3ff, we need to be in iopl(3).
-    */
-
-   iopl(3);
    if (VMMouseClient_Enable()) {
       VMMouseClient_Disable();
       return 0;
Comment 13 Leonardo Chiquitto 2010-06-02 00:26:28 UTC
Test packages available in:

http://download.opensuse.org/repositories/home:/leonardocf:/branches:/X11:/XOrg/openSUSE_Factory/
Comment 14 Stefan Dirsch 2010-06-02 08:04:21 UTC
Leonardo, if you fix the Sig11 on a non-VMware virtual machine, the program behaves the same as in a VMware virtual machine. But the idea was to segfault and then exit(1) instead of exit(0). So now the vmmouse ID_INPUT tag is always added to udev, which results in using the vmmouse driver on all systems. This does not make sense to me. Or I did not understand the basics here ...
Comment 15 Michel Dänzer 2010-06-02 08:32:12 UTC
Philip doesn't have an account here but can be reached at plangdale at vmware dot com. Here's his initial response:

> Anyway, the issue here was slightly convoluted. The standard detection
> mechanism we use has always been to do this port-poke, and if it's
> not a VM, you get a segfault - and you do need iopl() set to allow that
> to work. When I published the vmmouse source, I had the iopl() call in
> there and that was an issue for non-Linux operating systems, but I also
> observed that it wasn't really necessary because the X server did iopl()
> itself - not really a surprise. Later on, I added vmmouse_detect to allow
> the HAL/udev based device detection to work, and those are standalone and
> so what the X server does is irrelevant.
> 
> Now, why isn't this an issue for anyone else? It does the detection
> correctly on Ubuntu and other distros (and no one cares about the segfault
> in the failure case). I guess they have funny core handling going on?
> 
> Anyway, it seems the right fix is to add the iopl() call in, perhaps only
> in vmmouse_detect as it's still irrelevant in the X server. It also doesn't
> matter in the X server now as the driver isn't loaded unless the device is
> detected.
> 
> It also needs to be properly guarded for LINUX. Who knows what's up on BSD
> or similar.
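
A guarded iopl() call along the lines Philip describes could look roughly like the following. This is a minimal sketch, not the patch that was applied: raise_io_privilege and the HAVE_IOPL macro are hypothetical names, and the guard assumes that iopl() is only usable on Linux on i386/x86-64 through <sys/io.h>.

```c
#include <errno.h>

#if defined(__linux__) && (defined(__i386__) || defined(__x86_64__))
#include <sys/io.h>
#define HAVE_IOPL 1
#else
#define HAVE_IOPL 0
#endif

/* Try to raise the I/O privilege level so that port access is allowed.
 * Returns 0 on success, or -1 on failure (errno is set, e.g. EPERM when
 * the caller lacks the privilege) or when iopl() is not available on
 * this platform (errno is set to ENOSYS). */
static int raise_io_privilege(void)
{
#if HAVE_IOPL
    return iopl(3);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

On other platforms the function compiles to a stub that reports failure, so the caller can decide whether to attempt the probe at all.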
Comment 16 Stefan Dirsch 2010-06-02 09:21:26 UTC
Thanks a lot for the input, Michel. Greetings to Philip. I believe that helped a lot. No need to contact him directly. Also he should have read permissions to this bugreport. 

Personally I don't care about a segfault to detect a VMWARE virtual machine. Apparently Andreas Jaeger does. I don't know why.

What I still don't understand: if the iopl(3) call avoids the segfault, wouldn't it be wrong to re-add this call?
Comment 17 Andreas Jaeger 2010-06-02 09:46:03 UTC
Stefan, I do care because during booting of the system, you see the segfault. And that should not happen...
Comment 18 Stefan Dirsch 2010-06-02 10:27:38 UTC
It turned out that the issue here is not the segfault but the error message

 udevd-work[14069]: '/usr/bin/vmmouse_detect' unexpected exit with status 0x000b

Could be that the udev rule is wrong.
Comment 19 Andreas Jaeger 2010-06-02 10:33:16 UTC
Perhaps we should just change the udev rule.

I tried the following - it gave no error but added the vmmouse:
ACTION=="add|change", ENV{ID_INPUT_MOUSE}=="?*", ATTRS{description}=="i8042 AUX port", RUN{fail_event_on_error}="/usr/bin/vmmouse_detect", ENV{ID_INPUT.tags}="vmmouse"

Kay, could you help us, please?
See bug #610418 for the message I received during boot

Originally it had:
PROGRAM="/usr/bin/vmmouse_detect"
Comment 20 Andreas Jaeger 2010-06-02 10:34:05 UTC
I also wonder why udev reports error 0x000b - while we had exit 1.
Comment 21 Kay Sievers 2010-06-02 11:24:34 UTC
Udev reports the raw status bytes, and 0x0b is a segfault. The binary seems broken, when it segfaults. And it even tries to fiddle around with signal handlers to catch a segfault. I have really no idea what this thing tries to do here. :)

Right, PROGRAM= should only be used if symlinks need to be named. Otherwise RUN+= should be used, because it runs after all event/device naming processing is done. But note, that you want +=, or you reset all possibly added earlier RUN keys. (It might not make a difference here, now that we have devtmpfs in 11.3, and the kernel has already created the node for us -- this wasn't the case before devtmpfs, and the device was never accessible with PROGRAM=)

ATTRS{} should always have a corresponding match on the subsystem the attribute is coming from, like SUBSYSTEMS=="serio", otherwise we will inefficiently check all parents in the whole sysfs path for this attribute. (This is more about efficiency, and should not make any other difference.)

For ACTION= we usually use !="remove", because stuff should run on any possible event. (That's more a cosmetic thing.)

And can we please check DMI data or anything else that tells us that we run in vmware? Or is that useful for other things too? It's usually bad behavior to unconditionally fork binaries to probe something, which is not needed on any common system. It's like we would just load all kernel modules from /lib/modules just to check if they can find a device to drive. :)
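
The difference between the reported 0x000b and an exit(1) can be made concrete by decoding a raw wait status with the standard macros from <sys/wait.h>. The sketch below is illustrative only: describe_status is a hypothetical helper, and the numeric values in the note after it assume the Linux encoding of the wait status.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

/* Render a raw wait status (as filled in by waitpid()) as text.
 * Returns buf for convenience. */
static const char *describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "other status 0x%04x", (unsigned)status);
    return buf;
}
```

On Linux, a raw status of 0x000b decodes as "killed by signal 11" (SIGSEGV), whereas a process that called exit(1) would produce 0x0100, which decodes as "exited with code 1".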
Comment 22 Leonardo Chiquitto 2010-06-02 11:37:52 UTC
> Leonardo, if you fix the Sig11 on a non-VMware virtual machine, the program
> behaves the same as in a VMware virtual machine. But the idea was to segfault
> and then exit(1) instead of exit(0).

I tested this only in a physical machine and the detection seems to work without the SEGV:

  # /usr/bin/vmmouse_detect
  # echo $?
  1
Comment 23 Stefan Dirsch 2010-06-02 19:29:55 UTC
(In reply to comment #22)
> > Leonardo, if you fix the Sig11 on a non-VMware virtual machine, the program
> > behaves the same as in a VMware virtual machine. But the idea was to 
> > segfault and then exit(1) instead of exit(0).
> 
> I tested this only in a physical machine and the detection seems to work
> without the SEGV:
> 
>   # /usr/bin/vmmouse_detect
>   # echo $?
>   1

You don't see the segfault because there is a signal handler for SIGSEGV (signal 11). That signal handler calls exit(1). This is intentional.
Comment 24 Stefan Dirsch 2010-06-02 19:46:34 UTC
Guys, I have no idea how to proceed here. Does VMware have any suggestions for a different way to check for a VMware virtual machine? If not, I suggest closing the bug again, this time as WONTFIX. Apparently udev cannot properly handle programs that segfault intentionally.
Comment 25 Kay Sievers 2010-06-02 19:54:34 UTC
Udev handles programs just fine, as long as they don't try to play games which, as it looks like, they don't play right.

First, this prober must not run on any normal box, it's a waste of time and resources. Things like this must not be done, people try hard to get rid of crap to boot fast, and new stuff like this is coming back all the time.

And to prevent this warning, it must not return an exit status that says it segfaulted, but one that says it exited with a usual exit code.
Comment 26 Kay Sievers 2010-06-02 20:05:38 UTC
Can someone please try if that doesn't tell if it's a vmware guest?
  grep . /sys/class/dmi/id/*
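
A programmatic version of that DMI check might look like the sketch below. It is an assumption-laden illustration, not code from the driver: read_first_line, vendor_is_vmware and dmi_says_vmware are hypothetical helpers, and the sketch assumes that a VMware guest reports a sys_vendor string beginning with "VMware".

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a sysfs-style file into buf.
 * Returns 0 on success, -1 if the file cannot be read. */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}

/* Assumption: VMware guests report a vendor string starting with "VMware". */
static int vendor_is_vmware(const char *vendor)
{
    return strncmp(vendor, "VMware", 6) == 0;
}

/* Returns 1 if DMI names VMware as the system vendor, 0 if it names
 * something else, -1 if the DMI data is unavailable. */
static int dmi_says_vmware(void)
{
    char vendor[128];

    if (read_first_line("/sys/class/dmi/id/sys_vendor", vendor, sizeof vendor) < 0)
        return -1;
    return vendor_is_vmware(vendor);
}
```

A check like this only identifies the reported vendor; it says nothing about whether the vmmouse port itself is present, so other hypervisors that emulate the port would not be recognized.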
Comment 27 Kay Sievers 2010-06-02 23:41:42 UTC
This is a udev trace while vmmouse_detect is called, and it does not look like it exits as expected:


[pid 16701] execve("/usr/bin/vmmouse_detect", ["/usr/bin/vmmouse_detect"], [/* 9 vars */]) = 0
...
[pid 16701] munmap(0x7f639faa3000, 97960) = 0
[pid 16701] rt_sigaction(SIGSEGV, {0x400590, [SEGV], SA_RESTORER|SA_RESTART, 0x7f639f570a60}, {SIG_DFL, [], 0}, 8) = 0
[pid 16701] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 16701 detached
[pid 16700] read(4, "", 1023)           = 0
[pid 16700] close(4)                    = 0
[pid 16700] wait4(16701, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV}], 0, NULL) = 16701
Comment 28 Leonardo Chiquitto 2010-06-03 00:04:10 UTC
In my opinion, this bug is really about getting rid of the core files that are being generated on every boot. To do that, we can either put the iopl() call back OR redesign the way we detect a VM guest. I agree that the long-term solution should be to not depend on some binary's return code to detect that, but if we don't have the time to do this for 11.3, I suggest just applying the iopl() patch.

I did some tests here and I'd like to share the results:

physical# strace -e trace=iopl,exit_group vmmouse_detect.orig
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
exit_group(1)

physical# strace -e trace=iopl,exit_group vmmouse_detect.iopl
iopl(0x3)                               = 0
exit_group(1)                           = ?

kvm-guest# strace -e trace=iopl,exit_group vmmouse_detect.orig
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
exit_group(1)                           = ?

kvm-guest# strace -e trace=iopl,exit_group vmmouse_detect.iopl
iopl(0x3)                               = 0
exit_group(0)                           = ?

vmware-guest# strace -e trace=iopl,exit_group vmmouse_detect.iopl
iopl(0x3)                               = 0
exit_group(0)                           = ?

As you can see, vmmouse_detect without the iopl() call also doesn't work on KVM guests. This problem was fixed on Debian with the same patch:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=525039
Comment 29 Stefan Dirsch 2010-06-04 10:34:06 UTC
Ok. I've just (re)added the iopl(3) patch. I verified that vmmouse_detect still exits with return code 1 on a physical machine. So that change shouldn't hurt.

41093  State:new     By:sndirsch     When:2010-06-04T12:32:44
        submit:       X11:XOrg/xorg-x11-driver-input  ->  openSUSE:Factory       
        Descr: - xf86-input-vmmouse-iopl.diff (reverse applied)
                 * readd iopl(3) call (bnc #604966)
Comment 30 Stefan Dirsch 2010-06-04 10:38:30 UTC
(In reply to comment #26)
> Can someone please try if that doesn't tell if it's a vmware guest?
>   grep . /sys/class/dmi/id/*

Leonardo, any chance to provide this information? I'm asking you, since according to comment #28 you have/had access to a vmware guest.
Comment 31 Michel Dänzer 2010-06-04 17:14:37 UTC
I'm afraid I'm losing track of exactly what information you'd like to get from me. I think it would be good to focus on the immediate problem(s) here and discuss any further possible improvements upstream.

Another update from Philip:

> I have no objection to putting the iopl() call back in, as long as it's
> properly guarded for Linux (obviously). 
> 
> Second, Kay's grumbling about using dmi. I actually had a dmi test in there
> originally but I removed it because the qemu/kvm people got sulky.
> 
> http://cgit.freedesktop.org/xorg/driver/xf86-input-vmmouse/commit/?id=b29b45a25b3b2db58f81e727d787c337bbd87637
Comment 32 Stefan Dirsch 2010-06-04 18:58:05 UTC
I don't know how to continue here either, but thanks a lot for the additional input by Philip. Very much appreciated.
Comment 33 Stefan Dirsch 2010-06-05 09:09:44 UTC
(In reply to comment #31)
> > Second, Kay's grumbling about using dmi. I actually had a dmi test in there
> > originally but I removed it because the qemu/kvm people got sulky.
> > 
> > http://cgit.freedesktop.org/xorg/driver/xf86-input-vmmouse/commit/?id=b29b45a25b3b2db58f81e727d787c337bbd87637

Indeed, I remember that qemu/kvm want to make use of vmmouse as well. Maybe agraf has an idea how to detect a vmware/qemu virtual machine in a different way, i.e. by avoiding a segfault.
Comment 34 Alexander Graf 2010-06-05 14:29:32 UTC
FWIW probing the port is the only reliable method to probe for the vmport. So we _have_ to take the segfault. Btw if intercepting segfaults breaks the boehm garbage collector would break too, no?
Comment 35 Stefan Dirsch 2010-06-11 12:41:50 UTC
(In reply to comment #34)
> FWIW probing the port is the only reliable method to probe for the vmport. So
> we _have_ to take the segfault. Btw if intercepting segfaults breaks the boehm
> garbage collector would break too, no?

Thanks for the input, Alex. I couldn't understand the second sentence though. Could you rephrase it? 

AFAICS this is becoming a WONTFIX after all.
Comment 36 Alexander Graf 2010-06-11 12:52:24 UTC
Well from what I understood the fundamental issue here is that handling a segfault from within the program doesn't work.

Apart from this code relying on it, there is also a garbage collector for C called boehm. If I remember correctly, that one also relies on handling segfaults itself, so it would break if that doesn't work.
Comment 37 Leonardo Chiquitto 2010-06-11 12:57:02 UTC
> AFAICS this is becoming a WONTFIX after all.

Why WONTFIX? For me this is already FIXED (request #41093 was accepted).
Comment 38 Stefan Dirsch 2010-06-11 13:37:13 UTC
Maybe it's fixed for you, but likely not for Andreas Jaeger. Or did this message vanish suddenly (duplicate Bug #610418)?

"During bootup I see the following on the console and later in the log file:
May 31 16:11:24 x61s-aj udevd-work[14069]: '/usr/bin/vmmouse_detect' unexpected
exit with status 0x000b"
Comment 39 Leonardo Chiquitto 2010-06-11 14:38:03 UTC
Maybe he was not using the fixed package when he reported that bug?
Comment 40 Andreas Jaeger 2010-06-14 09:53:57 UTC
I don't see it with current package anymore, so let's mark it as fixed. Thanks!