Bug 85416

Summary: Compatibility issue between SCIM gtkimmodule and gtk applications which are linked to old compat libstdc++ libraries.
Product: [openSUSE] SUSE LINUX 10.0 Reporter: Zhe Su <zsu>
Component: Basesystem Assignee: Mike Fabian <mfabian>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Critical    
Priority: P5 - None CC: aj, rguenther
Version: Stable GCC Snapshot1   
Target Milestone: ---   
Hardware: Other   
OS: All   
Whiteboard:
Found By: Other Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: Source code of the test app.
Binary of the test app, compiled by g++3.3.5
Valgrind memcheck output of entry on Snapshot2.
Symbols which without version-script.
Symbols which has version file applied.
New symbol list of libscim, which doesn't have all WEAK std symbols.
New symbol list of im-scim, without all WEAK std symbols.
Valgrind memcheck output of acroread on Snapshot2.

Description Zhe Su 2005-05-23 03:55:25 UTC
scim is written in C++, so it's linked to libstdc++.so.6 in SUSE 10. But some
commercial applications in SUSE 10, which use the gtk2 widget library, are linked
to libstdc++.so.5, for example:

acroread 7.0
realplay 10.0

This prevents acroread and realplay from starting if the scim gtkimmodule is used.

The result and backtrace of realplay is:

*** glibc detected *** double free or corruption (out): 0x08335a08 ***

Program received signal SIGABRT, Aborted.
0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0  0xffffe410 in __kernel_vsyscall ()
#1  0x405e22a1 in raise () from /lib/tls/libc.so.6
#2  0x405e3a8b in abort () from /lib/tls/libc.so.6
#3  0x40617755 in __fsetlocking () from /lib/tls/libc.so.6
#4  0x4061d622 in malloc_usable_size () from /lib/tls/libc.so.6
#5  0x4061e044 in free () from /lib/tls/libc.so.6
#6  0x4125ab71 in operator delete () from /usr/lib/libstdc++.so.6
#7  0x41237a7d in std::string::_Rep::_M_destroy () from /usr/lib/libstdc++.so.6
#8  0x41238c08 in std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::~basic_string ()
   from /usr/lib/libstdc++.so.6
#9  0x438bbaa0 in scim::Module::~Module () from /usr/lib/libscim-1.0.so.6
#10 0x438bbe17 in scim::scim_get_module_list () from /usr/lib/libscim-1.0.so.6
#11 0x438b7be5 in scim::scim_get_imengine_module_list () from
/usr/lib/libscim-1.0.so.6
#12 0x438438cb in gtk_im_context_scim_shutdown () from
/opt/gnome/lib/gtk-2.0/immodules/im-scim.so
#13 0x40232b5b in g_type_class_ref () from /opt/gnome/lib/libgobject-2.0.so.0
#14 0x402191c8 in g_object_newv () from /opt/gnome/lib/libgobject-2.0.so.0
#15 0x402195c7 in g_object_new_valist () from /opt/gnome/lib/libgobject-2.0.so.0
#16 0x40219780 in g_object_new () from /opt/gnome/lib/libgobject-2.0.so.0
#17 0x4383a474 in gtk_im_context_scim_new () from
/opt/gnome/lib/gtk-2.0/immodules/im-scim.so
#18 0x4384868c in im_module_create () from
/opt/gnome/lib/gtk-2.0/immodules/im-scim.so
#19 0x403d7dbb in gtk_im_context_simple_add_table () from
/opt/gnome/lib/libgtk-x11-2.0.so.0
#20 0x403d85c2 in gtk_im_multicontext_new () from /opt/gnome/lib/libgtk-x11-2.0.so.0
#21 0x403d8837 in gtk_im_multicontext_new () from /opt/gnome/lib/libgtk-x11-2.0.so.0
#22 0x403d6601 in gtk_im_context_set_cursor_location () from
/opt/gnome/lib/libgtk-x11-2.0.so.0
#23 0x4048a3a2 in gtk_text_view_get_default_attributes () from
/opt/gnome/lib/libgtk-x11-2.0.so.0
#24 0x4048b44c in gtk_text_view_get_default_attributes () from
/opt/gnome/lib/libgtk-x11-2.0.so.0
#25 0x403f6b3b in gtk_marshal_VOID__UINT_STRING () from
/opt/gnome/lib/libgtk-x11-2.0.so.0
#26 0x40212b47 in g_cclosure_new_swap () from /opt/gnome/lib/libgobject-2.0.so.0
#27 0x402130dc in g_closure_invoke () from /opt/gnome/lib/libgobject-2.0.so.0


Recompiling those apps with gcc 4 or forcing them to use XIM would solve this issue.
But I think it's better to solve it in scim gtkimmodule, so that all other
similar apps would be ok as well.

Do you have any idea?
Comment 1 Michael Matz 2005-05-24 12:50:34 UTC
Not really.  As you noticed you can't really mix two different libstdc++ 
in one process.  It would maybe work if one is linked statically.  I also 
see only some of the sublibraries of acroread7 being dynamically linked 
against libstdc++.so.5, e.g. the main executable is (and most of the other 
libs are) not.  So, are you sure that this is indeed a problem of mixing 
libstdc++, and not some unrelated bug in scim? 
 
The only other solution I can think of, apart from rebuilding the 
application to link against the right libstdc++ (acrobat will have to do 
this at some point anyway), is to provide two scim gtkimmodules, one linked 
against libstdc++.so.5 (i.e. from an old SuSE version).  But actually I don't 
think this is very feasible. 
Comment 2 Mike Fabian 2005-05-24 15:01:58 UTC
Forcing acroread to use XIM is rather easy because it is started
with a script where we could set GTK_IM_MODULE=xim.

Comment 3 Zhe Su 2005-05-25 02:39:50 UTC
Forcing these apps to use XIM is the easiest way to work around this issue. But
we can't know whether there are other apps which have a similar issue.

As you can see, the app always crashes in std::basic_string; it has nothing to do
with scim code.

But according to the ABI doc of libstdc++ 6, one application can link to two
libraries which are linked against libstdc++.so.5 and libstdc++.so.6 respectively. See:

file:///usr/share/doc/packages/libstdc++-devel-mainline/html/abi.html
...
 Testing Multi-ABI binaries

A "C" application, dynamically linked to two shared libraries, liba, libb. The
dependent library liba is C++ shared library compiled with gcc-3.3.x, and uses
io, exceptions, locale, etc. The dependent library libb is a C++ shared library
compiled with gcc-3.4.x, and also uses io, exceptions, locale, etc.

As above, libone is constructed as follows:

%$bld/H-x86-gcc-3.4.0/bin/g++ -fPIC -DPIC -c a.cc

%$bld/H-x86-gcc-3.4.0/bin/g++ -shared -Wl,-soname -Wl,libone.so.1 -Wl,-O1
-Wl,-z,defs a.o -o libone.so.1.0.0

%ln -s libone.so.1.0.0 libone.so

%$bld/H-x86-gcc-3.4.0/bin/g++ -c a.cc

%ar cru libone.a a.o 

And, libtwo is constructed as follows:

%$bld/H-x86-gcc-3.3.3/bin/g++ -fPIC -DPIC -c b.cc

%$bld/H-x86-gcc-3.3.3/bin/g++ -shared -Wl,-soname -Wl,libtwo.so.1 -Wl,-O1
-Wl,-z,defs b.o -o libtwo.so.1.0.0

%ln -s libtwo.so.1.0.0 libtwo.so

%$bld/H-x86-gcc-3.3.3/bin/g++ -c b.cc

%ar cru libtwo.a b.o 

...with the resulting libraries looking like

%ldd libone.so.1.0.0
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x40016000)
        libm.so.6 => /lib/tls/libm.so.6 (0x400fa000)
        libgcc_s.so.1 => /mnt/hd/bld/gcc/gcc/libgcc_s.so.1 (0x4011c000)
        libc.so.6 => /lib/tls/libc.so.6 (0x40125000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00355000)

%ldd libtwo.so.1.0.0
        libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x40027000)
        libm.so.6 => /lib/tls/libm.so.6 (0x400e1000)
        libgcc_s.so.1 => /mnt/hd/bld/gcc/gcc/libgcc_s.so.1 (0x40103000)
        libc.so.6 => /lib/tls/libc.so.6 (0x4010c000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00355000)


Then, the "C" compiler is used to compile a source file that uses functions from
each library.

gcc test.c -g -O2 -L. -lone -ltwo /usr/lib/libstdc++.so.5 /usr/lib/libstdc++.so.6

Which gives the expected:

%ldd a.out
        libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x00764000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x40015000)
        libc.so.6 => /lib/tls/libc.so.6 (0x0036d000)
        libm.so.6 => /lib/tls/libm.so.6 (0x004a8000)
        libgcc_s.so.1 => /mnt/hd/bld/gcc/gcc/libgcc_s.so.1 (0x400e5000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00355000)

This resulting binary, when executed, will be able to safely use code from both
liba, and the dependent libstdc++.so.6, and libb, with the dependent
libstdc++.so.5. 
Comment 4 Michael Matz 2005-05-25 15:08:47 UTC
Note that this example is about a C (not C++) main program.  And even then 
it only works if there is _no_ communication using C++ structures 
between the two libraries.  For instance passing a C++ string from one 
lib to the other will not work. 
 
Having said that, the backtrace indeed looks a bit like it should work, that's 
why I asked if you are sure that it's not just a simple bug in the SCIM 
gtkimmodule.  The scim::Module::~Module dtor seems to destruct some 
std::string, and that destruction seems to be confused.  But I only see 
libstdc++.so.6 symbols in the backtrace, so if the string was also 
_constructed_ with libstdc++.so.6 means (which we can't know from the 
backtrace), then it's not libstdc++.so.5 which confuses everything, but 
something else. 
 
Perhaps you can debug this some more, when installing the source of libstdc++ 
and the scim gtkimmodule. 
Comment 5 Zhe Su 2005-07-04 07:51:11 UTC
Hi, I found that there is no such issue in qt applications when used with
the scim-qtimm cvs head version (qtimmodule).

I tried to run a qt3 application, which was compiled on SL9.3, on stable
snapshot2 (gcc4 system); it runs and accepts input without any problem.

So why doesn't this compatibility issue occur in qt3 (C++) apps?
Comment 6 Michael Matz 2005-07-04 08:20:54 UTC
Could you reformulate your findings?  I fear I don't understand what you 
are saying.  To me it reads as if: 
1) no problem with qt3 app, when using scim-qtimm CVS head 
2) no problem with qt3 app from 9.3 on gcc4 system. 
3) are you now asking why this is so?  I.e. why there is no problem 
   with the qt3 app (in comparison with the gtk2 app)? 
 
I can't know this.  You will have to debug the problem in the gtk2 one 
then.  But I can speculate that the problem there is, that it is 
somehow passing an object constructed in the code using libstdc++.so.5 (the 
main app) to code using libstdc++.so.6 (the plugin), which then fails. 
 
And the qt3 case would work, because the above does not happen, i.e. 
the app is not passing objects to the plugin, only plain old data types. 
 
But this is just speculation, you really have to debug yourself.  I don't 
have the apps you are talking about, and this is an unsupported scenario 
(using two libstdc++ in one image, one coming from a plugin).  If you want 
it to work you have to hack the necessary magic yourself, I only can be 
a general guide. 
Comment 7 Zhe Su 2005-07-04 08:42:03 UTC
I mean both 1) and 2).

And I promise that there is definitely no object exchange between the
libstdc++.so.5 part and the libstdc++.so.6 part. Only plain data is exchanged.

The code of the scim qtimmodule and gtkimmodule is very similar. So if a gtk2 app
crashes because of such an issue, a qt3 app must crash as well.

And I found that some gtk2 apps always crash at operator+ of the std::basic_string
class. So I tried to avoid using operator+ in scim; with that, some apps (one of my
test apps and realplay) stop crashing and work without any problem.

But acroread still crashes at another point (destructor of std::vector).

So I think there must be a bug in libstdc++.so.5 or libstdc++.so.6 or gtk2 which
corrupts the memory in this situation.

Comment 8 Michael Matz 2005-07-04 09:05:06 UTC
Okay, so the situation is, that qtimmodule works, while gtkimmodule does not, 
correct?  Why do you think this would be a bug in libstdc++ then?  It sounds 
more like a problem in gtkimmodule to me then. 
 
Note that just because this fragile thing works with a qt3 app, this does 
not mean, that it must also work in some other situation.  It could be just 
pure luck.  It still might be a bug in gtkimmodule, or anything it uses, 
but it also could be luck in the qtimmodule part. 
 
Have you tried running this all under valgrind?  It could point out some 
errors like double-frees or overwriting memory, or accessing uninitialized 
memory. 
 
You will also want to create a small self-contained testcase if anyone 
is to help you.  But up to now I have not seen any evidence pointing to 
a problem in libstdc++ itself. 
Comment 9 Zhe Su 2005-07-04 09:26:36 UTC
Ok thank you. I'll try valgrind.
Comment 10 Zhe Su 2005-07-04 10:34:19 UTC
Hi,
  I tried valgrind with my test app, and found a very weird issue.
libstdc++.so.5 was used instead of libstdc++.so.6 when initializing some
objects. That's the reason why it crashed.
  The test app source and binary as well as the valgrind memcheck output will be
attached.
  It's a very simple gtk2 program without any C++ code in it. It was just
compiled and linked by g++ 3.3.5, in order to link libstdc++.so.5 against it.
  Meanwhile, all scim code was compiled and linked on Snapshot 2 by g++4.


Comment 11 Zhe Su 2005-07-04 10:35:55 UTC
Created attachment 41014 [details]
Source code of the test app.

It's the source code of the test app. Though the suffix is .cpp, it's a pure C
program indeed.
Comment 12 Zhe Su 2005-07-04 10:37:29 UTC
Created attachment 41017 [details]
Binary of the test app, compiled by g++3.3.5

It's generated by the following commands:
g++ -c -O0 -g -o entry.o `pkg-config --cflags gtk+-2.0` entry.cpp 
g++ -O0 -g -o entry `pkg-config --libs gtk+-2.0` entry.o
Comment 13 Zhe Su 2005-07-04 10:41:00 UTC
Created attachment 41018 [details]
Valgrind memcheck output of entry on Snapshot2.

You may see that some objects were initialized by libstdc++.so.5 but freed by
libstdc++.so.6. But libstdc++.so.5 should not be used here.
Comment 14 Michael Matz 2005-07-04 12:18:36 UTC
How do I have to run the executable in my environment to see the crash? 
I have installed scim.rpm now, but just calling the program does not seem 
to load im-scim.so.  I've also tried 
% GTK_IM_MODULE=scim ./entry 
% GTK_IM_MODULE=im-scim ./entry 
 
but that also doesn't load it.  So how can I reproduce it? 
Comment 15 Zhe Su 2005-07-04 12:24:42 UTC
Hi,
  Please make sure that /etc/opt/gnome/gtk-2.0/gtk.immodules has an entry for
scim. You may call SuSEconfig to update it. Then GTK_IM_MODULE=scim ./entry
should work.
Comment 16 Michael Matz 2005-07-04 13:27:29 UTC
Thanks, I can reproduce it now.  It's a real case that mixing two  
libstdc++ in one executable doesn't work.  The problem is the symbol  
_ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_ also known as  
"std::basic_string<char, std::char_traits<char>, std::allocator<char> >  
std::operator+<char, std::char_traits<char>, std::allocator<char>  
>(std::basic_string<char, std::char_traits<char>, std::allocator<char> >  
const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >  
const&)", i.e. "operator +" for strings (like you already noticed it creates  
problems).  
  
It is defined in libstdc++.so.5 like so:  
  _ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_@@GLIBCPP_3.2.1  
and in libstdc++.so.6 like so:  
  _ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_@@GLIBCXX_3.4  
i.e. with different versions.  So all is fine until here.  
  
_But_ this operator is implemented in an STL header, and also instantiated  
in the sources using these operators.  In this case it's instantiated in  
scim_global_config.cpp:137 .  These instantiations lead to weak definitions  
of those symbols, which in the end results in this weak definition in the  
im-scim.so DSO:  
  WEAK DEFAULT 10 _ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_  
  
Note how this definition does _not_ have a version associated with it.  
This is so because there is no linker symbol version file given when  
linking im-scim.so.  Of course not, as this symbol is not really provided  
by im-scim.so.  Instead its existence is only an artifact of how g++ implements  
template instantiations.  
  
Now what further happens is that the ELF symbol resolution rules take  
place.  A weak symbol is overridden by a non-weak definition; if there is  
none, then one of the weak definitions is chosen, in fact the first one  
in ELF symbol search order, which is a breadth-first search of symbols  
over all loaded DSOs.  
  
libstdc++.so.5 is loaded before libstdc++.so.6 (which is only included  
later due to im-scim.so).  As the weak definition in im-scim.so does not  
contain a version it will be resolved to the first matching symbol of any  
version, and that is the one in libstdc++.so.5 .  
  
I.e. the wrong one will be chosen, because there is a weak definition of  
that symbol in im-scim.so.  If there had been _no_ definition, then during  
linking ld would have noted the version "GLIBCXX_3.4" for this undefined  
reference (as it was linked against libstdc++.so.6, which only provides  
this version).  Then at runtime this exact version would be searched,  
and the correct one in libstdc++.so.6 would have been found.  
  
This is what I meant with that two different versions of libstdc++ are not  
really supported in one process image.  Even if it happens to work under  
some circumstances, it might not in others.  For instance if the compiler  
had chosen not to instantiate this "operator+" (perhaps by magically knowing  
that it will be provided by a library) all would have worked by luck.  
  
There is one hack one might try: Forcefully hide this symbol in im-scim.so.  
Then all references from within im-scim.so will be bound to this local  
(hidden) version of that symbol, which is the correct one, and it will not  
be exported, avoiding the problem of this weak definition.  
  
To do that, the cleanest way would be to generate a symbol export file,  
in which you declare all symbols which are to be exported from im-scim.so  
explicitly, and make all other symbols local.  This will probably work around  
this problem.  Note that this is merely a hack.  It's good to explicitly  
define what is part of the API anyway, but using it to avoid this bug is a  
hack.  
  
See documentation of ld, --version-script option.  A version file looks like 
so: 
 
IM_SCIM_1.0 { 
  global: 
    extern "C++" { 
      MyClass::a*; 
    }; 
    some_c_symbol; 
    some_c_glob*; 
  local: 
    *; 
}; 
 
Note that you probably also need to export some mangled C++ symbols 
if you export classes, because of the VTable and RTTI symbols. 
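
The lookup described in this comment can be pictured with a toy model. The Python sketch below is not glibc's algorithm: it assumes a flat search order in which libstdc++.so.5 comes before im-scim.so and libstdc++.so.6, ignores the weak/strong distinction, and treats an unversioned reference as matching the first definition of any version. The Definition class and the lookup function are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Definition:
    dso: str                 # name of the shared object providing the symbol
    name: str                # mangled symbol name
    version: Optional[str]   # None means the definition carries no version


def lookup(search_order, name, wanted_version=None):
    """Return the first matching definition in search order (toy model).

    A reference recorded with a version binds only to that version.
    An unversioned reference binds to the first definition found.
    """
    for dso in search_order:
        for definition in dso:
            if definition.name != name:
                continue
            if wanted_version is None or definition.version == wanted_version:
                return definition
    return None


OP_PLUS = "_ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_"

libstdcxx5 = [Definition("libstdc++.so.5", OP_PLUS, "GLIBCPP_3.2.1")]
im_scim = [Definition("im-scim.so", OP_PLUS, None)]
libstdcxx6 = [Definition("libstdc++.so.6", OP_PLUS, "GLIBCXX_3.4")]

# Assumed order: the application's libstdc++.so.5 is already loaded when the
# plugin and its own dependencies are added.
search_order = [libstdcxx5, im_scim, libstdcxx6]

# Reference left unversioned because im-scim.so carries its own weak definition.
print(lookup(search_order, OP_PLUS).dso)
# Reference that the linker would have tagged with GLIBCXX_3.4.
print(lookup(search_order, OP_PLUS, "GLIBCXX_3.4").dso)
```

In this model the unversioned reference lands in libstdc++.so.5 and the versioned one in libstdc++.so.6, which is the difference the comment attributes to the missing version tag.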
Comment 17 Zhe Su 2005-07-04 15:25:58 UTC
Could you please give me some more detail about this solution? I don't
understand it very well. Do you mean that I should use a local version of
operator+ rather than the external version provided by libstdc++?
Not only operator+ has this issue, but also others. I did try to avoid using
operator+, realplay works, but acroread still crashes on some other symbol.
So how can I know all of such symbols and make them into local scope?
Comment 18 Zhe Su 2005-07-04 15:27:52 UTC
And in qt3 case, libstdc++.so.6 would be loaded first because qt3 itself is
linked against it. So these weak symbols will be resolved correctly. Is it right?
Comment 19 Zhe Su 2005-07-04 15:38:09 UTC
And how about libscim-1.0.so? im-scim.so is linked against it, and those weak
symbols are used within this library indeed.
Should I apply this hack to libscim instead of im-scim.so?
Comment 20 Michael Matz 2005-07-04 15:46:50 UTC
If g++ instantiates template methods it creates an implementation of it 
in the currently compiled object (im-scim.so in this case) as weak symbol, 
i.e. it creates a real exported function for that instantiation.  It is 
also instantiated in libstdc++.so.[56] (under different versions, though, 
and with different code, expecting a different layout of the objects they 
work on).  Due to ELF symbol resolution you will get the wrong one for all 
references to it inside im-scim.so, which are still unresolved at start time. 
 
So, the only hack around this is to create no unresolved references from 
inside im-scim.so.  In other words all references to that symbol must 
already be bound at link time.  The only definition to which it _can_ be 
bound at that time, is the symbol defined in im-scim.so itself (i.e. 
the instantiation created by g++ in im-scim.so). 
 
Another point is, that we don't want this symbol to be exported from 
im-scim.so at all.  It shouldn't provide it, as it doesn't belong to 
its API.  Hiding this symbol (making it "local") solves this problem (it 
won't be exported anymore), and also solves the issue above, the symbol 
binding to the self-defined version.  This is due to ELF semantics. 
During linking references to local hidden symbols are resolved directly 
to that symbol, and no external reference is left, which is what we want. 
 
So, the goal is basically to localize all symbols which are created just 
as artifacts, and which don't belong to the ABI.  We don't have to know 
the exact symbol names for this, as long as we know the exact names 
of those symbols which we _do_ want to export (i.e. your API).  Then 
you create a linker script of the format from comment #16, listing 
all symbols you wanted to export under the "global:" part.  Note how symbol 
names can be globs (i.e. contain '*' matching an arbitrary number of 
characters).  Entries are read in order. 
 
So adding "local: *;" at the end means, "hide all other symbols not matched 
until now". 
 
To see which symbols there are you can use 'readelf -Ws im-scim.so'. 
Pipe it through c++filt to see the unmangled names.  For the symbols 
representing "vtable for ..." or "typeinfo ... for ..." you will want 
to list the mangled name in the linker script.  For the normal 
functions and members it's probably enough to just list "scim::*" under 
the global/C++ section of the linker file. 
 
By this you will only export an exact set of symbols instead of all of them. 
And at the same time you will internalize all references to now hidden 
symbols, so that the definition from libstdc++.so.5 is not used at run time. 
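
As a rough aid for the readelf step, the following Python sketch scans the text printed by 'readelf -Ws' (without c++filt) and reports defined, exported symbols whose mangled names look like std:: or __gnu_cxx:: entities. The prefix list is a heuristic based on the Itanium C++ ABI mangling and is not exhaustive; the function name and the sample dump are invented for illustration.

```python
# Mangled-name prefixes that usually indicate std:: or __gnu_cxx:: entities.
# Heuristic only: "St" is std::, "Ss"/"Sa"/"Sb" are the string, allocator and
# basic_string abbreviations of the Itanium C++ ABI.
STD_PREFIXES = (
    "_ZSt", "_ZNSt", "_ZNKSt",
    "_ZNSs", "_ZNKSs", "_ZNSa", "_ZNSb", "_ZNKSb",
    "_ZN9__gnu_cxx", "_ZNK9__gnu_cxx",
)


def exported_std_symbols(readelf_text):
    """Return (binding, name) for defined GLOBAL/WEAK std-like symbols."""
    found = []
    for line in readelf_text.splitlines():
        # Columns: Num: Value Size Type Bind Vis Ndx Name
        parts = line.split(None, 7)
        if len(parts) != 8 or not parts[0].rstrip(":").isdigit():
            continue  # header lines and entries without a name
        _num, _value, _size, _type, bind, vis, ndx, name = parts
        if bind not in ("GLOBAL", "WEAK") or vis != "DEFAULT" or ndx == "UND":
            continue  # local, hidden, or merely referenced symbols
        if name.startswith(STD_PREFIXES):
            found.append((bind, name.strip()))
    return found


# Invented sample in the layout printed by 'readelf -Ws'.
SAMPLE = """\
Symbol table '.dynsym' contains 4 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00001a30   120 FUNC    WEAK   DEFAULT   10 _ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_
     2: 00001b00    64 FUNC    GLOBAL DEFAULT   10 im_module_init
     3: 00000000     0 FUNC    GLOBAL DEFAULT  UND _ZNSsC1EPKcRKSaIcE@GLIBCXX_3.4 (2)
"""

for bind, name in exported_std_symbols(SAMPLE):
    print(bind, name)
```

An empty result after applying the version script would suggest that no such instantiation is exported any more; a non-empty one lists the candidates still to be localized.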
Comment 21 Michael Matz 2005-07-04 15:53:16 UTC
libscim-1.0.so seems to have the same problem, yes.  But it's not enough 
to hide them there, the references need to be internalized for each DSO 
involved.  This is a lot of work (to determine the list of symbols which 
need to be exported).  And all this only because of trying to support 
something unsupportable ;-) 
 
If you have to do this for even more DSOs, it might in the end be easier 
to go with a positive list, i.e. say "localize these exact symbols, let 
all others be global".  Then you can list the usual suspects in the 
"local: ..." part of the linker script, followed by "global: *;", and 
can reuse it for all these DSOs.  I would then use readelf to verify that 
indeed none of the stl instantiation symbols is "GLOBAL" anymore.  I.e. 
the linker file would look somewhat like: 
 
VERS_1.0 { 
  local: _ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_; 
    more_symbols_from_your_list; 
    ... 
  global: *; 
}; 
Comment 22 Zhe Su 2005-07-05 05:58:07 UTC
Thank you very much. Now I understand it.
So is it possible to use unmangled symbol names in the version script file? If so,
maybe it would be enough to just make std::* local.
And do you know how to use it along with libtool? Does libtool have a similar option?
Comment 23 Michael Matz 2005-07-05 12:29:49 UTC
Look at comment #16.  You can use unmangled names like shown there, inside 
a 'extern "C++" { };' block.  But you should note, that the return 
type is part of the unmangled name normally, i.e. it sometimes reads 
"bool std::binary_search...." for instance.  I'm not sure if it contains 
the return type also for purpose of matching in the linker script, you 
would have to try.  If not, then yes, localizing 'std::*' is most probably 
enough.  Otherwise you will have to use the mangled name probably. 
In any case you will want to use readelf to verify that all symbols you 
wanted are indeed localized. 
 
The version script is passed with a linker option (see info ld): 
--version-script FILE.  If you use libtool you somehow must make it pass 
this option to your linker.  If your linker is g++ (as it should be for 
C++ libraries), then you need to prepend the usual -Wl, flag to make it 
pass the option to ld directly, i.e. -Wl,--version-script=FILE. 
Comment 24 Zhe Su 2005-07-05 16:10:53 UTC
Hi, thank you very much for your kind help.

I used the following version script for libscim:
LIBSCIM_1.0 {
    global:
        extern "C++" {
            scim::*;
        };

    local:
        extern "C++" {
            std::*;
            __gnu_cxx::*;
        };
};

and the following script for im-scim.so:
IM_SCIM_1.0 {
    global:
        extern "C" {
            im_module*;
            gtk_im_context_scim_new*;
            gtk_im_context_scim_register_type*;
            gtk_im_context_scim_shutdown*;
        };

    local:
        extern "C++" {
            scim::*;
            std::*;
            __gnu_cxx::*;
        };
};

The test app entry and realplay are ok now, but acroread still doesn't work. I
doubt that acroread has the similar weak symbols issue. So I think it's
impossible for us to fix it. Maybe forcing acroread to use xim instead of scim
is the only way we can go.
Comment 25 Michael Matz 2005-07-05 16:26:30 UTC
Hmm, and this does really work?  I wonder because you don't export 
any of the necessary RTTI or vtable symbols explicitly.  You should make 
a list of exported symbols without and with the version script, and diff it, 
to make really sure you don't accidentally hide necessary symbols.  (Or leave 
some std:: symbols visible, for that matter). 
 
Why do you doubt, that acroread does not have this issue? 
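
The before/after comparison suggested here can be done with plain sets. The Python sketch below assumes the two symbol lists have already been extracted (for example from symbol dumps like the ones attached to this bug); the function name, the "required" parameter and the sample names are illustrative only.

```python
def compare_exports(before, after, required=()):
    """Compare exported symbol names of two builds of the same DSO.

    before/after: symbol names exported without/with the version script.
    required: names that must stay exported (API, vtables, typeinfo).
    """
    before, after = set(before), set(after)
    return {
        "hidden": sorted(before - after),            # localized by the script
        "still_exported": sorted(before & after),    # unchanged visibility
        "missing_required": sorted(set(required) - after),  # hidden by mistake
    }


OP_PLUS = "_ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_"
VTABLE = "_ZTVN4scim6ModuleE"  # "vtable for scim::Module", used as an example

before = [OP_PLUS, "im_module_init", VTABLE]
after = ["im_module_init"]

report = compare_exports(before, after, required=["im_module_init", VTABLE])
for key, names in report.items():
    print(key, names)
```

A non-empty "missing_required" list is the accidental hiding this comment warns about; std:: names that show up under "still_exported" are the ones left visible.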
Comment 26 Zhe Su 2005-07-06 03:25:42 UTC
I found that ld is smart enough to export those mandatory symbols no matter
whether they are listed in version script file. See the symbols list attached.

I don't know what's the matter with acroread. But I couldn't find any similar
issue by valgrind. acroread just doesn't work with scim gtkimmodule.

I think there may be some conflict between the scim gtkimmodule and some acroread
libraries.

Maybe acroread itself has such weak symbols issue, rather than scim gtkimmodule.
Comment 27 Zhe Su 2005-07-06 03:27:53 UTC
Created attachment 41213 [details]
Symbols which without version-script.

This is the original symbol list.
Comment 28 Zhe Su 2005-07-06 03:31:11 UTC
Created attachment 41214 [details]
Symbols which has version file applied.

It's the new symbol list.
Comment 29 Michael Matz 2005-07-06 13:49:10 UTC
I see.  Yes, this is the right way.  But now you fell into the trap I already 
anticipated.  For instance you do not hide the symbol 
 bool std::binary_search<unsigned short*, unsigned short> 
   (unsigned short*, unsigned short*, unsigned short const&) 
because its unmangled name does not start with "std::".  Similar for other 
functions, like 
 void std::make_heap<__gnu_cxx::__normal_iterator 
    <scim::Pointer<scim::IMEngineFactoryBase>* .... 
This is because you don't have "local:*" at the end, so you are exporting 
all symbols for which you didn't say otherwise.  But if you had that at the 
end you would have also hidden the vtable and typeinfo symbols for scim::*. 
It would be better (but more work) if you explicitly listed some more 
symbols to hide, in their mangled form (you can correlate the mangled 
and unmangled form in the symtab dump, by saving one without, and one with 
c++filt). 
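
The pattern trap can be reproduced outside the linker. The Python sketch below uses fnmatch as a stand-in for the glob matching in version scripts, assuming (as this comment describes) that the pattern is compared against the full demangled signature including the return type. It does not model ld's precedence rules between global and local entries, and the scim::example signature is hypothetical.

```python
from fnmatch import fnmatchcase

# Demangled name quoted in this comment: the return type comes first.
BINARY_SEARCH = (
    "bool std::binary_search<unsigned short*, unsigned short>"
    "(unsigned short*, unsigned short*, unsigned short const&)"
)
# Hypothetical scim function that merely takes a std:: argument.
SCIM_FN = "scim::example(std::string const&)"

for pattern in ("std::*", "*std::*"):
    for name in (BINARY_SEARCH, SCIM_FN):
        print(pattern, fnmatchcase(name, pattern), name)
```

Under these assumptions "std::*" misses the template function because its name starts with "bool ", while "*std::*" catches it but also matches the scim function. That is a reason to check the resulting symbol table with readelf rather than trust the pattern.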
Comment 30 Zhe Su 2005-07-06 15:21:10 UTC
Ok I see. 
Putting individual symbols into version script is hard to maintain. Is there any
better way to solve this issue?
Comment 31 Zhe Su 2005-07-06 16:38:02 UTC
Hi, I finally found a script to hide all those WEAK symbols:

for libscim:

LIBSCIM_1.0 {
    global:
        extern "C++" {
            *scim::*;
        };

    local:
        *;
};

for im-scim:
IM_SCIM_1.0 {
    global:
        extern "C" {
            im_module_init;
            im_module_create;
            im_module_list;
            im_module_exit;
        };

    local:
        extern "C++" {
            __gnu_cxx::*;
            std::*;
            *std::*_S_construct*;
        };
};


But it still doesn't fix acroread.
The new symbol list and acroread valgrind result will be attached.
Comment 32 Zhe Su 2005-07-06 16:40:36 UTC
Created attachment 41260 [details]
New symbol list of libscim, which doesn't have all WEAK std symbols.
Comment 33 Zhe Su 2005-07-06 16:41:32 UTC
Created attachment 41261 [details]
New symbol list of im-scim, without all WEAK std symbols.
Comment 34 Zhe Su 2005-07-06 16:43:18 UTC
Created attachment 41262 [details]
Valgrind memcheck output of acroread on Snapshot2.

You see that it crashed at symbol:
std::locale::operator=(std::locale const&)
in acroread binary.
Comment 35 Mike Fabian 2005-07-27 13:14:46 UTC
I just added the configure option "--enable-ld-version-script"
to the scim package on STABLE:

-------------------------------------------------------------------
Wed Jul 27 14:42:37 CEST 2005 - mfabian@suse.de

- Bugzilla #85416: add configure option "--enable-ld-version-script"

-------------------------------------------------------------------

It helps neither RealPlayer nor Acroread.
RealPlayer still crashes:

mfabian@shannon:~$ /usr/bin/realplay
Launching a SCIM daemon with Socket FrontEnd...
Loading simple Config module ...
Creating backend ...
/usr/bin/realplay: line 75: 25085 Segmentation fault               (core dumped) $REALPLAYBIN "$@"
mfabian@shannon:~$ 

But when setting GTK_IM_MODULE=xim, RealPlayer starts.

Comment 36 Michael Matz 2005-07-27 14:56:19 UTC
Have you verified that the version script really is used? 
Comment 37 Zhe Su 2005-08-13 02:55:43 UTC
Mike, could you please verify this issue again? If Acroread or RealPlayer are
still not working, then I think we need to add GTK_IM_MODULE=xim to the startup
scripts to work around this issue.
Comment 38 Zhe Su 2005-08-13 02:56:43 UTC
Hope this issue could be resolved within beta2.
Comment 39 Mike Fabian 2005-08-17 09:54:49 UTC
RealPlayer works but Acroread still doesn't.

I think I have to add the workaround to the acroread start script.
Comment 40 Mike Fabian 2005-08-17 13:57:09 UTC
acroread package with the workaround according to comment #37
submitted to STABLE.

Can we close this bug as FIXED?
Comment 41 George Horlacher 2005-08-18 16:39:12 UTC
So I'm wondering about this fix.  Did you put the work around inside the bz2 
file in the install script?  Since Acroread is a binary only package I don't 
think we are allowed to do that, or if we are to fix this problem it is going 
to have to be well documented as I normally take the tar.gz of new versions 
from Adobe and bzip2 them up as is.  Now if I need to update a script as well 
I need to know more about this.  Otherwise can it be done in the spec file to 
avoid changing the source at all? 
 
I have a new version 7.0.1 from acroread with a security fix I want to check 
in right away... 
Comment 42 Mike Fabian 2005-08-18 16:51:06 UTC
George> So I'm wondering about this fix.  Did you put the work around inside
George> the bz2 file in the install script?

No.

George> Since Acroread is a binary only package I don't think we are allowed
George> to do that, or if we are to fix this problem it is going to have to be
George> well documented as I normally take the tar.gz of new versions from
George> Adobe and bzip2 them up as is.  Now if I need to update a script as
George> well I need to know more about this.  Otherwise can it be done in the
George> spec file to avoid changing the source at all?

I did it in the .spec file. See this part of the %install section
where I apply a patch to the start script:

    # Apply workaround for http://bugzilla.novell.com/show_bug.cgi?id=85416 :
    pushd $RPM_BUILD_ROOT/usr/X11R6/bin
	patch -p0 -i $RPM_SOURCE_DIR/bugzilla-85416.patch
    popd

George> I have a new version 7.0.1 from acroread with a security fix I
George> want to check in right away...

Go ahead and check it in!

Just adapt the patch if necessary if it doesn't apply anymore.
Comment 43 George Horlacher 2005-08-18 17:39:38 UTC
Sounds good.  I had searched the spec file for: GTK_IM_MODULE and did not see 
it, so I thought the change must be in the actual install script.  I'll check 
in the update and next time examine the patches :) Thanks. 
Comment 44 Michael Matz 2005-08-19 04:16:14 UTC
I reassign this to you Mike, as it wasn't my error from the beginning :-) 
Comment 45 Mike Fabian 2005-08-19 10:11:06 UTC
I think we can close this bug as FIXED.

It isn't really "fixed" for acroread, but we have a workaround
which is "good enough" and nobody has any better idea currently.

→ FIXED.
Comment 46 Zhe Su 2005-12-23 14:51:49 UTC
Do you have any idea about this upstream bug? http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24660

I think it's the real reason of this scim compatibility issue. Is it possible for us to help them fix this bug?
Comment 47 Richard Biener 2006-01-02 10:01:03 UTC
I guess the only thing is testing the bits once they appear in gcc 4.2 svn.  I don't expect this to be fixed in 4.1, though.