Bug 76401

Summary: Package manager causes segfault with GCC4
Product: [openSUSE] SUSE LINUX 10.0 Reporter: Jiri Srain <jsrain>
Component: YaST2 Assignee: Michael Andres <ma>
Status: RESOLVED FIXED QA Contact: Klaus Kämpf <kkaempf>
Severity: Enhancement    
Priority: P5 - None CC: matz, mvidner
Version: unspecified   
Target Milestone: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Found By: Development Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: Backtrace got when running under GDB

Description Jiri Srain 2005-04-08 11:06:10 UTC
According to the backtrace when run under GDB (otherwise it is unusable), it
happens while static objects are constructed. That's probably because the
order in which objects are constructed has changed (as it is unspecified).
 
 
To get backtrace:  
- start building yast2-printer  
- when dbbuild segfaults, chroot to buildroot 
- cd /usr/src/packages/BUILD/yast2-printer/agent-ppd/database 
- run under gdb .libs/dbbuild 
 
Attaching my backtrace
Comment 1 Jiri Srain 2005-04-08 11:08:13 UTC
Created attachment 33454 [details]
Backtrace got when running under GDB
Comment 2 Jiri Srain 2005-04-08 11:09:00 UTC
Forgot to say: as a workaround, the perl-bindings can be uninstalled (which
fixed the build, but this problem probably affects the running system as well).
Comment 3 Michael Matz 2005-04-11 15:30:07 UTC
CCing Paolo, as I dimly remember an issue with the allocators and
constructing/destructing the pools.  What I remember was fixed, but perhaps 
there's still something rotten.  That doesn't mean that GCC is at fault yet. 
It's still possible that the source does something undefined.  A testcase 
would be needed. 
Comment 4 Stanislav Visnovsky 2005-04-12 13:58:59 UTC
This is a blocker for installation. Chrooted-SCR in inst_finish segfaults    
because of this. The 1st stage installation dlopens y2pm, so it's running 
fine, but SCR in chroot is directly linked to y2pm. 
    
This is the offending code:    
const std::vector<TagDescr> Mtags::_tagvec( Mtags::init() );    
    
where Mtags::init is defined as:    
    
    static std::vector<TagDescr> init() {    
      std::vector<TagDescr> tagvec;    
      tagvec.resize( NUMTAGS );    
      for ( unsigned i = 0; i < NUMTAGS; ++i ) {    
        switch ( (tags)i ) {    
#define Mstag(t,v,m) case t: tagvec[i] = TagDescr(v,m); break    
          Mstag(RLC,    "Rlc",  true);    
          Mstag(ON_S,   "Ons",  true);    
          Mstag(OFF_S,  "Offs", true);    
          Mstag(ON_P,   "Onp",  true);    
          Mstag(ON_TP,  "Ontp", true);    
          Mstag(OFF_P,  "Offp", true);    
          Mstag(OFF_TP, "Offtp",true);    
          // no default:    
          case NUMTAGS: break;    
        }    
      }    
      return tagvec;    
     }   
 
Any clue if it's gcc or y2pm fault? 
Comment 5 Michael Matz 2005-04-12 20:38:52 UTC
But the backtrace from Jiri contains a dlopen, and then the segfault.
Comment #4 seems to say that the one using dlopen runs fine (stage1), but the 
linked one breaks. 
Comment 6 Stanislav Visnovsky 2005-04-12 21:44:48 UTC
Comment #5: you are right, both cases are dlopened. 
Comment 7 Michael Matz 2005-04-12 23:39:56 UTC
Argh.  Why does yast have to use so many static objects?  Do you actually know that
some of them in fact load other libraries?  This means that before main is even started,
many functions are run, some of them load other shared libs, which in turn
contain functions loading perhaps even more shared libs.  That's totally undebuggable.
I highly suggest to _really_ think hard about a redesign for this.  Right now.  Before preview
phase.
 
Beware, the following is long, so here is a small overview of how it goes wrong:
 
liby2.so (or libscr.so) is loaded 
  its static ctors run --> need libstdc++ capabilities --> allocators are set up once (single 
    threaded) 
libpy2lang_perl.so is loaded 
libperl.so is loaded 
libpthread.so is loaded (BAD, too late) 
liby2pm.so is loaded 
  its static ctors run 
  need vector 
  access allocators, but multi-threading wise --> crash 
 
Anyway, what happens here is this: libstdc++ (the allocators in fact) supports multiple threads.
At runtime it's chosen whether or not the threading support should be used.  The way by which
this is done is to see if the symbol 'pthread_cancel' exists.  This is a symbol provided by
libpthread.so.  So, if the binary is linked with libpthread, threading support is activated.
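
A minimal sketch of that kind of check, assuming GCC or Clang on Linux.  The helper names threading_available and require_threading are made up for illustration, and this is not the libstdc++ source, only the same idea: take the address of a weakly referenced pthread_cancel and see whether it resolved.

```cpp
// Sketch only: illustrates a weak-symbol probe, not the actual libstdc++ code.
#include <pthread.h>
#include <cassert>

// Turn the reference into a weak one: if no loaded object defines
// pthread_cancel, its address resolves to null instead of failing to link.
#pragma weak pthread_cancel

// Hypothetical helper: true if pthread_cancel was resolvable when the object
// containing this code was loaded.
inline bool threading_available() {
    static void* const probe = reinterpret_cast<void*>(&pthread_cancel);
    return probe != nullptr;
}

// Hypothetical guard an application could call first thing in main() to catch
// a build that forgot -pthread (active only when assertions are enabled).
inline void require_threading() {
    assert(threading_available() && "link this program with -pthread");
}
```

Every object that carries its own copy of such a probe gets it resolved when that object is loaded, so objects loaded before and after libpthread can disagree about the answer.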
 
One of the things done in libstdc++ is to initialize its internal allocator structures.  This is done
in two different ways depending on if the app was linked with libpthread or not.  Later, when
those allocators are running, the helper functions also do their work depending on if libpthread
is there or not.
 
Problems now start when libpthread is only dlopen'ed.  This is actually what happens here. 
It is needed by /usr/lib/perl5/5.8.6/i586-linux-thread-multi/CORE/libperl.so, which is dlopen'ed 
by libpy2lang_perl.so if it's there.  That libperl.so is the only DSO requiring libpthread. 
 
So, what happens is, that first some y2 libs are loaded and initializers run.  libperl.so is not 
yet loaded.  liby2.so has some initializers which in turn need the libstdc++ allocators, ergo 
their initializers are run (much more sanely only when needed, instead of per static objects). 
libpthread is not loaded, so the allocators are only set up to handle single threading. 
 
Then it goes on, and at some point dlopens libperl.so, which in turn loads libpthread.so.  So now
it's available.  And _then_ it loads liby2pm.so, which contains more initializers needing
the libstdc++ allocators (a vector in this case).  The structures are initialized already to
support single threading.  They are not initialized again (there's no reason, as "threading
is supported" can't change at runtime).  So, now the accessor functions act as if threading
was on (because libpthread meanwhile is loaded), but the members to support this
(for instance a per-thread array, which is not allocated in the single threading case) are not
set up.  That's why it segfaults.
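
The mismatch can be modelled with a toy pool.  This is only a sketch under assumptions: ToyPool, init_once, access and the slot count are invented for illustration and do not correspond to real libstdc++ names.  Setup happens once with whatever answer "are threads active?" gives at that moment, while every later access asks the question again.

```cpp
// Toy model of the init-time versus access-time mismatch; hypothetical names.
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyPool {
    bool initialized = false;
    std::vector<int> per_thread;  // only sized if threads were active at setup

    // One-time setup: runs on first use and is never repeated.
    void init_once(bool threads_active) {
        if (initialized) return;
        if (threads_active) per_thread.resize(8);  // 8 slots: arbitrary here
        initialized = true;
    }

    // Accessor: decides again, at call time, which path to take.  Returns
    // false where real code would touch memory that was never set up.
    bool access(bool threads_active, std::size_t thread_id) {
        init_once(threads_active);
        assert(initialized);
        if (!threads_active) return true;       // single-threaded path
        return thread_id < per_thread.size();   // threaded path needs the array
    }
};
```

A pool first touched with threads_active == false and later accessed with threads_active == true reports failure from access, which is the point where the real allocator would dereference the missing per-thread array.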
 
All in all, you can't dlopen libpthread.  Never.  Not supported.  Can't be made so. 
 
The only solution for you is to link against libpthread all apps which possibly later load it
through libperl.so, or anything else.  I guess this means simply all Yast2 apps.
 
And once again: redesign the initialization sequence to not need static objects, except
_perhaps_ very simple ones, like just a variable turning from zero to one, or so.  But nothing
complex like calling functions, _especially_ not external functions.  And definitely not dlopen.
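
One common way to get rid of such a static object is to construct it on first use.  The following is a sketch only, assuming C++11; TagDescr here is a simplified stand-in for the real class from comment #4, and tagvec() and tag_at() are hypothetical names, not the actual y2pm fix.

```cpp
// Sketch: build the tag table on first use instead of at library load time.
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the real TagDescr (assumption for this sketch).
struct TagDescr {
    std::string name;
    bool multi;
};

// The vector is created the first time tagvec() is called, not while the
// shared library's static constructors run.
inline const std::vector<TagDescr>& tagvec() {
    static const std::vector<TagDescr> table = {
        {"Rlc", true},  {"Ons", true},  {"Offs", true}, {"Onp", true},
        {"Ontp", true}, {"Offp", true}, {"Offtp", true},
    };
    return table;
}

// Bounds-checked lookup by index (checked only when assertions are enabled).
inline const TagDescr& tag_at(std::size_t i) {
    assert(i < tagvec().size());
    return tagvec()[i];
}
```

This only moves the allocation to a later, well-defined point.  It does not make dlopen'ing libpthread safe, so the apps still need to be linked against it.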
 
Comment 8 Michael Matz 2005-04-12 23:43:52 UTC
Btw. I've verified that linking dbbuild against libpthread (use -pthread as linker and 
compile switch) fixes the segfault. 
Comment 9 Klaus Kämpf 2005-04-13 06:35:37 UTC
Michael, thanks for your investigations. 
 
Can you, from your debugging efforts, be more specific about which static 
objects you found ? 
Comment 10 Michael Matz 2005-04-13 14:49:27 UTC
The one ultimately resulting in the segfault is the one from comment #4 in 
PMPackageImEx.cc:726: 
  const std::vector<TagDescr> Mtags::_tagvec( Mtags::init() ); 
 
The one loading that library is connected with this part of the backtrace: 
#22 0x403cdde1 in dlopen () from /lib/libdl.so.2 
#23 0x40089f8d in Y2LanguageLoader::Y2LanguageLoader () 
   from /usr/lib/liby2.so.2 
#24 0x4008a742 in Y2LanguageLoader::Y2LanguageLoader () 
   from /usr/lib/liby2.so.2 
#25 0x40096406 in UstringHash::UstringHash () from /usr/lib/liby2.so.2 
#26 0x400805c5 in _init () from /usr/lib/liby2.so.2 
 
So, it's somewhere in liby2.so.2, and in fact some object of type UstringHash
obviously.  (Why the ctor of some string can load other languages is unknown
to me ;-) ).  But I can't locate it without more effort (would have to set up
a build system for liby2.so).  This also loads libpy2lang_perl.so ultimately.
Comment 11 Klaus Kämpf 2005-04-13 19:40:24 UTC
Thanks ! 
 
Michael (Andres), this is for you now ;-) 
Comment 12 Martin Vidner 2005-04-14 13:07:04 UTC
Stano has already added -lpthread, a linker flag, to liby2. Do we need any
compile flags as well, matz? It seems -pthread is available only for some
architectures.

Anyway, cleaning up static objects is good.
Comment 13 Michael Matz 2005-04-14 14:28:05 UTC
Which architectures don't support -pthread?  It should work for all of them. 
Adding libpthread just to liby2, might be enough, if that library is 
loaded _before_ any STL datastructures are used anywhere (in static ctors 
for example).  Actually I'm not sure if this might just work by luck, though, 
as normally the application itself must be linked against libpthread. 
 
The safest way would be to link the apps against libpthread.  I think
yast2 has a buildsystem where it should be easy to add -pthread to the general 
LDFLAGS (or similar). 
Comment 14 Martin Vidner 2005-05-18 08:18:28 UTC
'info gcc' documents -pthread only for ia64, powerpc and sparc. 
Comment 15 Paolo Carlini 2005-05-18 08:24:34 UTC
A simple (very late, sorry) reply to Michael's comment #3: currently, since the
official release of 4.0.0, we use by default the trivial new-based allocator,
exactly the same used in 3.4.x, for binary compatibility reasons. Really,
*currently* anything can cause problems except the allocator ;)
Comment 16 Michael Andres 2005-06-08 09:50:41 UTC
No longer blocker, but reminder to clean up static objects.
Comment 17 Michael Andres 2005-08-18 13:31:22 UTC
works