Bugzilla – Bug 66534
[SL 10.0] et_list handling of krb5 and libcom_err.so.2 conflict
Last modified: 2005-12-07 09:14:56 UTC
et_list of libcom_err.so.2 is handled statical while krb5 expects a dynamic handling. Andreas: Correct me if this is wrong. This leads to '*** glibc detected *** free(): invalid pointer: 0xb7f51d1c ***' while we do for example a net ads join.
First i try is to build krb5 with the included libcom_err
submit a patch to STABLE This patch fixes the double free, but we may have other problems if we use the system libcom_err Marius will write something in the next days.
The double-free problem, caused by mixing static and dynamic link nodes, is not the only one in com_err provided by the e2fsprogs package. Except the interface used by heimdal (working on a private et_list; per krb5context in case of heimdal), most of the other functions are not thread safe at all (are using static buffers). The patch submited to STABLE helps only for the cases, where programs calls the "initialize_*_error_table" functions provided by krb5 lib. As soon as a program linked against krb5 also uses own error tables and calls its own "initialize_*_error_table" function, we'll get the double-free problem again. The krb5 package provides an own, synchronized version of the com_err library, that unfortunatelly uses some krb5 functions in it and adds link dependency to a krb5 support library, causing that each program linking against this com_err also links against the krb5 support lib, The other way arround, is to fix the com_err library in e2fsprogs-devel, adding synchronization and differentiation between static and dynamic et_list nodes. This also adds a link dependency, but only to libpthread. I've tryed to do it (we've tested it with Michael and it seems to work fine). Please take a look on the following patches and comment.
Created attachment 29074 [details] adds mutex synchronization to e2fsprogs/lib/et
Created attachment 29075 [details] removed usage of a static buffer in error_message() The modified error_message() function may (depending on configuration) still make use of the strerror function, that is not thread safe....
Created attachment 29077 [details] disables init_error_table function The way, this function adds an dynamically allocaled error table to the et_list makes it impossible to free it propelly.
Created attachment 29079 [details] disables build of unused e2fsck.static Linking of e2fsck.static causes unresolved symbols to pthread_mutex_lock/unlock, since there is no -lpthread in LDFLAGS... Further it also disables the tests of compile_et. Alternatively, we can also patch all test cases...
set to enhancement for 10.0
Set to blocker for SL 10.0 . We have to provide a threadsafe version of libcom_err for 10.0/SLES10 I think we should do a split of this package ASAP. Now it is your turn, henne.
Raj: I think this bug is interesting for you.
so you want to split libcom_err from e2fsprogs and i should apply your patches. is that correct?
Yes. That's our suggestion. We should inform all package maintainers and discuss this in the dist team meeting, because some other packages have a #neededforbuild e2fsprogs(-devel) because they also need libcom_err If we split libcom_err in a seperate package, some other will not build anymore because they have to change the #neededforbuild
ok i have a package ready: libcom_err: /lib64/libcom_err.so.2 /lib64/libcom_err.so.2.1 libcom_err-devel: /usr/lib64/libcom_err.so /usr/lib64/pkgconfig/com_err.pc
I think there is something missing. /usr/share/et /usr/share/et/et_c.awk /usr/share/et/et_h.awk /usr/include/et /usr/include/et/com_err.h /usr/bin/compile_et
ok lets forget about libcom_err-devel then. Just one package with /lib64/libcom_err.so.2 /lib64/libcom_err.so.2.1 /usr/bin/compile_et /usr/include/et /usr/include/et/com_err.h /usr/lib64/libcom_err.so /usr/lib64/pkgconfig/com_err.pc /usr/share/et /usr/share/et/et_c.awk /usr/share/et/et_h.awk /usr/share/man/man1/compile_et.1.gz
Why forget about libcom_err-devel? I mean it makes sense to have: libcom_err: %_lib/libcom_err.so.2 %_lib/libcom_err.so.2.1 libcom_err-devel: %_lib/libcom_err.so %_lib/pkgconfig/com_err.pc /usr/bin/compile_et /usr/include/et /usr/include/et/com_err.h /usr/share/et /usr/share/et/et_c.awk /usr/share/et/et_h.awk /usr/share/man/man1/compile_et.1.gz BTW: An important point for the dist meeting is that libcom_err using my patches require -lpthread. This breaks at least the linking of the (unused?) e2fsck.static binary... but I mean it is not really a problem.
because its too small. it just blows up the package database for no reason....
Michael your turn..
Rudi: Before we checkin this, we schould talk about these changes. Maybe this should be a topic in the next dist meeting.
Many thanks to Hendrik for providing test-RPM for 9.3: http://w3.suse.de/~hvogel/krb5-libcom_err/
I think Rudi said in his mail check it in. So i throw this bug back to you.
checked in. so this is fixed.
reopen. with 1.38 of e2fsprogs its worse. mt can explain best.
Created attachment 41136 [details] updated mutex patch for e2fsprogs-1.38
Created attachment 41137 [details] updated e2fsck.static patch for e2fsprogs-1.38
Created attachment 41139 [details] updated init_error_table patch for e2fsprogs-1.38
Created attachment 41140 [details] updated no-static-buffer patch for e2fsprogs-1.38
Real changes are in attachment #41136 [details]. The new 1.38 version fixes the double-free problem in add_error_table and remove_error_table changing to use two list heads. Other way than in my patch but similar result. I've just adopted my patch here. But in the same breath it adds a double-free problem to the code that is generated in et_c.awk file, mixing dynamic allocated with statically linked nodes... It changes the initialize_<table-name>_error_table() function to call its _r variant with the global _et_list head and adds usage of the local (static) "link" node variable in the _r variant, if malloc fails there. As result, the _et_list head would contain dynamic allocated nodes, that will be never freed (except somebody does it manually) and also breaks software like heimdal adding static nodes to its local head - ok, only if malloc fails. I've changed et_c.awk to use the same way as in my previous patch: the initialize_<table-name>_error_table() adds nodes linked using the "link" node (static / module local), that does not need any free. Further I've removed the usage of "link" node from the _r variant, so software that uses _r variant can continue to use its cleanup functions freeing its local lists (e.g. using free_error_table(local_list)). The new patches have same behaviour as in our previous patches version, but using a different implementation of add/remove_error_table and the error_message functions.
I would say, it is fixed again...