|
Bugzilla – Full Text Bug Listing |
| Summary: | Upgrade problems with clang-cpp | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Franz Sirl <franz.sirl-obs> |
| Component: | Upgrade Problems | Assignee: | Richard Biener <rguenther> |
| Status: | CONFIRMED --- | QA Contact: | Jiri Srain <jsrain> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | aaronpuchert, lubos.kocman |
| Version: | Leap 15.6 | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
I've also noticed this. Since the package is part of SLE, I can't do the update myself. Richard, can you backport https://build.opensuse.org/request/show/1130634?

That's then required in all clang7, clang11, clang13 and clang15? In the SP2 codestream we also seem to have llvm9 and I can't see llvm13 so that's possibly only via the OBS sle-backports repository?

I'll note I had issues with the Factory python3-clang conflicts/provides switching from %python3_sitearch to something else so I reverted that (but only for llvm17).

The change for llvm15 seems to be just adding to clang_binfiles? The alternatives stuff there is obfuscated by macros.

IMO the even older llvm might be better dropped on update?

(In reply to Richard Biener from comment #3)
> IMO the even older llvm might be better dropped on update?

Dropping older llvm from SLE and/or backports would be fine with me. That way it is also obvious that there are no side effects when adding replacements via devel:tools:compiler. Or maybe move all older versions to backports after syncing them to devel:tools:compiler?

(In reply to Richard Biener from comment #2)
> That's then required in all clang7, clang11, clang13 and clang15? In the SP2
> codestream we also seem to have llvm9 and I can't see llvm13 so that's
> possibly
> only via the OBS sle-backports repository?

Yes, llvm13 (and llvm14) are in Backports. I can take care of this.

> I'll note I had issues with the Factory python3-clang conflicts/provides
> switching from %python3_sitearch to something else so I reverted that
> (but only for llvm17).

Yes, that's fine. I mostly think about Factory when packaging and don't consider compatibility issues with Leap. This is one of those cases.

(In reply to Richard Biener from comment #3)
> The change for llvm15 seems to be just adding to clang_binfiles? The
> alternatives stuff there is obfuscated by macros.

Yes, adding to clang_binfiles and removing from the file list is the equivalent change.
(https://build.opensuse.org/package/rdiff/devel:tools:compiler/llvm15?linkrev=base&rev=28)

Obfuscation was not the intention. The binaries were duplicated a couple of times, and since almost every major update adds and removes binaries, and I always messed something up, I decided that having the list of binaries in one place is better. That's what you find in the beginning of the spec file plus a macro %lapply that applies another macro element-wise to a list. (Admittedly, the "list apply" is a bit of black magic, and I'd like to find a better way sometime.) As a nice side effect, this reduced the spec file by a couple hundred lines. (https://build.opensuse.org/request/show/995044)

> IMO the even older llvm might be better dropped on update?

Can't talk about the versions in SLE, but the versions in Backports are unfortunately still being used:

$ osc whatdependson openSUSE:Backports:SLE-15-SP6 llvm13 standard x86_64
llvm13 :
   autofdo
   velociraptor
   velociraptor-client
$ osc whatdependson openSUSE:Backports:SLE-15-SP6 llvm14 standard x86_64
llvm14 :
   gtkd
   ispc
   klee
   klee-uclibc
   ldc
   python3-pyside2
   python3-pyside2:sle15_python_module
   tilix
   tvm

Not sure to which extent these packages could switch to other versions. The klee maintainer has pinned to llvm14, although a wider range of versions should be supported. Maybe we can relax this. The issue with ldc in the past was bootstrapping, not sure to which extent that has been addressed.

OK, so that would leave the possibility to "drop" llvm11 and older on the dist-upgrade. IIRC there's some magic that can cause those packages to be uninstalled, but I have no idea how that's done or who should do this.

I'll see to updating llvm15 in SP5, and if you fix the backports repo copies of llvm13 and llvm14 we should be fine.

(In reply to Aaron Puchert from comment #5)
> (In reply to Richard Biener from comment #2)
> > IMO the even older llvm might be better dropped on update?
>
> Can't talk about the versions in SLE, but the versions in Backports are
> unfortunately still being used:
>
> $ osc whatdependson openSUSE:Backports:SLE-15-SP6 llvm13 standard x86_64
> llvm13 :
>    autofdo
>    velociraptor
>    velociraptor-client
> $ osc whatdependson openSUSE:Backports:SLE-15-SP6 llvm14 standard x86_64
> llvm14 :
>    gtkd
>    ispc
>    klee
>    klee-uclibc
>    ldc
>    python3-pyside2
>    python3-pyside2:sle15_python_module
>    tilix
>    tvm
>
> Not sure to which extent these packages could switch to other versions. The
> klee maintainer has pinned to llvm14, although a wider range of versions
> should be supported. Maybe we can relax this. The issue with ldc in the past
> was bootstrapping, not sure to which extent that has been addressed.

This pinning might be related to the divergence of the "clang" package between SLE and openSUSE. I remember it's quite hard to build a package requiring a newer clang on OBS. IIRC that's because requirements like "clang >= 9" don't work on SLE, but work in openSUSE. So in OBS the easy way out is to pin the version.

No idea if the clang divergence is by design or an oversight.

(In reply to Franz Sirl from comment #8)
> This pinning might be related to the divergence of the "clang" package
> between SLE and openSUSE. I remember it's quite hard to build a package
> requiring a newer clang on OBS. IIRC that's because requirements like
> "clang >= 9" don't work on SLE, but work in openSUSE. So in OBS the easy
> way out is to pin the version.

All of these packages are in Backports, where we have an up-to-date metapackage. (https://build.opensuse.org/package/show/openSUSE:Backports:SLE-15-SP6/llvm)

I checked Klee in detail a few days ago and it's actually not that easy. It takes in additional sources that correspond to the LLVM version being used, which makes it quite a bit harder to build against other supported versions.

> No idea if the clang divergence is by design or an oversight.
The earlier SPs had updates for the metapackage, but there is some reason specific to how SLE is built that made it uncompelling to update it for later SPs. Richard wrote somewhere that he would like to remove it if possible.

But I don't think there are a lot of LLVM users in SLE, I'm actually only aware of Mesa. (Though some BPF-related tools and Rust might use it as well. They usually want specific versions though.)

Almost forgot: I filed requests for llvm13/14.
* https://build.opensuse.org/request/show/1165716
* https://build.opensuse.org/request/show/1165718

Remaining conflicts in those requests should all come from SLE (https://build.opensuse.org/projects/home:repo-checker/packages/reports/files/openSUSE:Backports:SLE-15-SP6:Staging:adi:4):

found conflict of clang11-11.0.1-150300.3.6.1.x86_64 with clang13-13.0.1-bp156.15.1.x86_64
  /usr/bin/clang-cpp
found conflict of clang11-11.0.1-150300.3.6.1.x86_64 with clang14-14.0.6-bp156.15.1.x86_64
  /usr/bin/clang-cpp
found conflict of clang13-13.0.1-bp156.15.1.x86_64 with clang15-15.0.7-150500.4.6.2.x86_64
  /usr/bin/clang-cpp
found conflict of clang13-13.0.1-bp156.15.1.x86_64 with clang5-5.0.1-8.5.1.x86_64
  /usr/bin/clang-cpp
found conflict of clang13-13.0.1-bp156.15.1.x86_64 with clang7-7.0.1-150100.3.22.2.x86_64
  /usr/bin/clang-cpp
found conflict of clang13-13.0.1-bp156.15.1.x86_64 with clang9-9.0.1-150200.3.6.1.x86_64
  /usr/bin/clang-cpp
found conflict of clang14-14.0.6-bp156.15.1.x86_64 with clang15-15.0.7-150500.4.6.2.x86_64
  /usr/bin/clang-cpp
found conflict of clang14-14.0.6-bp156.15.1.x86_64 with clang5-5.0.1-8.5.1.x86_64
  /usr/bin/clang-cpp
found conflict of clang14-14.0.6-bp156.15.1.x86_64 with clang7-7.0.1-150100.3.22.2.x86_64
  /usr/bin/clang-cpp
found conflict of clang14-14.0.6-bp156.15.1.x86_64 with clang9-9.0.1-150200.3.6.1.x86_64
  /usr/bin/clang-cpp

llvm5 should not be available in Leap, and even llvm7 should for the most part be blocked.
There is a request to drop the llvm meta package from SP6 repositories as well (PED-7373), I hope this extends to Leap and we can instead pull the one from backports there.

(In reply to Aaron Puchert from comment #9)
> The earlier SPs had updates for the metapackage, but there is some reason
> specific to how SLE is built that made it uncompelling to update it for
> later SPs. Richard wrote somewhere that he would like to remove it if
> possible.

Well, that's a bit unfortunate for non-SUSE OBS users like me building newer/extended versions of SLE packages. If they build against for example SUSE:SLE-15-SP5:GA/standard, they need something like below in the spec-file:

%if 0%{?is_opensuse}
# openSUSE can use the meta package requirement
BuildRequires: llvm-clang-devel >= 9.0.1
%else
# for SLE currently we have to specify the exact package
BuildRequires: clang9-devel
%endif

If they build against openSUSE:Backports:SLE-15-SP6/standard, they stumble over the SUSE_Backports_policy-SLE_conflict blocker that cannot be turned off globally (AFAIK) in the project config.

To highlight the problem, for example I would like to enable libclang support for doxygen by default via a pull request (https://build.opensuse.org/repositories/home:fsirl:doxygen-libclang), but I don't want to put that maintenance burden vs SLE on the devel:tools maintainer.

(In reply to Aaron Puchert from comment #9)
> The earlier SPs had updates for the metapackage, but there is some reason
> specific to how SLE is built that made it uncompelling to update it for
> later SPs.

Now I remember: the problem is that SLE doesn't rebuild packages unless there are updates to the sources. That's why the metapackage doesn't work in the expected way: updating the metapackage in Factory or Backports means that all packages using it are being rebuilt, but in SLE that's not the case.
Packages continue to use the old LLVM version unless they get an update after the metapackage update, and then they will use the new version. In other words: using the metapackage on SLE is basically also just pinning to a specific version, but it's not hardcoded in the specfile. It depends on the build date.

Because of that, we'd rather have packages in SLE make an explicit static choice, which doesn't functionally change anything, but makes it clear what is actually being used.

(In reply to Richard Biener from comment #12)
> There is a request to drop the llvm meta package from SP6 repositories as
> well (PED-7373), I hope this extends to Leap and we can instead pull the
> one from backports there.

Leap's metapackage should shadow the one from SLE, so I don't expect any issues in Backports when removing SLE's metapackage.

(In reply to Franz Sirl from comment #13)
> Well, that's a bit unfortunate for non-SUSE OBS users like me building
> newer/extended versions of SLE packages.

SUSE OBS users have pretty much the same issue. For example, Mesa and related packages (like libclc) also pin LLVM versions for SLE. Another example is bpftrace:

# Hard-code latest LLVM for SLES, the default version is too old
%if 0%{?sle_version} == 150600
%define llvm_major_version 17
%else
%if 0%{?sle_version} == 150500
%define llvm_major_version 15
%else
%if 0%{?sle_version} == 150400
%define llvm_major_version 11
%endif
%endif
%endif

> To highlight the problem, for example I would like to enable libclang
> support for doxygen by default via a pull request
> (https://build.opensuse.org/repositories/home:fsirl:doxygen-libclang), but I
> don't want to put that maintenance burden vs SLE on the devel:tools
> maintainer.

Since devel:tools builds against 15.5 and 15.6, it should be easy to see breakage. And the newest LLVM version in SLE 15 SPx is basically always 2*x + 5, with the exception of SP4 where llvm13 didn't end up in SLE but only in Backports.
But otherwise the release schedules have so far aligned perfectly.

Ah, nice, thanks. I've updated my package.
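As a worked example of the "2*x + 5" rule of thumb mentioned above, here is a minimal Python sketch. The function names and the override table are hypothetical. The SP4 value of 11 is an inference from this thread (llvm13 only landed in Backports, and the bpftrace snippet picks 11 for sle_version 150400), not an authoritative list of what each service pack ships.

```python
# Rule of thumb quoted in this thread: newest LLVM major in SLE 15 SPx is 2*x + 5.
def rule_of_thumb(service_pack: int) -> int:
    """LLVM major version predicted by the 2*x + 5 rule for SLE 15 SP<x>."""
    return 2 * service_pack + 5


# Assumption inferred from the thread: SP4 is the one exception, because llvm13
# went to Backports only, so SLE itself stayed at llvm11 there.
SLE_OVERRIDES = {4: 11}


def newest_llvm_in_sle(service_pack: int) -> int:
    """Predicted newest LLVM major in SLE proper, applying the SP4 exception."""
    return SLE_OVERRIDES.get(service_pack, rule_of_thumb(service_pack))


if __name__ == "__main__":
    for sp in range(1, 7):
        print(f"SLE 15 SP{sp}: rule gives llvm{rule_of_thumb(sp)}, "
              f"expected in SLE: llvm{newest_llvm_in_sle(sp)}")
```

The results for SP5 and SP6 (15 and 17) match the values hard-coded in the bpftrace spec snippet; anything beyond SP6 is extrapolation and would need checking against the actual repositories.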
Hmm, couldn't that be defined in rpm-config-SUSE? That way no real meta-package in SLE is needed and there would be still an easy way to require the best available LLVM for a specific SLE release if needed.
/etc/rpm/macros.d/macros.llvm:
# preferred LLVM major for this (e.g. SP6) release
%also_available_in_sle_llvm_major_version 17
package-prefers-best-llvm-in-SLE-SPx.spec:
BuildRequires: llvm%{?also_available_in_sle_llvm_major_version}-devel
package-prefers-best-clang-in-SLE-SPx.spec:
BuildRequires: clang%{?also_available_in_sle_llvm_major_version}-devel
Though not knowing how the SLES builds and rpm-config-SUSE really work, this still may cause unwanted rebuilds against current llvm/clang if llvm/clang from an earlier service pack gets updated? So no need to answer if you think this is unfeasible.
Given frequent API problems when using LLVM components I think you want to thoroughly test rebasing to a newer LLVM and thus any "silent" change there is unwanted. That's also why a BuildRequires: llvm-clang-devel >= 9.0.1 is really lying since there's no guarantee it will work with llvm version 42. Unless llvm changed course and promised stable APIs and backward compatibility of course.

And now it seems I've been bitten by this API incompatibility myself. doxygen compiled on OBS SLES15SP5 against clang15-devel crashes when libclang13 on the destination machine is coming from llvm16 or later out of devel:tools:compiler.

(In reply to Franz Sirl from comment #17)
> And now it seems I've been bitten by this API incompatibility myself.
> doxygen compiled on OBS SLES15SP5 against clang15-devel crashes when
> libclang13 on the destination machine is coming from llvm16 or later out of
> devel:tools:compiler.

That sounds like an ABI issue though, libclang13 breaking the ABI. Might be worth filing a bug upstream for this.

I looked into it a bit and the crash looks like this:
(gdb) bt
#0 __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007f7fcfd993e5 in __GI_abort () at abort.c:79
#2 0x00007f7fcfdddc27 in __libc_message (action=do_abort, fmt=0x7f7fcff070b8 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f7fcfde5cca in malloc_printerr (str=0x7f7fcff04d8e "free(): invalid pointer") at malloc.c:5347
#4 0x00007f7fcfde7774 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5 0x00007f7fccf64b69 in llvm::MallocAllocator::Deallocate (this=0x7f7fdac34128 <clang::format::FormatTokenLexer::CSharpAttributeTargets>, Ptr=0x2, Size=140728795303408, Alignment=8)
at ../include/llvm/Support/AllocatorBase.h:93
#6 llvm::StringMapEntry<std::nullopt_t>::Destroy<llvm::MallocAllocator> (this=0x2, allocator=...) at ../include/llvm/ADT/StringMapEntry.h:146
#7 llvm::StringMap<std::nullopt_t, llvm::MallocAllocator>::~StringMap (this=0x7f7fdac34128 <clang::format::FormatTokenLexer::CSharpAttributeTargets>) at ../include/llvm/ADT/StringMap.h:186
#8 0x00007f7fcfd9b1be in __cxa_finalize (d=0x7f7fcfd332f0) at cxa_finalize.c:83
#9 0x00007f7fcceb7fd3 in __do_global_dtors_aux () from /usr/lib64/libclang-cpp.so.16
#10 0x00007ffdf9da9380 in ?? ()
#11 0x00007f7fdad6d9b3 in _dl_fini () at dl-fini.c:138
Backtrace stopped: frame did not save the PC
Then I checked the loaded libraries:
ds1:~ # ldd /usr/bin/doxygen
linux-vdso.so.1 (0x00007ffed7fee000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fee53714000)
libclang.so.13 => /usr/lib64/libclang.so.13 (0x00007fee5365f000)
libclang-cpp.so.15 => /usr/lib64/libclang-cpp.so.15 (0x00007fee4f8c6000)
libLLVM.so.15 => /usr/lib64/libLLVM.so.15 (0x00007fee48d30000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fee48ade000)
libm.so.6 => /lib64/libm.so.6 (0x00007fee48990000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fee48963000)
libc.so.6 => /lib64/libc.so.6 (0x00007fee4876c000)
/lib64/ld-linux-x86-64.so.2 (0x00007fee5377b000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fee48767000)
libclang-cpp.so.16 => /usr/lib64/libclang-cpp.so.16 (0x00007fee445d7000)
libLLVM.so.16 => /usr/lib64/libLLVM.so.16 (0x00007fee3d1d8000)
libedit.so.0 => /usr/lib64/libedit.so.0 (0x00007fee3ce00000)
librt.so.1 => /lib64/librt.so.1 (0x00007fee3d1cc000)
libz.so.1 => /usr/lib64/libz.so.1 (0x00007fee3d1b3000)
libtinfo.so.6 => /lib64/libtinfo.so.6 (0x00007fee3d184000)
libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x00007fee3cc95000)
liblzma.so.5 => /usr/lib64/liblzma.so.5 (0x00007fee3ca00000)
Wait? 2 different libclang-cpp and 2 different libLLVM will be loaded?
I'm not sure yet who is to blame, but my guess is that at least doxygen should only link directly against libclang-cpp, but not libclang and libLLVM too. I'll look into the doxygen cmake setup and continue investigating.
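To spot this kind of situation mechanically, here is a minimal Python sketch that scans `ldd` output for the same library appearing in more than one soversion. The function name, the regular expression, and the shortened sample are illustrative assumptions; the sample lines are taken from the `ldd /usr/bin/doxygen` output above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches the soname at the start of an ldd line such as
# "libclang-cpp.so.15 => /usr/lib64/libclang-cpp.so.15 (0x...)".
# Lines for linux-vdso and the dynamic loader do not start with "lib" and are skipped.
SONAME_RE = re.compile(r"^\s*(?P<base>lib[\w+.-]+?)\.so\.(?P<ver>[\w.]+)\s")


def duplicate_sonames(ldd_output: str) -> Dict[str, List[str]]:
    """Return {library base name: [soversions]} for libraries seen in 2+ versions."""
    versions: Dict[str, List[str]] = defaultdict(list)
    for line in ldd_output.splitlines():
        match = SONAME_RE.match(line)
        if match and match["ver"] not in versions[match["base"]]:
            versions[match["base"]].append(match["ver"])
    return {base: vers for base, vers in versions.items() if len(vers) > 1}


# Shortened excerpt of the ldd output shown in this report.
SAMPLE = """\
    libclang.so.13 => /usr/lib64/libclang.so.13 (0x00007fee5365f000)
    libclang-cpp.so.15 => /usr/lib64/libclang-cpp.so.15 (0x00007fee4f8c6000)
    libLLVM.so.15 => /usr/lib64/libLLVM.so.15 (0x00007fee48d30000)
    libclang-cpp.so.16 => /usr/lib64/libclang-cpp.so.16 (0x00007fee445d7000)
    libLLVM.so.16 => /usr/lib64/libLLVM.so.16 (0x00007fee3d1d8000)
    libz.so.1 => /usr/lib64/libz.so.1 (0x00007fee3d1b3000)
"""

if __name__ == "__main__":
    print(duplicate_sonames(SAMPLE))
    # → {'libclang-cpp': ['15', '16'], 'libLLVM': ['15', '16']}
```

Note that libclang.so.13 is not flagged: it appears only once, and it is libclang itself that pulls in the second libclang-cpp/libLLVM pair.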
The specific problem is the global object clang::format::FormatTokenLexer::CSharpAttributeTargets from libclang-cpp has global constructors/destructors. The object seemingly is merged by the shared loader when 2 libclang-cpp with different versions are loaded. But the constructors/destructors are NOT merged, meaning that 2 constructors are run for the same object on library startup and 2 destructors are run on library shutdown. That is just asking for trouble it seems.

It seems LLVM realized that and reverted to building with LLVM-soversion == libclang-soversion by default, but the SUSE llvm builds override that with -DCLANG_FORCE_MATCHING_LIBCLANG_SOVERSION:BOOL=OFF. Should I file a bug against the LLVM component?

(In reply to Richard Biener from comment #16)
> Unless llvm changed course and promised stable APIs and backward
> compatibility of course.

The C APIs (of LLVM and Clang) should be more or less stable, but the wider C++ APIs are not.

(In reply to Franz Sirl from comment #19)
> Wait? 2 different libclang-cpp and 2 different libLLVM will be loaded?

That can happen and shouldn't be an issue. Both libclang-cpp and libLLVM have versioned symbols and should be able to coexist in different versions in the same process, unless I'm missing something.

(In reply to Franz Sirl from comment #20)
> The specific problem is the global object
> clang::format::FormatTokenLexer::CSharpAttributeTargets from libclang-cpp
> has global constructors/destructors. The object seemingly is merged by the
> shared loader when 2 libclang-cpp with different versions are loaded.

That would surprise me. I think that copy relocations allow the linking application to interpose the library variable with their own copy, but this should still be subject to symbol versioning. At least I'd hope so.

> the constructors/destructors are NOT merged, meaning that 2 constructors are
> run for the same object on library startup and 2 destructors are run on
> library shutdown.
That would be expected, the constructors and destructors are not visible to the linker as symbols, but simply as anonymous entries in .{init,fini}_array.

> It seems LLVM realized that and reverted to building with LLVM-soversion ==
> libclang-soversion by default, but the SUSE llvm builds override that with
> -DCLANG_FORCE_MATCHING_LIBCLANG_SOVERSION:BOOL=OFF. Should I file a bug
> against the LLVM component?

I'm not aware that this was the reason. What I've read from the posts is that some distributions were running into packaging issues because they have libclang and libclang-cpp in the same package. Or something like that.

Having different versions of libclang-cpp or libLLVM in the same process is unfortunately hard to prevent once usage of the library gets more widespread. The API/ABI incompatibilities mean that different parts of the distribution will use different versions, some on the bleeding edge and others lagging behind. My understanding was that symbol versioning should have solved that.

The only bug other than yours coming from a libclang incompatibility that I've seen since then was bug 1210176, but that was not an API/ABI incompatibility issue. Rather the user made assumptions about the naming of identifiers that were violated by the new version, but were actually never guaranteed.

(In reply to Aaron Puchert from comment #21)
[...]
> (In reply to Franz Sirl from comment #20)
> > The specific problem is the global object
> > clang::format::FormatTokenLexer::CSharpAttributeTargets from libclang-cpp
> > has global constructors/destructors. The object seemingly is merged by the
> > shared loader when 2 libclang-cpp with different versions are loaded.
>
> That would surprise me. I think that copy relocations allow the linking
> application to interpose the library variable with their own copy, but this
> should still be subject to symbol versioning. At least I'd hope so.

This should work for the symbols in the libraries themselves.
I'm not sure how users go though since two versions are likely pulled in only indirectly, aka App X uses LLVM15 and library Y which uses LLVM13. Iff both App X and library Y instantiate a "LLVM typed object/template" then those instantiations will not have a symbol version and those could be subject to false interposition (usually COMDAT handling). The solution would be for library Y to better constrain what it exports of course, but that's not the rule in the C++ world unfortunately.

(In reply to Richard Biener from comment #22)
> (In reply to Aaron Puchert from comment #21)
> [...]
> > (In reply to Franz Sirl from comment #20)
> > > The specific problem is the global object
> > > clang::format::FormatTokenLexer::CSharpAttributeTargets from libclang-cpp
> > > has global constructors/destructors. The object seemingly is merged by the
> > > shared loader when 2 libclang-cpp with different versions are loaded.
> >
> > That would surprise me. I think that copy relocations allow the linking
> > application to interpose the library variable with their own copy, but this
> > should still be subject to symbol versioning. At least I'd hope so.
>
> This should work for the symbols in the libraries themselves. I'm not sure
> how users go though since two versions are likely pulled in only indirectly,
> aka App X uses LLVM15 and library Y which uses LLVM13. Iff both App X and
> library Y instantiate a "LLVM typed object/template" then those
> instantiations
> will not have a symbol version and those could be subject to false
> interposition (usually COMDAT handling). The solution would be for
> library Y to better constrain what it exports of course, but that's not the
> rule in the C++ world unfortunately.

Oh, and "naming" the library differently doesn't help of course, the library SONAME is not part of instantiated objects. The problem here is really the C++ ODR.
GNU C++ offers kind-of a workaround via ABI-tags, but I don't think they are widely used besides in GCC's C++ standard library.

SUSE-RU-2024:2416-1: An update that has one fix can now be installed.

Category: recommended (moderate)
Bug References: 1221183
Maintenance Incident: [SUSE:Maintenance:33210](https://smelt.suse.de/incident/33210/)
Sources used:
SUSE Package Hub 15 15-SP5 (src): llvm15-15.0.7-150500.4.9.6
SUSE Package Hub 15 15-SP6 (src): llvm15-15.0.7-150500.4.9.6
openSUSE Leap 15.5 (src): llvm15-15.0.7-150500.4.9.6
openSUSE Leap 15.6 (src): llvm15-15.0.7-150500.4.9.6
SUSE Linux Enterprise Micro 5.5 (src): llvm15-15.0.7-150500.4.9.6
Basesystem Module 15-SP5 (src): llvm15-15.0.7-150500.4.9.6
Basesystem Module 15-SP6 (src): llvm15-15.0.7-150500.4.9.6
Development Tools Module 15-SP5 (src): llvm15-15.0.7-150500.4.9.6

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Hi,

during a "zypper --releasever=15.6 dup" upgrade from a fully up-to-date 15.5 installation I got the following messages:

Checking for file conflicts: ...................................................................................................................................................................[error]
Detected 4 file conflicts:

File /usr/bin/clang-cpp from install of clang17-17.0.6-150600.1.13.x86_64 (Main Repository) conflicts with file from install of clang13-13.0.1-bp156.7.42.x86_64 (Main Repository)

File /usr/bin/clang-cpp from install of clang17-17.0.6-150600.1.13.x86_64 (Main Repository) conflicts with file from package clang11-11.0.1-150300.3.6.1.x86_64 (@System)

File /usr/bin/clang-cpp from install of clang17-17.0.6-150600.1.13.x86_64 (Main Repository) conflicts with file from package clang15-15.0.7-150500.4.4.1.x86_64 (@System)

File /usr/bin/clang-cpp from install of clang17-17.0.6-150600.1.13.x86_64 (Main Repository) conflicts with file from package clang7-7.0.1-150100.3.22.2.x86_64 (@System)

File conflicts happen when two packages attempt to install files with the same name but different contents. If you continue, conflicting files will be replaced losing the previous content.
Continue? [yes/no] (no): yes

Even though this is mostly harmless, it disturbs the upgrade experience and probably can be easily fixed by backporting the clang-cpp/update-alternatives change from clang17 to the other clang versions that are part of Leap.

regards,
Franz
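
For triaging reports like the one above, here is a minimal Python sketch that extracts the conflicting package pairs from the zypper messages. It assumes the message wording exactly as quoted in this report; the regular expression and the function name are hypothetical and not part of zypper.

```python
import re
from typing import List, Tuple

# Matches one conflict message as quoted in this report. Whitespace between the
# parts is matched loosely, since the real output may wrap across lines.
CONFLICT_RE = re.compile(
    r"File\s+(?P<path>\S+)\s+from install of\s+(?P<new>\S+)\s+\((?P<new_repo>[^)]*)\)\s+"
    r"conflicts with file from (?:install of|package)\s+(?P<old>\S+)\s+\((?P<old_repo>[^)]*)\)"
)


def parse_conflicts(zypper_text: str) -> List[Tuple[str, str, str, str]]:
    """Return (path, incoming package, conflicting package, its repo) per conflict."""
    return [
        (m["path"], m["new"], m["old"], m["old_repo"])
        for m in CONFLICT_RE.finditer(zypper_text)
    ]


# Two of the four messages from this report.
SAMPLE = (
    "File /usr/bin/clang-cpp from install of clang17-17.0.6-150600.1.13.x86_64 "
    "(Main Repository) conflicts with file from install of "
    "clang13-13.0.1-bp156.7.42.x86_64 (Main Repository) "
    "File /usr/bin/clang-cpp from install of clang17-17.0.6-150600.1.13.x86_64 "
    "(Main Repository) conflicts with file from package "
    "clang11-11.0.1-150300.3.6.1.x86_64 (@System)"
)

if __name__ == "__main__":
    for path, new, old, old_repo in parse_conflicts(SAMPLE):
        print(f"{path}: {new} vs {old} [{old_repo}]")
```

In this output, a repository of "@System" marks a conflicting package that is already installed, while a named repository marks one that is being installed in the same transaction.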