Bug 1210176 - LLVM16 breaks Thunderbird 102 build
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Development
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assignee: Wolfgang Rosenauer
QA Contact: E-mail List
Reported: 2023-04-05 17:32 UTC by Wolfgang Rosenauer
Modified: 2024-06-20 12:39 UTC
CC List: 3 users



Description Wolfgang Rosenauer 2023-04-05 17:32:12 UTC
Since LLVM16 landed on Tumbleweed, Thunderbird fails to build.

https://build.opensuse.org/package/live_build_log/mozilla:Factory/MozillaThunderbird/openSUSE_Factory_pure/x86_64

As long as we don't find a solution to make TB compile again (which we might not be able to do without upstream), no TB updates are possible in Tumbleweed.
Comment 1 Aaron Puchert 2023-04-06 01:14:15 UTC
Apparently the error is this:

[gecko-profiler 0.1.0] thread 'main' panicked at '"Vector_(unnamed_enum_at_/home/abuild/rpmbuild/BUILD/obj/dist/include/mozilla/Vector_h_457_3)" is not a valid Ident', /home/abuild/rpmbuild/BUILD/thunderbird-102.9.1/third_party/rust/proc-macro2/src/fallback.rs:701:9
[gecko-profiler 0.1.0] stack backtrace:
[gecko-profiler 0.1.0]    0:     0x55c6f1f32c29 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::ha742bd2bcfd69de2
[gecko-profiler 0.1.0]    1:     0x55c6f1f52b8e - core::fmt::write::hc9810906af1f037c
[gecko-profiler 0.1.0]    2:     0x55c6f1f19865 - std::io::Write::write_fmt::h10b9b1bf7f6d3548
[gecko-profiler 0.1.0]    3:     0x55c6f1f329e5 - std::sys_common::backtrace::print::hd17aca3f348a4a39
[gecko-profiler 0.1.0]    4:     0x55c6f1f2753f - std::panicking::default_hook::{{closure}}::h9d2f484743faf3b2
[gecko-profiler 0.1.0]    5:     0x55c6f1f271fa - std::panicking::default_hook::h74255b2fa3482707
[gecko-profiler 0.1.0]    6:     0x55c6f1f27b5f - std::panicking::rust_panic_with_hook::h9ec0cb84f1e9e6f4
[gecko-profiler 0.1.0]    7:     0x55c6f1f32f99 - std::panicking::begin_panic_handler::{{closure}}::hb5deede75910c516
[gecko-profiler 0.1.0]    8:     0x55c6f1f32d7c - std::sys_common::backtrace::__rust_end_short_backtrace::hc26a381b3924939a
[gecko-profiler 0.1.0]    9:     0x55c6f1f276f2 - rust_begin_unwind
[gecko-profiler 0.1.0]   10:     0x55c6f1d7b7b3 - core::panicking::panic_fmt::ha005c52a737c94d3
[gecko-profiler 0.1.0]   11:     0x55c6f1ef7e4b - proc_macro2::fallback::Ident::_new::h6345eb95de114581
[gecko-profiler 0.1.0]   12:     0x55c6f1efa8b7 - proc_macro2::Ident::new::hff9fc8295beb1946
[gecko-profiler 0.1.0]   13:     0x55c6f1d95494 - bindgen::ir::context::BindgenContext::rust_ident_raw::hce3a4e438bde0ce6
[gecko-profiler 0.1.0]   14:     0x55c6f1dbb078 - <bindgen::ir::enum_ty::Enum as bindgen::codegen::CodeGenerator>::codegen::ha874f32e08551d9a
[gecko-profiler 0.1.0]   15:     0x55c6f1dd6ca2 - <bindgen::ir::item::Item as bindgen::codegen::CodeGenerator>::codegen::h26a7550adeb6ee1b
[gecko-profiler 0.1.0]   16:     0x55c6f1de6b94 - bindgen::codegen::CodegenResult::inner::h0f57b537c4f8c12e
[gecko-profiler 0.1.0]   17:     0x55c6f1e0bac8 - <bindgen::ir::module::Module as bindgen::codegen::CodeGenerator>::codegen::h6031f1f1c07d1f44
[gecko-profiler 0.1.0]   18:     0x55c6f1dd6c7a - <bindgen::ir::item::Item as bindgen::codegen::CodeGenerator>::codegen::h26a7550adeb6ee1b
[gecko-profiler 0.1.0]   19:     0x55c6f1de6b94 - bindgen::codegen::CodegenResult::inner::h0f57b537c4f8c12e
[gecko-profiler 0.1.0]   20:     0x55c6f1e0bac8 - <bindgen::ir::module::Module as bindgen::codegen::CodeGenerator>::codegen::h6031f1f1c07d1f44
[gecko-profiler 0.1.0]   21:     0x55c6f1dd6c7a - <bindgen::ir::item::Item as bindgen::codegen::CodeGenerator>::codegen::h26a7550adeb6ee1b
[gecko-profiler 0.1.0]   22:     0x55c6f1d98ba1 - bindgen::ir::context::BindgenContext::gen::h210cc538db9f7a2d
[gecko-profiler 0.1.0]   23:     0x55c6f1dedc32 - bindgen::Builder::generate::hc212441e514013fb
[gecko-profiler 0.1.0]   24:     0x55c6f1d7e965 - build_script_build::main::hc28434ae2471b69a
[gecko-profiler 0.1.0]   25:     0x55c6f1d7d6e3 - std::sys_common::backtrace::__rust_begin_short_backtrace::hceb9da78b083249d
[gecko-profiler 0.1.0]   26:     0x55c6f1d7f369 - std::rt::lang_start::{{closure}}::hab45c0cb7c53a1df
[gecko-profiler 0.1.0]   27:     0x55c6f1f18af4 - std::rt::lang_start_internal::hc63b3f3a0e5d3e5c
[gecko-profiler 0.1.0]   28:     0x55c6f1d7ee85 - main
[gecko-profiler 0.1.0]   29:     0x7fe98111abb0 - __libc_start_call_main
[gecko-profiler 0.1.0]   30:     0x7fe98111ac79 - __libc_start_main_alias_2
[gecko-profiler 0.1.0]   31:     0x55c6f1d7bc55 - _start
[gecko-profiler 0.1.0]                                at /home/abuild/rpmbuild/BUILD/glibc-2.37/csu/../sysdeps/x86_64/start.S:115
[gecko-profiler 0.1.0]   32:                0x0 - <unknown>
   Compiling darling v0.13.4
error: failed to run custom build command for `gecko-profiler v0.1.0 (/home/abuild/rpmbuild/BUILD/thunderbird-102.9.1/tools/profiler/rust-api)`

So it's in some Rust code. But according to the log we're still using rust1.67, which should be using LLVM 15, not 16. [1]

We're also installing LLVM 16 packages though. Of course through Mesa, but apparently we're using some more for bindgen:

checking for clang for bindgen... /usr/bin/clang++
checking for libclang for bindgen... /usr/lib64/libclang.so
checking that libclang is new enough... yes

Since we're seeing bindgen in the stack above, that's probably the issue. My guess is that libclang spits out identifiers that bindgen can no longer handle, something like "Vector::(unnamed enum at /home/abuild/rpmbuild/BUILD/obj/dist/include/mozilla/Vector.h:457:3)". Some characters have been translated into underscores, but not all ('/' is missing). Which turns it into an invalid identifier. Perhaps libclang would have earlier used just the filename, but now prints the full path.
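To illustrate the guess above, here is a minimal Python sketch. It is not bindgen's actual code: the replacement set in `partial_mangle` is hypothetical and only inferred from the panic message, the input spelling is the guessed one from this comment, and the identifier check is a simplified ASCII-only approximation of what proc-macro2 accepts.

```python
import re

# Simplified, ASCII-only approximation of a Rust identifier.
# (Real Rust identifiers also allow some Unicode; this is an assumption for illustration.)
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_ident(name: str) -> bool:
    """Return True if the whole string looks like a plain identifier."""
    return _IDENT_RE.fullmatch(name) is not None


def partial_mangle(spelling: str) -> str:
    """Hypothetical mangling inferred from the panic message, NOT bindgen's code.

    It replaces '::', spaces, '.' and ':' with '_', but leaves '/', '(' and ')'
    untouched, which is what the string in the panic looks like.
    """
    out = spelling.replace("::", "_")
    for ch in " .:":
        out = out.replace(ch, "_")
    return out


# Guessed spelling that a newer libclang might report for the anonymous enum.
new_spelling = (
    "Vector::(unnamed enum at /home/abuild/rpmbuild/BUILD/obj/dist/"
    "include/mozilla/Vector.h:457:3)"
)

mangled = partial_mangle(new_spelling)
print(mangled)
print(is_valid_ident(mangled))
```

Under these assumptions the mangled string is the one shown in the panic, and it fails the identifier check because the slashes and parentheses survive.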

I think this needs to be fixed in rust-bindgen. A quick search shows a couple of similar issues [2,3], pointing among others to a change in Clang [4]. But none of them is an exact match.

As a temporary workaround, you might try to change

BuildRequires: clang-devel

into

BuildRequires: clang15-devel
BuildRequires: llvm15-libclang13

which should hopefully not make the build unresolvable, but prefer the older version of libclang13.

[1] https://build.opensuse.org/package/view_file/openSUSE:Factory/rust1.67/rust1.67.spec
[2] https://github.com/rust-lang/rust-bindgen/issues/2312
[3] https://github.com/rust-lang/rust-bindgen/issues/2437
[4] https://github.com/llvm/llvm-project/commit/19e984ef8f49bc3ccced15621989fa9703b2cd5b
Comment 2 Wolfgang Rosenauer 2023-04-06 07:05:11 UTC
Building with an older llvm version has been tried, but it still failed since llvm16 is pulled in no matter what and is being used. I think the dependency was that something requires clang-tools and clang-tools requires llvm16. (Or something like that.)

fvogt pointed me to this patch:
https://github.com/rust-lang/rust-bindgen/pull/2319

After I added it to the package the original issue is gone and TB builds for i586. But it fails on x86-64 now with

thread 'main' panicked at 'Not able to resolve vector element?: Continue', third_party/rust/bindgen/src/ir/ty.rs:1170:22
Comment 3 Wolfgang Rosenauer 2023-04-06 07:12:47 UTC
About 
> So it's in some Rust code. But according to the log we're still using 
> rust1.67, which should be using LLVM 15, not 16. [1]

Typically for Thunderbird (because it's somewhat ESR from upstream) we are not using the latest and greatest since that turned out to be very risky and often fails because the 102 codebase as of today is not tested with latest rust/llvm.

So we try to use what is outlined here:
https://firefox-source-docs.mozilla.org/writing-rust-code/update-policy.html
which would require rust 1.60. To not block Tumbleweed with very old versions I increased to 1.67 after I was asked by the release team to try.
So much for the context of why we are selecting certain versions which would work from a rust perspective, BUT apparently when it comes to the combination rust/llvm this is not possible (see previous comment). Is there anything we can do about that?
Comment 4 Manfred Hollstein 2023-04-06 08:39:14 UTC
As Wolfgang pointed out, when clang-tools gets required, we always end up pulling in the whole llvm16 chain. The reason is that clang-tools only exists in the very latest llvm16 package; older llvm15 packages don't build it anymore due to the following in the llvmXY packages:

%define _plv %{!?product_libs_llvm_ver:%{_sonum}}%{?product_libs_llvm_ver}

For Tumbleweed %product_libs_llvm_ver is set to 16 atm.
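
As a rough illustration of how that macro expands, here is a small Python sketch. The function names are made up for this example, and the comparison in `builds_unversioned_subpackages` is only an assumption about how the llvmXY spec uses `_plv`; the spec itself is not quoted in this bug.

```python
from typing import Optional


def expand_plv(sonum: int, product_libs_llvm_ver: Optional[int]) -> str:
    """Emulate: %{!?product_libs_llvm_ver:%{_sonum}}%{?product_libs_llvm_ver}

    None stands for "macro not defined".
    """
    # %{!?macro:value} expands to value only when the macro is NOT defined.
    first = str(sonum) if product_libs_llvm_ver is None else ""
    # %{?macro} expands to the macro's value when defined, else to nothing.
    second = "" if product_libs_llvm_ver is None else str(product_libs_llvm_ver)
    return first + second


def builds_unversioned_subpackages(sonum: int, product_libs_llvm_ver: Optional[int]) -> bool:
    """Assumption: unversioned packages such as clang-tools are only built
    when the package's own version equals the expanded _plv."""
    return expand_plv(sonum, product_libs_llvm_ver) == str(sonum)


# Tumbleweed at the time of this comment: %product_libs_llvm_ver is 16.
print(expand_plv(15, 16), builds_unversioned_subpackages(15, 16))
print(expand_plv(16, 16), builds_unversioned_subpackages(16, 16))
# Without the product macro, each package falls back to its own version.
print(expand_plv(15, None), builds_unversioned_subpackages(15, None))
```

With the product macro set to 16, only the llvm16 package would match in this sketch, which fits the observation that clang-tools comes from llvm16 alone.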

In principle all older llvm packages can be stripped of their -devel packages because they don't work anymore due to the missing clang-tools package with that particular version. It would be great if clang-tools were put into a separate package, ideally even a versioned one. Something like

%package -n clang%{_sonum}-tools
Provides: clang-tools

Then you can use whatever version of llvm/clang is needed for some older packages.
Comment 5 Wolfgang Rosenauer 2023-04-06 13:52:41 UTC
Meanwhile via pulling in different sets of upstream changes to rust bindgen it eventually builds again.
I still would like to discuss the underlying issue further in this bug as it's a major pain and can happen almost anytime again.
Comment 6 Aaron Puchert 2023-04-06 20:16:36 UTC
(In reply to Wolfgang Rosenauer from comment #2)
> Trying to build using an older llvm version is something which has been done
> but still failed since llvm16 is pulled in no matter and is being used. I
> think it was a dependency that something requires clang-tools and
> clang-tools requires llvm16.

That's true, but:
* It shouldn't draw in llvm16-devel or clang16-devel, and it seems like libclang is the relevant package here. (Unless they are parsing the output of clang, which sounds adventurous.)
* You can still use the versioned executables, i.e. clang-15 instead of clang, either by setting CXX or by patching the build.

Please note that you need to require llvm15-libclang13, otherwise it will use libclang13 = 16.0.0, which will behave like Clang 16. Just requiring llvm15-devel is not enough.

> fvogt pointed me to this patch:
> https://github.com/rust-lang/rust-bindgen/pull/2319

That looks about right, but maybe it doesn't apply cleanly in a logical sense.

(In reply to Wolfgang Rosenauer from comment #3)
> So much about the context why we are selecting certain versions which
> would work from rust perspective BUT apparently when it comes to the
> combination rust/llvm this is not possible (see previous comment). Is there
> anything we can do about that?

As far as I know, the latest Rust versions have always been pinned to the LLVM version that Rust upstream uses for their builds. Also, I don't think Rust/LLVM is the problem here, it doesn't look like a miscompilation to me. It's that libclang spits out identifiers for anonymous types that previously were empty.

(In reply to Manfred Hollstein from comment #4)
> In principle all older llvm packages can be stripped off their -devel
> packages because they don't work anymore due to the missing clang-tools
> package with that particular version.

How do you come to that conclusion? There are multiple packages building against older {clang,llvm}X-devel versions. The binaries in clang-tools are typically not used for builds, and should be able to exist in a newer version. They're not tied to LLVM IR but pure frontend tools.

If these packages didn't work, there would be little reason for us to keep maintaining older LLVM versions at all. After all, few people are interested in using an older compiler. The main use is other packages that still use the older API and haven't been migrated yet.

> It would be great if clang-tools would be put into a separate package,
> ideally even as a versioned one. Something like
> 
> %package -n clang%{_sonum}-tools
> Provides: clang-tools

The package is deliberately unversioned, because multiple versions of it cannot be installed in parallel, and being able to install an older version should not be relevant for builds. It also shouldn't have anything to do with our situation. It draws in clang16, but it doesn't block clang15-devel, and if for some reason you really need to use clang15 for compilation, for example because you're going to emit LLVM IR, you can simply use the versioned executables.
Comment 7 William Brown 2023-04-11 03:00:54 UTC
> So it's in some Rust code. But according to the log we're still using rust1.67, which should be using LLVM 15, not 16. [1]

Rust already pins an llvm version internally.

%global llvm_version 15

BuildRequires:  llvm%{llvm_version}-devel

So I think in this case, the TB spec should adjust clang to clang15-devel.

I'm not sure if there is a way to "propagate" this though, so that when we update rust it also pairs the same clang/llvm - something like a rust-llvm package that requires the correct llvm version.
Comment 8 Aaron Puchert 2023-04-11 21:40:56 UTC
(In reply to William Brown from comment #7)
> So I think in this case, the tb spec should adjust clang to clang15-devel.
> 
> I'm not sure if there is a way to "propagate" this though so that when we
> update rust it also pairs the same clang/llvm - something like a rust-llvm
> package that requires the correct llvm version.

The LLVM used by Rust and the libclang used by rust-bindgen are not really connected: one is used as backend for the Rust compiler, the other as a library in rust-bindgen.

It could be that Rust upstream is only testing with the same Clang version that they use for LLVM, but that doesn't help us here at runtime: the libclang.so from different major LLVM versions can't coexist, unlike libLLVM.so. That's because the former has an ABI stability guarantee across major versions and doesn't increase the SO version anymore, while the latter has no such guarantee and gets a new SO version with every major release. So you can't pin to a certain version of libclang.so at runtime.
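
A small Python sketch of why the SO version matters for parallel installation. The libLLVM file names are illustrative placeholders, not taken from the actual packages; libclang.so.13 is the SO name that appears in the dependency quoted further down.

```python
from collections import defaultdict


def conflicting_sonames(libs):
    """Return SO names that more than one LLVM major version wants to ship.

    libs is an iterable of (llvm_major, soname) pairs. Two packages that
    install the same SO name cannot be installed side by side.
    """
    owners = defaultdict(set)
    for major, soname in libs:
        owners[soname].add(major)
    return {name: sorted(majors) for name, majors in owners.items() if len(majors) > 1}


# Illustrative names: libLLVM bumps its SO version with every major release,
# libclang has kept the same one since it got ABI stability.
libs = [
    (15, "libLLVM.so.15"),
    (16, "libLLVM.so.16"),
    (15, "libclang.so.13"),
    (16, "libclang.so.13"),
]

print(conflicting_sonames(libs))
```

In this sketch only libclang.so.13 is claimed by both versions, so only one of them can be present at a time.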

As a workaround for building Thunderbird however, it should be fine, as long as no one needs the newer libclang.so.

(In reply to Aaron Puchert from comment #6)
> Please note that you need to require llvm15-libclang13, otherwise it will
> use libclang13 = 16.0.0, which will behave like Clang 16. Just requiring
> llvm15-devel is not enough.

That's currently not possible, as clang15-devel requires clang15, which requires clang-tools currently coming from llvm16, which requires clang16, which requires 'libclang.so.13(LLVM_16)(64bit)' via c-index-test, which is only satisfied by libclang13 from LLVM 16.

The weakest link is probably clang-tools -> clang16. The scripts are largely version-agnostic and should be able to deal with older compiler versions. I'll see if I can cut that.
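
The chain above can be sketched as a tiny dependency graph in Python. This is a simplification for illustration: the real resolver works on capabilities and provides, not on package names, and the node label "libclang13 (LLVM 16)" is just a stand-in for the 'libclang.so.13(LLVM_16)(64bit)' requirement.

```python
def closure(requires, start):
    """Return every package reachable from start via hard requirements."""
    seen = set()
    stack = [start]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(requires.get(pkg, ()))
    return seen


# Edges as described in this comment (simplified to package names).
requires = {
    "clang15-devel": ["clang15"],
    "clang15": ["clang-tools"],
    "clang-tools": ["clang16"],
    "clang16": ["libclang13 (LLVM 16)"],
}

before = closure(requires, "clang15-devel")
print("libclang13 (LLVM 16)" in before)

# Cutting the clang-tools -> clang16 edge, the "weakest link" mentioned above.
requires_cut = dict(requires)
requires_cut["clang-tools"] = []

after = closure(requires_cut, "clang15-devel")
print("libclang13 (LLVM 16)" in after)
```

Once that single edge is gone, clang15-devel no longer forces the LLVM 16 libclang in this model, which would leave room for llvm15-libclang13.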
Comment 9 OBSbugzilla Bot 2023-04-22 12:35:05 UTC
This is an autogenerated message for OBS integration:
This bug (1210176) was mentioned in
https://build.opensuse.org/request/show/1082181 Factory / llvm16
Comment 10 Aaron Puchert 2023-04-22 14:55:38 UTC
(In reply to OBSbugzilla Bot from comment #9)
> This is an autogenerated message for OBS integration:
> This bug (1210176) was mentioned in
> https://build.opensuse.org/request/show/1082181 Factory / llvm16

With that it should be possible to build with clang15-devel + llvm15-libclang13.
Comment 13 OBSbugzilla Bot 2024-03-12 09:55:42 UTC
This is an autogenerated message for OBS integration:
This bug (1210176) was mentioned in
https://build.opensuse.org/request/show/1157115 Backports:SLE-15-SP5 / llvm17