Bugzilla – Bug 1212698
gcc-c++-13 LTO introduces non-determinism
Last modified: 2023-10-10 15:18:41 UTC
While working on reproducible builds for openSUSE, I found that our photoqt package varies between -j1 and -j2 builds since at least 2023-04-06 but only if LTO is enabled. The diff looks thus:

--- hexdump -C RPMS.1/usr/bin/photoqt
+++ hexdump -C RPMS.2/usr/bin/photoqt
@@ -54,8 +54,8 @@
 00000350 01 00 00 00 00 00 00 00 01 00 01 c0 04 00 00 00 |................|
 00000360 09 00 00 00 00 00 00 00 02 00 01 c0 04 00 00 00 |................|
 00000370 01 00 00 00 00 00 00 00 04 00 00 00 14 00 00 00 |................|
-00000380 03 00 00 00 47 4e 55 00 66 1b d2 9c f6 1c db 53 |....GNU.f......S|
-00000390 37 de a6 7c 4a b1 00 6e d3 87 9e b3 04 00 00 00 |7..|J..n........|
+00000380 03 00 00 00 47 4e 55 00 1c 48 9e e0 4c cb 55 03 |....GNU..H..L.U.|
+00000390 0f ee 30 57 67 00 ca cb a5 db 33 43 04 00 00 00 |..0Wg.....3C....|
 000003a0 10 00 00 00 01 00 00 00 47 4e 55 00 00 00 00 00 |........GNU.....|
 000003b0 03 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 |................|
 000003c0 07 04 00 00 21 04 00 00 41 03 00 00 00 00 00 00 |....!...A.......|
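For readers following along: the 20 changed bytes in the hexdump above sit right after a note header with name "GNU" and type 3, which is the layout of a .note.gnu.build-id note. Since the build ID is derived from the linked output, a differing ID is most likely a symptom of differences elsewhere in the binary rather than the cause. Below is a minimal Python sketch that decodes such a note, assuming little-endian layout (x86_64). The function name and the two constants are hypothetical; the bytes are copied from the hexdump starting at offset 0x378.

```python
import struct

def parse_gnu_build_id(note: bytes) -> str:
    """Return the hex build ID from the raw bytes of a GNU build-id note."""
    # Note header: name size, descriptor size, note type (little-endian u32 each).
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    name = note[12:12 + namesz]
    if name != b"GNU\x00" or note_type != 3:  # 3 == NT_GNU_BUILD_ID
        raise ValueError("not a GNU build-id note")
    # The descriptor starts after the name, padded to a 4-byte boundary.
    desc_start = 12 + ((namesz + 3) & ~3)
    return note[desc_start:desc_start + descsz].hex()

# Bytes taken from the hexdump above (offsets 0x378..0x39b).
NOTE_J1 = bytes.fromhex(
    "04000000" "14000000" "03000000" "474e5500"
    "661bd29cf61cdb5337dea67c4ab1006ed3879eb3"
)
NOTE_J2 = bytes.fromhex(
    "04000000" "14000000" "03000000" "474e5500"
    "1c489ee04ccb55030fee30576700cacba5db3343"
)

if __name__ == "__main__":
    print("-j1 build ID:", parse_gnu_build_id(NOTE_J1))
    print("-j2 build ID:", parse_gnu_build_id(NOTE_J2))
```

Comparing the two decoded IDs only confirms that the builds differ; locating the real difference still requires diffing the rest of the file.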
Do you still have the two binaries, and can you attach them?
I can't reproduce this. After editing the .spec file and replacing the make line with make -j1 and make -j2, the binaries are identical.
The difference is between osc build -j1 and -j2, that is, a 1-core VM vs. a 2-core VM, which influences the scheduling of forked processes and threads. You might be able to replicate it with taskset 1 vs 3. Here is the full reproducer script:

#!/bin/sh
osc co openSUSE:Factory/photoqt && cd $_
for N in 1 2 ; do
  osc build --vm-type=kvm --noservice -j $N --keep-pkg=RPMS.$N
  unrpm RPMS.$N/photoqt-*.x86_64.rpm
  hexdump -C usr/bin/photoqt > $N.strings
done
diff -u {1,2}.strings

I also noticed a further difference:

 00548030 08 6a 54 00 00 00 00 00 00 00 00 00 00 00 00 00 |.jT.............|
 00548040 70 68 6f 74 6f 71 74 2e 64 65 62 75 67 00 00 00 |photoqt.debug...|
-00548050 76 67 8b 32 00 2e 73 68 73 74 72 74 61 62 00 2e |vg.2..shstrtab..|
+00548050 cf 99 c4 a7 00 2e 73 68 73 74 72 74 61 62 00 2e |......shstrtab..|
 00548060 69 6e 74 65 72 70 00 2e 6e 6f 74 65 2e 67 6e 75 |interp..note.gnu|
 00548070 2e 70 72 6f 70 65 72 74 79 00 2e 6e 6f 74 65 2e |.property..note.|
 00548080 67 6e 75 2e 62 75 69 6c 64 2d 69 64 00 2e 6e 6f |gnu.build-id..no|
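A note on this second hunk: the four changed bytes directly follow the NUL-padded string "photoqt.debug", which matches the layout of a .gnu_debuglink section (file name, padding to a 4-byte boundary, then a 32-bit CRC of the separate debug file). If that reading is right, it indicates that the debuginfo file itself differs between the two builds. The sketch below decodes that layout in Python, assuming little-endian byte order and assuming the section starts at offset 0x548040. The function and constant names are hypothetical.

```python
import struct
from typing import Tuple

def parse_debuglink(section: bytes) -> Tuple[str, int]:
    """Split raw .gnu_debuglink contents into (debug file name, CRC32)."""
    nul = section.index(b"\x00")          # end of the file name
    name = section[:nul].decode("ascii")
    crc_offset = (nul + 1 + 3) & ~3       # CRC follows the name, 4-byte aligned
    (crc,) = struct.unpack_from("<I", section, crc_offset)
    return name, crc

# Bytes copied from the hexdump above, starting at offset 0x548040.
LINK_J1 = bytes.fromhex("70686f746f71742e6465627567000000" "76678b32")
LINK_J2 = bytes.fromhex("70686f746f71742e6465627567000000" "cf99c4a7")

if __name__ == "__main__":
    for label, raw in (("-j1", LINK_J1), ("-j2", LINK_J2)):
        name, crc = parse_debuglink(raw)
        print(f"{label}: {name} crc=0x{crc:08x}")
```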
Interestingly, with the recent update to photoqt-3.3, I can no longer reproduce this issue.
Based on comment#4, I am closing this bug report. Please feel free to reopen it whenever necessary. Thanks.
I have now found a similar case with python-mpi4py, which produces variations in debuginfo unless both builds are performed in a 1-core VM or LTO is disabled. The call trace looks thus:

/home/abuild/rpmbuild/BUILD/mpi4py-3.1.4/build/lib.linux-x86_64-cpython-39/mpi4py/MPI.cpython-39-x86_64-linux-gnu.so written per open by

pid=1951 dir=/home/abuild/rpmbuild/BUILD/mpi4py-3.1.4 exec="/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/bin/ld", ["/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/bin/ld", "-plugin", "/usr/lib64/gcc/x86_64-suse-linux/13/liblto_plugin.so", "-plugin-opt=/usr/lib64/gcc/x86_64-suse-linux/13/lto-wrapper", "-plugin-opt=-fresolution=/tmp/cce7PdEe.res", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-plugin-opt=-pass-through=-lc", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "--build-id", "--eh-frame-hdr", "-m", "elf_x86_64", "-shared", "-o", "build/lib.linux-x86_64-cpython-39/mpi4py/MPI.cpython-39-x86_64-linux-gnu.so", "/usr/lib64/gcc/x86_64-suse-linux/13/../../../../lib64/crti.o", "/usr/lib64/gcc/x86_64-suse-linux/13/crtbeginS.o", "-Lbuild/temp.linux-x86_64-cpython-39", "-L/usr/lib64/mpi/gcc/openmpi4/lib64", "-L/usr/lib64/gcc/x86_64-suse-linux/13", "-L/usr/lib64/gcc/x86_64-suse-linux/13/../../../../lib64", "-L/lib/../lib64", "-L/usr/lib/../lib64", "-L/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/lib", "-L/usr/lib64/gcc/x86_64-suse-linux/13/../../..", "build/temp.linux-x86_64-cpython-39/src/MPI.o", "-ldl", "-lmpi", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "-lc", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "/usr/lib64/gcc/x86_64-suse-linux/13/crtendS.o", "/usr/lib64/gcc/x86_64-suse-linux/13/../../../../lib64/crtn.o"]

- started by pid=1950 dir=/home/abuild/rpmbuild/BUILD/mpi4py-3.1.4 exec="/usr/lib64/gcc/x86_64-suse-linux/13/collect2", ["/usr/lib64/gcc/x86_64-suse-linux/13/collect2", "-plugin", "/usr/lib64/gcc/x86_64-suse-linux/13/liblto_plugin.so", "-plugin-opt=/usr/lib64/gcc/x86_64-suse-linux/13/lto-wrapper", "-plugin-opt=-fresolution=/tmp/cce7PdEe.res", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-plugin-opt=-pass-through=-lc", "-plugin-opt=-pass-through=-lgcc", "-plugin-opt=-pass-through=-lgcc_s", "-flto=auto", "--build-id", "--eh-frame-hdr", "-m", "elf_x86_64", "-shared", "-o", "build/lib.linux-x86_64-cpython-39/mpi4py/MPI.cpython-39-x86_64-linux-gnu.so", "/usr/lib64/gcc/x86_64-suse-linux/13/../../../../lib64/crti.o", "/usr/lib64/gcc/x86_64-suse-linux/13/crtbeginS.o", "-Lbuild/temp.linux-x86_64-cpython-39", "-L/usr/lib64/mpi/gcc/openmpi4/lib64", "-L/usr/lib64/gcc/x86_64-suse-linux/13", "-L/usr/lib64/gcc/x86_64-suse-linux/13/../../../../lib64", "-L/lib/../lib64", "-L/usr/lib/../lib64", "-L/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/lib", "-L/usr/lib64/gcc/x86_64-suse-linux/13/../../..", "build/temp.linux-x86_64-cpython-39/src/MPI.o", "-ldl", "-lmpi", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "-lc", "-lgcc", "--push-state", "--as-needed", "-lgcc_s", "--pop-state", "/usr/lib64/gcc/x86_64-suse-linux/13/crtendS.o", "/usr/lib64/gcc/x86_64-suse-linux/13/../../../../lib64/crtn.o"]

- started by pid=1949 dir=/home/abuild/rpmbuild/BUILD/mpi4py-3.1.4 exec="/usr/bin/gcc", ["/usr/bin/gcc", "-shared", "-O2", "-Wall", "-U_FORTIFY_SOURCE", "-D_FORTIFY_SOURCE=3", "-fstack-protector-strong", "-funwind-tables", "-fasynchronous-unwind-tables", "-fstack-clash-protection", "-Werror=return-type", "-flto=auto", "-fno-strict-aliasing", "build/temp.linux-x86_64-cpython-39/src/MPI.o", "-Lbuild/temp.linux-x86_64-cpython-39", "-ldl", "-o", "build/lib.linux-x86_64-cpython-39/mpi4py/MPI.cpython-39-x86_64-linux-gnu.so", "-I/usr/lib64/mpi/gcc/openmpi4/include", "-L/usr/lib64/mpi/gcc/openmpi4/lib64", "-lmpi"]

- started by pid=1948 dir=/home/abuild/rpmbuild/BUILD/mpi4py-3.1.4 exec="/usr/lib64/mpi/gcc/openmpi4/bin/mpicc", ["/usr/lib64/mpi/gcc/openmpi4/bin/mpicc", "-shared", "-O2", "-Wall", "-U_FORTIFY_SOURCE", "-D_FORTIFY_SOURCE=3", "-fstack-protector-strong", "-funwind-tables", "-fasynchronous-unwind-tables", "-fstack-clash-protection", "-Werror=return-type", "-flto=auto", "-fno-strict-aliasing", "build/temp.linux-x86_64-cpython-39/src/MPI.o", "-Lbuild/temp.linux-x86_64-cpython-39", "-ldl", "-o", "build/lib.linux-x86_64-cpython-39/mpi4py/MPI.cpython-39-x86_64-linux-gnu.so"]

- started by pid=1577 dir=/home/abuild/rpmbuild/BUILD/mpi4py-3.1.4 exec="/usr/bin/python3.9", ["/usr/bin/python3.9", "setup.py", "build", "--executable=/usr/bin/python3.9 -s", "--force"]
If you run into this please attach the differing binaries.
Created attachment 868443: binaries
I wonder if this is caused by dwz, which also does some parallel processing. The differences are all in the line number program. If the issue reproduces reliably, can you try blocking dwz from the build system via #!BuildIgnore: dwz ?
Reproduction is reliable. There seems to be a constant result for every CPU core count, so two builds from a 4-core VM are identical, too. Blocking dwz did not make a difference. I tried to make a smaller reproducer, but had no success.
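For anyone trying to shrink the reproducer: since the result appears to depend only on the number of visible CPU cores, one possible approach is to run a single link step under different CPU affinity masks (the taskset 1 vs 3 idea from earlier in this report) and compare output hashes. The following is only a sketch under assumptions: it is Linux-only, the helper names are hypothetical, and BUILD_CMD / OUTPUT are placeholders that would need to be filled in with a real command and output path. Whether affinity alone triggers the variation outside a VM is untested here.

```python
import hashlib
import os
import subprocess

def mask_to_cpus(mask: int) -> set:
    """Convert a taskset-style bitmask into a set of CPU indices (bit 0 = CPU 0)."""
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}

def run_pinned(cmd, cpus) -> None:
    """Run cmd with the child process restricted to the given CPUs (Linux only)."""
    def pin():
        # pid 0 means "the calling process", i.e. the child just before exec.
        os.sched_setaffinity(0, cpus)
    subprocess.run(cmd, check=True, preexec_fn=pin)

def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Hypothetical usage; BUILD_CMD and OUTPUT are placeholders, not from this report:
#
#   digests = {}
#   for mask in (0x1, 0x3):               # same masks as "taskset 1" and "taskset 3"
#       run_pinned(BUILD_CMD, mask_to_cpus(mask))
#       digests[mask] = sha256_of(OUTPUT)
#   print("identical" if len(set(digests.values())) == 1 else "differs")
```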