|
Bugzilla – Full Text Bug Listing |
| Summary: | Upgrade lapack to 3.12.0 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Atri Bhattacharya <badshah400> |
| Component: | Other | Assignee: | Atri Bhattacharya <badshah400> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | eich, marcela.maslanova, mjambor, rguenther, stefan.bruens |
| Version: | Current | Flags: | eich: needinfo? (rguenther) |
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| See Also: | https://bugzilla.opensuse.org/show_bug.cgi?id=1225793 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Bug Depends on: | 1207563 | ||
| Bug Blocks: | |||
|
Description
Atri Bhattacharya
2024-05-02 16:58:19 UTC
Egbert, Stefan, I am incubating the update here: https://build.opensuse.org/package/show/home:badshah400:lapack2023/lapack

The recent versions use cmake, making life considerably easier, and I have started (nearly) from scratch. I shall restore some of the previous features 1:1 (like update-alternatives, or perhaps we should shift to libalternatives: discuss here) but there will also be differences, enforced by the different build system, simplicity of specfile, etc.

Would be grateful if you let me know how it is looking. I can grant you (and others who you feel may be interested) user roles to the project if you are willing to join.

Getting the updated package to near-feature-parity vis-à-vis the previous version would complete Step 1, in my opinion.

Step 2 would be to collect all pkgs dependent on lapack from :Factory into this project and see what fails. Fix them.

Step 3 would be to send sr for the lapack update first and then, if necessary, for each fixed package from Step 2.

Step 4 (optional): Package blaspp [1] and lapackpp [2] and enable these features when configuring lapack build.

[1] https://github.com/icl-utk-edu/blaspp
[2] https://github.com/icl-utk-edu/lapackpp

I had briefly played with this idea when I looked at Lapack last, but I find cmake hard to debug, so I didn't pursue it. I'm fine if you do the conversion and make sure that it works ;)

Regarding update alternatives:
They've given me a lot of grief. OBS staging is extremely picky about links that exist in one alternative but not the other. Unfortunately, you cannot run these tests yourself as there is no 'self-service' staging: you need to submit to Factory first and then let the staging managers check what breaks.

The only reason I went into lapack was to fix issues with update alternatives with respect to the openblas implementation.
Presently, things are working and I hope I don't have to touch these pieces ever again (OBS staging is extremely picky about links that exist in one package but not the other).

Regarding switching to 'libalternatives':
a. AFAICT, libalternatives uses a binary wrapper (`alts`), thus it is for binaries only, not for libraries (@Adam?)
b. In case my assessment in a. was wrong:
   If we do this change, we'd have to change OpenBLAS as well. Since libalternatives is not in SLE-15, we would have to introduce it there if we were to update OpenBLAS. This would in turn require an update to lapack - and possibly a code stream-`fork` (as we may not be able to introduce all required updates to all supported SLE service packs).
   Things would become much simpler if we kept the update-alternatives in place for the time being and postpone this change to when we know that no new version of OpenBLAS will be required for SLE-15 any more.

Even if we keep update-alternatives for the time being, for SLE (and Leap) we need to make sure that the way they are set up matches the present setting (names and location of links, which links we ship). This can be ensured by leaving update-alternatives in OpenBLAS untouched during this process and make sure we pass Factory staging.

Many thanks for the feedback.

(In reply to Egbert Eich from comment #2)
> I had briefly played with this idea when I looked at Lapack last, but I find
> cmake hard to debug, so I didn't pursue it. I'm fine if you do the
> conversion and make sure that it works ;)

Yes, the transition to cmake was one reason why I wanted to play it extra safe and test that all packages dependent on lapack still build fine (and, if they have %check sections, that these run fine too). And it looks ok so far (big batch of python-* packages to check, but so far no issues building any of the packages that were building against old lapack in science, in home:badshah400:lapack2023).
I have also been testing out the update (installing, updating, removing, running tests) locally over the weekend.

> Regarding update alternatives:
> They've given me a lot of grief. OBS staging is extremely picky about links
> that exist in one alternative but not the other. Unfortunately, you cannot
> run these tests yourself as there is no 'self-service' staging: you need to
> submit to Factory first and then let the staging managers check what breaks.
>
> The only reason I went into lapack was to fix issues with update
> alternatives with respect to the openblas implementation. Presently, things
> are working and I hope I don't have to touch these pieces ever again (OBS
> staging is extremely picky about links that exist in one package but not the
> other).
>

Yes, I think your update-alternatives fixes were perfect and they are preserved in the updated package without any change (except for using _%{_arch} and dropping the user-defined %{a_x} macro but that can be easily switched back).

The thing that I have worked on as part of this is to try and understand the baselibs.conf magic and I think I now have the baselibs.conf file in a state that produces working -32bit builds for x86_64 (have no way to test any other archs myself). I basically replicate your update-alternatives scriptlets in the baselibs.conf file with appropriate changes to the syntax. Tests installing, removing, and upgrading packages involving -32bit dependencies are in progress locally on my machine (so far look good). We will have to similarly fix the baselibs.conf for OpenBLAS too.

> Regarding switching to 'libalternatives':
> a. AFAICT, libalternatives uses a binary wrapper (`alts`), thus it is for
> binaries only, not for libraries (@Adam?)
> b. In case my assessment in a. was wrong:
> If we do this change, we'd have to change OpenBLAS as well. Since
> libalternatives is not in SLE-15, we would have to introduce it there if we
> were to update OpenBLAS.
> This would in turn require an update to lapack - and
> possibly a code stream-`fork` (as we may not be able to introduce all required
> updates to all supported SLE service packs).
> Things would become much simpler if we kept the update-alternatives in place
> for the time being and postpone this change to when we know that no new version
> of OpenBLAS will be required for SLE-15 any more.
>
> Even if we keep update-alternatives for the time being, for SLE (and Leap)
> we need to make sure that the way they are set up matches the present
> setting (names and location of links, which links we ship). This can be
> ensured by leaving update-alternatives in OpenBLAS untouched during this
> process and make sure we pass Factory staging.

Yes, I agree using libalternatives is not an option for lapack at this point (or maybe at all).

Note one important step will be to make sure (at least for the shared libraries) that no symbols are removed. Ideally that should be the case for the static libraries as well - we've had multiple issues here in the past with lapack deprecating APIs and not building them by default when not specially asked.

That might be OK for openSUSE but for old SLES codestreams we have to be careful. I'm not sure for SLE15 -> SLE16 compatibility but breaking the shared library ABI would be bad.

Note that there have also been changes to how the multiflavor setup is organised: we now have a main flavour that builds the shared lib and devel packages and a static flavour that builds the static libs and man pages. This is necessitated by the requirement that the cmake build only allows either shared or static libs to be built but not both at the same time. And anyway, it also reads simpler.

(In reply to Richard Biener from comment #4)
> Note one important step will be to make sure (at least for the shared
> libraries) that no symbols are removed.
> Ideally that should be the case for
> the static libraries as well - we've had multiple issues here in the past
> with lapack deprecating APIs and not building them by default when not specially asked.

I thought exactly the same, so we build with `cmake... -DBUILD_DEPRECATED=ON` to build all deprecated symbols into the shared libs, but I will go over the cmake files to see if there are any additional flags to be enabled for something this does not already take care of. Thanks for the feedback.

(In reply to Richard Biener from comment #4)
> Note one important step will be to make sure (at least for the shared
> libraries) that no symbols are removed. Ideally that should be the case for
> the static libraries as well - we've had multiple issues here in the past
> with lapack
> deprecating APIs and not building them by default when not specially asked.
>
> That might be OK for openSUSE but for old SLES codestreams we have to be
> careful. I'm not sure for SLE15 -> SLE16 compatibility but breaking the
> shared library ABI would be bad.

Could you suggest what SLE versions I should test the update against? I see that OBS allows me to add SLE-11-SP4, SLE-12-SP5, and SLE-15-SP1 through SP6 as build repositories to the project.

(In reply to Atri Bhattacharya from comment #7)
> (In reply to Richard Biener from comment #4)
> > Note one important step will be to make sure (at least for the shared
> > libraries) that no symbols are removed. Ideally that should be the case for
> > the static libraries as well - we've had multiple issues here in the past
> > with lapack
> > deprecating APIs and not building them by default when not specially asked.
> >
> > That might be OK for openSUSE but for old SLES codestreams we have to be
> > careful. I'm not sure for SLE15 -> SLE16 compatibility but breaking the
> > shared library ABI would be bad.
>
> Could you suggest what SLE versions I should test the update against?
> I see
> that OBS allows me to add SLE-11-SP4, SLE-12-SP5, and SLE-15-SP1 through SP6
> as build repositories to the project.

I think the best is to check SLE15 SP6 or simply what's in Leap 15.5 or 15.6 (I suppose we don't have any newer version in Backports). Note that on SLE15 lapack is built against the SLE-15:Update tree (thus against GA), but this shouldn't make a difference with respect to the ABI.

@Atri:
Indeed, Leap 15.5 / 15.6 have the same Lapack package as SLE. Moreover, at present, there is only one code stream of Lapack on Leap/SLE 15, this means that the version of Lapack is the same across all supported Leap/SLE versions.

As for baselibs.conf, you should be able to keep the existing one in Lapack as it appears to be sufficiently generic.

Thanks for doing this work!

(In reply to Richard Biener from comment #8)
>
> I think the best is to check SLE15 SP6 or simply what's in Leap 15.5 or 15.6
> (I suppose we don't have any newer version in Backports). Note that on SLE15
> lapack is built against the SLE-15:Update tree (thus against GA), but this
> shouldn't make a difference with respect to the ABI.

OK, good to know. I will be testing SLE:15-SP6 packages later as part of a sub-project.

(In reply to Egbert Eich from comment #9)
> @Atri:
> Indeed, Leap 15.5 / 15.6 have the same Lapack package as SLE. Moreover, at
> present, there is only one code stream of Lapack on Leap/SLE 15, this means
> that the version of Lapack is the same across all supported Leap/SLE
> versions.
>

OK, that helps reduce the testing needed, thanks.

> As for baselibs.conf, you should be able to keep the existing one in Lapack
> as it appears to be sufficiently generic.
>

Actually, the baselibs.conf was not correct and led to 0-byte /usr/lib/libFOO.so.X files in the -32bit shared lib packages. This is discussed in more detail in bug 1207563. Basically this rendered the -32bit lib packages unusable.
This is also fixed in my home branch by ensuring update-alternatives installs the right links (suffixed with _32bit to not conflict with non-biarch shared libs). Hope that clarifies the baselibs situation.

Here are my observations, a status report of sorts:

Factory packages
================

Number of packages: 465
Status: Ready
OBS project: <https://build.opensuse.org/project/show/home:badshah400:lapack2023>

Build Failures
--------------

None.

Test-suite failures
-------------------

Two minor issues were identified from amongst ~450 packages that caused build failures with their test suites (i.e. %check sections).

* python-numba: new test failure due to tolerance issue, reported upstream: <https://github.com/numba/numba/issues/9560>
* python-sherpa: test failure due to parallel test runs, probably unrelated to lapack update, reported upstream <https://github.com/sherpa/sherpa/issues/2031>; sr#1172907, which works around the issue by using pytest to run tests serially, has been submitted to the devel project

And that's it!

Leap:15.6
=========

Number of packages: 307
Status: WIP
OBS project: <https://build.opensuse.org/project/show/home:badshah400:lapack2023:Leap15>

Some packages break due to changes in LAPACK API between version 3.9.0 and 3.12.0. A few packages (~6) have issues with their tests.
Build Failures
--------------

* opencv3: Fails due to function argument mismatch against updated LAPACK; needs to be updated to at least version 3.4.17 (currently 3.4.16): <https://github.com/opencv/opencv/commit/54c180092d2ca02e0460eac7176cab23890fc11e>
* python-scipy: Fails due to function argument mismatch against updated LAPACK; needs to be updated to at least version 1.4.0 (currently 1.3.3): <https://github.com/scipy/scipy/commit/ea94ea041f79550ec19c379ff65811373ddb5f88>

Test-suite failures
-------------------

Reasons under investigation 🕵️

* lalburst
* lalsimulation
* o2scl: tolerance issue in a single test, seems minor
* python-nilearn
* python-scikit-learn: under investigation
* python-traitsui

I am also keen on turning on the generation of x86_64-v3 tuned libs using %suse_build_hwcaps_libs, but that effort currently fails: bug 1223967. Any help welcome: <https://build.opensuse.org/package/show/home:badshah400:lapackv3/lapack>

Correct, libalternatives is meant as a wrapper around executables (it's a kind of exec() wrapper, and /usr/bin/alts is just a convenience binary calling this wrapper). It's not a replacement for the symlink mechanism that update-alternatives uses.

TL;DR I think the update is ready for inclusion into oS:Factory at this point.

In addition to the main update itself, stuff we have been able to fix includes:

* boo#1207563, including significant (though not exhaustive) testing to ensure the updated baselibs.conf works in different situations.
* Switch to cmake+ninja for builds.

I shall send out the sr to lapack devel project (science) tonight.
Stuff we would like to work on, but should perhaps decouple from the scope of this update (and this bug):

* boo#1223967: update-alternatives and x86-64-v3 hwcaps generation do not mix well and our current work-arounds are not yet in a ready-for-submission state (https://build.opensuse.org/package/show/home:badshah400:lapackv3/lapack)
* Fedora splits out separate 64-bit integer libraries for 64-bit archs in addition to 32-bit integer libs, to allow either or both 32-bit and 64-bit libraries to be used. This is configured at build time (`-DBUILD_INDEX64=ON`). Perhaps we should look into this as well.
* Perhaps difficult to work out given the structure of blas, cblas, lapack, and lapacke libraries, but it would be useful to look at installing the man files as part of the appropriate devel package instead of a separate -man package like we have now. The current status is the cleanest from the packaging point of view, but not the most end-user/dev friendly (lapack-man needs to be installed manually).

@Atri, sorry for not looking into this earlier, I've been side tracked by a huge security backport.
I'll have a deeper look tomorrow.

(In reply to Egbert Eich from comment #16)
> @Atri, sorry for not looking into this earlier, I've been side tracked by a
> huge security backport.
> I'll have a deeper look tomorrow.

No worries, take your time. I am just grateful for all the helpful suggestions, pointers, etc. you have already sent my way.

@Atri, thanks for doing all this work!

(In reply to Atri Bhattacharya from comment #14)
>
> Perhaps difficult to work out given the structure of blas, cblas,
> lapack, and lapacke libraries, but it would be useful to look at
> installing the man files as part of the appropriate devel package
> instead of a separate -man package like we have now. The current status
> is the cleanest from the packaging point of view, but not the most
> end-user/dev friendly (lapack-man needs to be installed manually).
Wouldn't it work to make the man page package a dependency (ie Recommends:) of the -devel packages? This way, one can avoid installation if not required but it will be installed by default.

Would you mind setting the package in your home to 'publish' so I'm actually able to install it?

(In reply to Egbert Eich from comment #18)
> @Atri, thanks for doing all this work!

No problem, happy to lend a hand.

>
> (In reply to Atri Bhattacharya from comment #14)
> >
> > Perhaps difficult to work out given the structure of blas, cblas,
> > lapack, and lapacke libraries, but it would be useful to look at
> > installing the man files as part of the appropriate devel package
> > instead of a separate -man package like we have now. The current status
> > is the cleanest from the packaging point of view, but not the most
> > end-user/dev friendly (lapack-man needs to be installed manually).
>
> Wouldn't it work to make the man page package a dependency (ie Recommends:)
> of the -devel packages? This way, one can avoid installation if not required
> but it will be installed by default.

I think this would be better than the current status quo, but it would still entail recommending a bunch of lapack/cblas/lapacke man files for someone only interested in pure blas, for example. We could do this for now, as a stop-gap, until we can figure out a proper split of the man files sometime in the future.

>
> Would you mind setting the package in your home to 'publish' so I'm actually
> able to install it?

Done, sorry for overlooking this earlier when I set up the repo.

(In reply to Atri Bhattacharya from comment #19)
> >
> > Would you mind setting the package in your home to 'publish' so I'm actually
> > able to install it?
>
> Done, sorry for overlooking this earlier when I set up the repo.

Thanks! I will test later today.

BTW: I've created https://github.com/openSUSE/obs-build/issues/1010 as I believe it is still worthwhile to pursue.

I've noticed you're doing:
%{_sysconfdir}/alternatives/libcblas.so.%{so_ver}_%{_arch}
we used to do:
%{_sysconfdir}/alternatives/libcblas.so.3%{?a_x}
defining %a_x as:
%if 0%{?suse_version} > 1500
%define a_x _%{_arch}
%endif
which was introduced with:
https://build.opensuse.org/package/rdiff/science/lapack?linkrev=base&rev=35
I don't remember why we made this conditional to disable it for SLE/Leap.
These name extensions appear to be harmless and are needed to distinguish build targets.
This may have helped to work around errors during staging for SLE-15 due to some faulty test.
@Richi - do you remember?
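
For readers following the naming discussion, the effect of the conditional %a_x macro can be sketched outside of RPM. The Python below is only an illustration, not part of the spec file: the function name `alternatives_link`, the expansion of %{_sysconfdir} to /etc, and the sample suse_version value 1600 (standing in for any value above 1500) are assumptions.

```python
def alternatives_link(lib, so_ver, arch, suse_version):
    """Return the alternatives link path for a library.

    Mimics the spec logic quoted above: the "_<arch>" suffix (%a_x) is
    only defined when suse_version is greater than 1500, so on
    SLE/Leap 15 the link name stays unsuffixed.
    """
    # %{?a_x} expands to nothing when the macro is undefined.
    a_x = "_" + arch if suse_version > 1500 else ""
    # Assumption: %{_sysconfdir} expands to /etc.
    return "/etc/alternatives/{}.so.{}{}".format(lib, so_ver, a_x)


if __name__ == "__main__":
    print(alternatives_link("libcblas", "3", "x86_64", 1500))
    print(alternatives_link("libcblas", "3", "x86_64", 1600))
```

With suse_version 1500 the helper yields the unsuffixed name used on SLE/Leap 15; with any larger value it yields the arch-suffixed name, which is what distinguishes build targets.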
(In reply to Egbert Eich from comment #21)
> I've noticed you're doing:
> %{_sysconfdir}/alternatives/libcblas.so.%{so_ver}_%{_arch}
> we used to do:
> %{_sysconfdir}/alternatives/libcblas.so.3%{?a_x}
> defining %a_x as:
> %if 0%{?suse_version} > 1500
> %define a_x _%{_arch}
> %endif
> which was introduced with:
> https://build.opensuse.org/package/rdiff/science/lapack?linkrev=base&rev=35
>

Yes, I intentionally removed the conditionals and enabled _arch dependent link names to test baselibs generation for Leap. This change is not strictly needed (Leap does not build for the i586 arch anyway) and I can simply restore the previous a_x conditional macro if that is preferred.

Forgot to add that we will need the _arch dependence if we ever produce the x86-64-v3 enhanced libs using baselibs on Leap too.

(In reply to Atri Bhattacharya from comment #23)
> Forgot to add that we will need the _arch dependence if we ever produce the
> x86-64-v3 enhanced libs using baselibs on Leap too.

Right, but I'm not sure if we need this for SLE15.
Maybe, if you could define some macro:

%define somemacro _%{_arch}

and replace '_%{_arch}' with %{?somemacro}, we can add the SLE/Leap15 handling back quickly if we notice issues. I leave the final name for 'somemacro' up to you.

Regarding x86-64-v3, I'm not sure if we want to introduce this for 15, still. Right now, I'm unsure if we will be able to use the baselibs heuristics to build both 32-bit and x86_64-v3 as both will require different paths in the update-alternatives postinstall scriptlets. Currently, I don't see a way to achieve this as we only have limited control over what happens when this "magic" happens. It may be easier to mimic this using multibuild flavors.

We may not want to wait for this, though, and get the package to factory as it looks pretty good already.
(In reply to Egbert Eich from comment #24)
> Maybe, if you could define some macro:
>
> %define somemacro _%{_arch}
> and replace '_%{_arch}' with %{?somemacro} we can add the SLE/Leap15
> handling back quickly if we notice issues. I leave the final name for
> 'somemacro' up to you.

Makes sense. Restored the conditionally defined %{a_x} macro in rev 34 (will supersede the open sr after your tests and a thumbs-up signal).

(In reply to Atri Bhattacharya from comment #25)
> Makes sense. Restored the conditionally defined %{a_x} macro in rev 34 (will
> supersede the open sr after your tests and a thumbs-up signal).

Thanks!
It looks good, I've just done some very minimal tests but from these I believe we are Ok. Please create an SR. Let's see what staging has to say ;)

(In reply to Egbert Eich from comment #26)
> Thanks!
> It looks good, I've just done some very minimal tests but from these I
> believe we are Ok. Please create an SR. Let's see what staging has to say ;)

Thanks for testing. Here it is: https://build.opensuse.org/request/show/1180259

Decouple boo#1223967 from the scope of this update.

Thanks! I've pushed it to Factory:
https://build.opensuse.org/request/show/1180289
I'll watch out for fallout today, I will be unavailable from tomorrow for the rest of the week.
@Atri, could you keep an eye on this while I'm gone, please?

(In reply to Egbert Eich from comment #29)
> Thanks! I've pushed it to Factory:
> https://build.opensuse.org/request/show/1180289
> I'll watch out for fallout today, I will be unavailable from tomorrow for
> the rest of the week.
> @Atri, could you keep an eye on this while I'm gone, please?

I am on it, thanks for letting me know.

Superseded by https://build.opensuse.org/request/show/1180788 to fix man file conflict with libm's isnan (reported by Staging:E installcheck).

Done, lapack 3.12.0 is in Factory now <https://build.opensuse.org/request/show/1180788> |
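
Regarding the requirement from comment #4 that no symbols be removed from the shared libraries, a check along the following lines could be scripted. This is a rough sketch and not what was actually run for this update: it assumes GNU `nm` from binutils is installed, and the helper names (`parse_nm_dynamic`, `removed_symbols`, `nm_dynamic`) as well as the library paths in the usage comment are hypothetical.

```python
import subprocess


def parse_nm_dynamic(nm_output):
    """Collect defined symbol names from `nm -D --defined-only` output."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols are printed as "<address> <type> <name>".
        if len(fields) == 3:
            # Drop any symbol-version suffix such as "@@VERS_1".
            symbols.add(fields[2].split("@")[0])
    return symbols


def removed_symbols(old_nm_output, new_nm_output):
    """Return symbols exported by the old library but missing in the new one."""
    old = parse_nm_dynamic(old_nm_output)
    new = parse_nm_dynamic(new_nm_output)
    return sorted(old - new)


def nm_dynamic(path):
    """Run GNU nm on a shared library and return its text output."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Hypothetical usage, comparing two unpacked builds of the same library:
#   missing = removed_symbols(nm_dynamic("old/liblapack.so.3"),
#                             nm_dynamic("new/liblapack.so.3"))
#   if missing:
#       print("symbols no longer exported:", ", ".join(missing))
```

An empty result only shows that no exported names disappeared; it says nothing about changed function signatures, such as the argument mismatches that broke opencv3 and python-scipy on Leap.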