Bug 1223783

Summary: Upgrade lapack to 3.12.0
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Atri Bhattacharya <badshah400>
Component: Other
Assignee: Atri Bhattacharya <badshah400>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: eich, marcela.maslanova, mjambor, rguenther, stefan.bruens
Version: Current
Flags: eich: needinfo? (rguenther)
Target Milestone: ---
Hardware: Other
OS: Other
See Also: https://bugzilla.opensuse.org/show_bug.cgi?id=1225793
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Bug Depends on: 1207563
Bug Blocks:

Description Atri Bhattacharya 2024-05-02 16:58:19 UTC
Our lapack (at version 3.9.0) is pretty ancient at this point. We should consider bumping it straight to version 3.12.0 (released Nov 2023). Given the insane list of dependencies on lapack and its own rather insane list of patches used by Fedora [1], I thought of opening this tracking bug collecting the fallout that may or may not happen as a result of this upgrade (which I am starting work on, but seems more than a one-man job to be honest, so help welcome). The upcoming GCC14 breakage seems to be a good time to work on it since the latest version seems to fix this automatically.

[1] https://src.fedoraproject.org/rpms/lapack/tree/rawhide
Comment 1 Atri Bhattacharya 2024-05-02 21:18:37 UTC
Egbert, Stefan,
I am incubating the update here:
https://build.opensuse.org/package/show/home:badshah400:lapack2023/lapack

The recent versions use cmake, making life considerably easier, and I have started (nearly) from scratch. I shall restore some of the previous features 1:1 (like update-alternatives, or perhaps we should shift to libalternatives: discuss here) but there will also be differences, enforced by the different build system, simplicity of specfile, etc. I would be grateful if you let me know how it is looking. I can grant you (and others who you feel may be interested) user roles to the project if you are willing to join.

Getting the updated package to near-feature-parity vis-à-vis the previous version would complete Step 1, in my opinion.

Step 2 would be to collect all pkgs dependent on lapack from :Factory into this project and see what fails. Fix them.

Step 3 would be to send sr for the lapack update first and then, if necessary, for each fixed package from Step 2.

Step 4 (optional): Package blaspp [1] and lapackpp [2] and enable these features when configuring lapack build.

[1] https://github.com/icl-utk-edu/blaspp
[2] https://github.com/icl-utk-edu/lapackpp
Comment 2 Egbert Eich 2024-05-06 10:18:21 UTC
I had briefly played with this idea when I looked at Lapack last, but I find cmake hard to debug, so I didn't pursue it. I'm fine if you do the conversion and make sure that it works ;)
Regarding update alternatives:
They've given me a lot of grief. OBS staging is extremely picky about links that exist in one alternative but not the other. Unfortunately, you cannot run these tests yourself as there is no 'self-service' staging: you need to submit to Factory first and then let the staging managers check what breaks.

The only reason I went into lapack was to fix issues with update alternatives with respect to the openblas implementation. Presently, things are working and I hope I don't have to touch these pieces ever again (OBS staging is extremely picky about links that exist in one package but not the other).
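
The "links that exist in one alternative but not the other" problem can be illustrated with a small consistency check. This is only a sketch: it is not the check that OBS staging actually runs, and the provider names and link names in the example data are hypothetical.

```python
def missing_links(providers):
    """For each provider, list links that some other provider ships but it lacks.

    `providers` maps a provider name to the set of link names it registers
    with update-alternatives (names here are illustrative only).
    """
    all_links = set().union(*providers.values())
    report = {}
    for name, links in providers.items():
        missing = sorted(all_links - links)
        if missing:
            report[name] = missing
    return report


# Hypothetical example data: the second provider ships one extra link.
example = {
    "lapack": {"libblas.so.3", "libcblas.so.3", "liblapack.so.3"},
    "openblas": {"libblas.so.3", "libcblas.so.3", "liblapack.so.3",
                 "liblapacke.so.3"},
}
print(missing_links(example))
```

An empty report means every provider registers the same set of links, which is the situation the paragraph above says staging insists on.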

Regarding switching to 'libalternatives':
a. AFAICT, libalternatives uses a binary wrapper (`alts`), thus it is for 
   binaries only, not for libraries (@Adam?)
b. In case my assessment in a. was wrong: 
  If we do this change, we'd have to change OpenBLAS as well. Since  
  libalternatives is not in SLE-15, we would have to introduce it there if we 
  were to update OpenBLAS. This would in turn require an update to lapack - and 
  possibly a code stream-`fork` (as we may not be able to introduce all required
  updates to all supported SLE service packs).
  Things would become much simpler if we kept the update-alternatives in place 
  for the time being and postponed this change until we know that no new version 
  of OpenBLAS will be required for SLE-15 any more.

Even if we keep update-alternatives for the time being, for SLE (and Leap) we need to make sure that the way they are set up matches the present setting (names and location of links, which links we ship). This can be ensured by leaving update-alternatives in OpenBLAS untouched during this process and making sure we pass Factory staging.
Comment 3 Atri Bhattacharya 2024-05-06 10:40:06 UTC
Many thanks for the feedback.

(In reply to Egbert Eich from comment #2)
> I had briefly played with this idea when I looked at Lapack last, but I find
> cmake hard to debug, so I didn't pursue it. I'm fine if you do the
> conversion and make sure that it works ;)

Yes, the transition to cmake was one reason why I wanted to play it extra safe and test that all packages dependent on lapack still build (and if they have %check sections, that they run) fine. And it looks ok so far (big batch of python-* packages to check, but so far no issues building any of the packages that were building against old lapack in science, in home:badshah400:lapack2023).

I have also been testing out the update (installing, updating, removing, running tests) locally over the weekend.

> Regarding update alternatives:
> They've given me a lot of grief. OBS staging is extremely picky about links
> that exist in one alternative but not the other. Unfortunately, you cannot
> run these tests yourself as there is no 'self-service' staging: you need to
> submit to Factory first and then let the staging managers check what breaks.
> 
> The only reason I went into lapack was to fix issues with update
> alternatives with respect to the openblas implementation. Presently, things
> are working and I hope I don't have to touch these pieces ever again (OBS
> staging is extremely picky about links that exist in one package but not the
> other).
> 

Yes, I think your update-alternatives fixes were perfect and they are preserved in the updated package without any change (except for using _%{_arch} and dropping the user-defined %{a_x} macro but that can be easily switched back).

The thing that I have worked on as part of this is to try and understand the baselibs.conf magic and I think I now have the baselibs.conf file in a state that produces working -32bit builds for x86_64 (have no way to test any other archs myself). I basically replicate your update-alternatives scriptlets in the baselibs.conf file with appropriate changes to the syntax.

Tests installing, removing, and upgrading packages involving -32bit dependencies are in progress locally on my machine (so far look good).

We will have to similarly fix the baselibs.conf for OpenBLAS too.

> Regarding switching to 'libalternatives':
> a. AFAICT, libalternatives uses a binary wrapper (`alts`), thus it is for
>    binaries only, not for libraries (@Adam?)
> b. In case my assessment in a. was wrong:
>   If we do this change, we'd have to change OpenBLAS as well. Since
>   libalternatives is not in SLE-15, we would have to introduce it there if
>   we were to update OpenBLAS. This would in turn require an update to
>   lapack - and possibly a code stream-`fork` (as we may not be able to
>   introduce all required updates to all supported SLE service packs).
>   Things would become much simpler if we kept the update-alternatives in
>   place for the time being and postponed this change until we know that no
>   new version of OpenBLAS will be required for SLE-15 any more.
> 
> Even if we keep update-alternatives for the time being, for SLE (and Leap)
> we need to make sure that the way they are set up matches the present
> setting (names and location of links, which links we ship). This can be
> ensured by leaving update-alternatives in OpenBLAS untouched during this
> process and make sure we pass Factory staging.

Yes, I agree that using libalternatives is not an option for lapack at this point (or maybe at all).
Comment 4 Richard Biener 2024-05-06 10:41:32 UTC
Note one important step will be to make sure (at least for the shared libraries) that no symbols are removed.  Ideally that should be the case for the static libraries as well - we've had multiple issues here in the past with lapack
deprecating APIs and not building them by default when not specially asked.

That might be OK for openSUSE but for old SLES codestreams we have to be
careful.  I'm not sure for SLE15 -> SLE16 compatibility but breaking the
shared library ABI would be bad.
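
One way to check that no symbols were removed is to compare the dynamic symbol tables of the old and new shared libraries. The following is a minimal sketch, assuming GNU `nm` is available; the helper names are made up for illustration, and a real ABI check would also need to look at argument changes, which a pure symbol-name comparison cannot see.

```python
import subprocess


def parse_nm(text):
    """Collect symbol names from `nm -D --defined-only` style output."""
    symbols = set()
    for line in text.splitlines():
        fields = line.split()
        # Defined symbols are printed as "<address> <type> <name>".
        if len(fields) == 3:
            # Drop a symbol-version suffix such as "@@VERS_1" if present.
            symbols.add(fields[2].split("@")[0])
    return symbols


def exported_symbols(path):
    """Run nm on a shared library and return its defined dynamic symbols."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True, capture_output=True, text=True,
    )
    return parse_nm(result.stdout)


def removed_symbols(old, new):
    """Symbols present in the old library but absent from the new one."""
    return sorted(old - new)


# Intended use (paths are placeholders, not real package locations):
#   old = exported_symbols("old/liblapack.so.3")
#   new = exported_symbols("new/liblapack.so.3")
#   print(removed_symbols(old, new))
```

An empty result only shows that every previously exported name still exists; it says nothing about whether the routines behind those names kept the same signature.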
Comment 5 Atri Bhattacharya 2024-05-06 10:43:32 UTC
Note that there have also been changes to how the multiflavor setup is organised: we now have a main flavour that builds the shared lib and devel packages and a static flavour that builds the static libs and man pages. This is necessitated by the fact that the cmake build allows either shared or static libs to be built, but not both at the same time. And anyway, it also reads simpler.
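
As a rough illustration of the split, each flavour boils down to a different value of the standard CMake switch `BUILD_SHARED_LIBS`. The sketch below assumes the flavours are simply called "main" and "static"; the actual specfile may name and configure them differently, and it passes more options than shown here.

```python
def cmake_flags(flavor):
    """Return the library-type flags for a build flavour (names assumed)."""
    if flavor not in ("main", "static"):
        raise ValueError(f"unknown flavour: {flavor!r}")
    # One cmake run produces either shared or static libs, never both.
    shared = "ON" if flavor == "main" else "OFF"
    return [f"-DBUILD_SHARED_LIBS={shared}", "-DBUILD_DEPRECATED=ON"]


for flavor in ("main", "static"):
    print(flavor, " ".join(cmake_flags(flavor)))
```

Keeping the deprecated routines enabled in both flavours means the shared and static libraries offer the same set of routines.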
Comment 6 Atri Bhattacharya 2024-05-06 10:46:29 UTC
(In reply to Richard Biener from comment #4)
> Note one important step will be to make sure (at least for the shared
> libraries) that no symbols are removed.  Ideally that should be the case for
> the static libraries as well - we've had multiple issues here in the past
> with lapack deprecating APIs and not building them by default when not specially asked.

I thought exactly the same, so we build with `cmake...  -DBUILD_DEPRECATED=ON` to build all deprecated symbols into the shared libs, but I will go over the cmake files to see if there are any additional flags to be enabled for something this does not already take care of.

Thanks for the feedback.
Comment 7 Atri Bhattacharya 2024-05-06 10:54:17 UTC
(In reply to Richard Biener from comment #4)
> Note one important step will be to make sure (at least for the shared
> libraries) that no symbols are removed.  Ideally that should be the case for
> the static libraries as well - we've had multiple issues here in the past
> with lapack
> deprecating APIs and not building them by default when not specially asked.
> 
> That might be OK for openSUSE but for old SLES codestreams we have to be
> careful.  I'm not sure for SLE15 -> SLE16 compatibility but breaking the
> shared library ABI would be bad.

Could you suggest what SLE versions I should test the update against? I see that OBS allows me to add SLE-11-SP4, SLE-12-SP5, and SLE-15-SP1 through SP6 as build repositories to the project.
Comment 8 Richard Biener 2024-05-06 11:25:19 UTC
(In reply to Atri Bhattacharya from comment #7)
> (In reply to Richard Biener from comment #4)
> > Note one important step will be to make sure (at least for the shared
> > libraries) that no symbols are removed.  Ideally that should be the case for
> > the static libraries as well - we've had multiple issues here in the past
> > with lapack
> > deprecating APIs and not building them by default when not specially asked.
> > 
> > That might be OK for openSUSE but for old SLES codestreams we have to be
> > careful.  I'm not sure for SLE15 -> SLE16 compatibility but breaking the
> > shared library ABI would be bad.
> 
> Could you suggest what SLE versions I should test the update against? I see
> that OBS allows me to add SLE-11-SP4, SLE-12-SP5, and SLE-15-SP1 through SP6
> as build repositories to the project.

I think the best is to check SLE15 SP6 or simply what's in Leap 15.5 or 15.6
(I suppose we don't have any newer version in Backports).  Note that on SLE15
lapack is built against the SLE-15:Update tree (thus against GA), but this
shouldn't make a difference with respect to the ABI.
Comment 9 Egbert Eich 2024-05-06 15:54:50 UTC
@Atri:
Indeed, Leap 15.5 / 15.6 have the same Lapack package as SLE. Moreover, at present, there is only one code stream of Lapack on Leap/SLE 15; this means that the version of Lapack is the same across all supported Leap/SLE versions.

As for baselibs.conf, you should be able to keep the existing one in Lapack as it appears to be sufficiently generic.

Thanks for doing this work!
Comment 10 Atri Bhattacharya 2024-05-06 16:56:27 UTC
(In reply to Richard Biener from comment #8)
> 
> I think the best is to check SLE15 SP6 or simply what's in Leap 15.5 or 15.6
> (I suppose we don't have any newer version in Backports).  Note that on SLE15
> lapack is built against the SLE-15:Update tree (thus against GA), but this
> shouldn't make a difference with respect to the ABI.

OK, good to know. I will be testing SLE:15-SP6 packages later as part of a sub-project.

(In reply to Egbert Eich from comment #9)
> @Atri:
> Indeed, Leap 15.5 / 15.6 have the same Lapack package as SLE. Moreover, at
> present, there is only one code stream of Lapack on Leap/SLE 15, this means
> that the version of Lapack is the same across all supported Leap/SLE
> versions.
> 

OK, that helps reduce the testing needed, thanks.

> As for baselibs.conf, you should be able to keep the existing one in Lapack
> as it appears to be sufficiently generic.
> 

Actually, the baselibs.conf was not correct and led to 0-byte /usr/lib/libFOO.so.X files in the -32bit shared lib packages. This is discussed in more detail in bug 1207563. Basically this rendered the -32bit lib packages unusable. This is also fixed in my home branch by ensuring update-alternatives installs the right links (suffixed with _32bit to not conflict with non-biarch shared libs). Hope that clarifies the baselibs situation.
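
A quick way to spot this kind of breakage after installing or unpacking a -32bit package is to look for zero-byte files that are named like shared libraries. This is a minimal sketch, assuming the package contents are available under some directory; the function name is made up for illustration.

```python
from pathlib import Path


def empty_shared_libs(root):
    """Return zero-byte regular files named lib*.so* found below `root`."""
    hits = []
    for path in Path(root).rglob("lib*.so*"):
        # Symlinks are skipped: only real files can be empty placeholders.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_size == 0:
            hits.append(str(path))
    return sorted(hits)


# Intended use (the directory is a placeholder):
#   print(empty_shared_libs("/usr/lib"))
```

Any path it reports is a library file that exists but has no content, which matches the symptom described above.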
Comment 11 Atri Bhattacharya 2024-05-09 11:27:59 UTC
Here are my observations, a status report of sorts:

Factory packages
================

Number of packages: 465

Status: Ready

OBS project: <https://build.opensuse.org/project/show/home:badshah400:lapack2023>

Build Failures
--------------

None.

Test-suite failures
-------------------

Two minor issues were identified from amongst ~450 packages that caused build failures with their test suites (i.e. %check sections).

* python-numba: new test failure due to tolerance issue, reported upstream: <https://github.com/numba/numba/issues/9560>
* python-sherpa: test failure due to parallel test runs, probably unrelated to lapack update, reported upstream <https://github.com/sherpa/sherpa/issues/2031>; sr#1172907, which works around the issue by using pytest to run tests serially, has been submitted to the devel project

And that's it!

Leap:15.6
=========

Number of packages: 307

Status: WIP

OBS project: <https://build.opensuse.org/project/show/home:badshah400:lapack2023:Leap15>

Some packages break due to changes in LAPACK API between version 3.9.0 and 3.12.0. A few packages (~6) have issues with their tests.

Build Failures
--------------

* opencv3: Fails due to function argument mismatch against updated LAPACK; needs to be updated to at least version 3.4.17 (currently 3.4.16): <https://github.com/opencv/opencv/commit/54c180092d2ca02e0460eac7176cab23890fc11e>
* python-scipy: Fails due to function argument mismatch against updated LAPACK; Needs to be updated to at least version 1.4.0 (currently 1.3.3): <https://github.com/scipy/scipy/commit/ea94ea041f79550ec19c379ff65811373ddb5f88>

Test-suite failures
-------------------

Reasons under investigation 🕵️

* lalburst
* lalsimulation
* o2scl: tolerance issue in a single test, seems minor
* python-nilearn
* python-scikit-learn: under investigation
* python-traitsui
Comment 12 Atri Bhattacharya 2024-05-09 14:33:21 UTC
I am also keen on turning on the generation of x86_64-v3 tuned libs using %suse_build_hwcaps_libs, but that effort currently fails: bug 1223967. Any help welcome: <https://build.opensuse.org/package/show/home:badshah400:lapackv3/lapack>
Comment 13 Adam Majer 2024-05-14 08:51:52 UTC
Correct, libalternatives is meant as a wrapper around executables (it's a kind of exec() wrapper, and /usr/bin/alts is just a convenience binary calling this wrapper). It's not a replacement for the symlink mechanism that update-alternatives uses.
Comment 14 Atri Bhattacharya 2024-06-08 06:38:15 UTC
TL;DR I think the update is ready for inclusion into oS:Factory at this point.
In addition to the main update itself, stuff we have been able to fix
includes:

* boo#1207563, including significant (though not exhaustive) testing
  to ensure the updated baselibs.conf works in different situations.
* Switch to cmake+ninja for builds.

I shall send out the sr to lapack devel project (science) tonight.

Stuff we would like to work on, but should perhaps decouple from the
scope of this update (and this bug):

* boo#1223967: update-alternatives and x86-64-v3 hwcaps generation do
  not mix well and our current work-arounds are not yet in a
  ready-for-submission state
  (https://build.opensuse.org/package/show/home:badshah400:lapackv3/lapack)
* Fedora splits out separate 64-bit integer libraries for 64-bit
  archs in addition to 32-bit integer libs, to allow either or both
  32-bit and 64-bit libraries to be used. This is configured at build
  time (`-DBUILD_INDEX64=ON`). Perhaps we should look into this as well.

Perhaps difficult to work out given the structure of blas, cblas,
lapack, and lapacke libraries, but it would be useful to look at
installing the man files as part of the appropriate devel package
instead of a separate -man package like we have now. The current status
is the cleanest from the packaging point of view, but not the most
end-user/dev friendly (lapack-man needs to be installed manually).
Comment 15 Atri Bhattacharya 2024-06-09 09:30:21 UTC
https://build.opensuse.org/request/show/1179559
Comment 16 Egbert Eich 2024-06-09 11:19:32 UTC
@Atri, sorry for not looking into this earlier, I've been sidetracked by a huge security backport.
I'll have a deeper look tomorrow.
Comment 17 Atri Bhattacharya 2024-06-09 13:44:29 UTC
(In reply to Egbert Eich from comment #16)
> @Atri, sorry for not looking into this earlier, I've been sidetracked by a
> huge security backport.
> I'll have a deeper look tomorrow.

No worries, take your time. I am just grateful for all the helpful suggestions, pointers, etc. you have already sent my way.
Comment 18 Egbert Eich 2024-06-11 11:36:29 UTC
@Atri, thanks for doing all this work!

(In reply to Atri Bhattacharya from comment #14)
> 
> Perhaps difficult to work out given the structure of blas, cblas,
> lapack, and lapacke libraries, but it would be useful to look at
> installing the man files as part of the appropriate devel package
> instead of a separate -man package like we have now. The current status
> is the cleanest from the packaging point of view, but not the most
> end-user/dev friendly (lapack-man needs to be installed manually).

Wouldn't it work to make the man page package a dependency (ie Recommends:) of the -devel packages? This way, one can avoid installation if not required but it will be installed by default.

Would you mind setting the package in your home to 'publish' so I'm actually able to install it?
Comment 19 Atri Bhattacharya 2024-06-11 17:19:26 UTC
(In reply to Egbert Eich from comment #18)
> @Atri, thanks for doing all this work!

No problem, happy to lend a hand.

> 
> (In reply to Atri Bhattacharya from comment #14)
> > 
> > Perhaps difficult to work out given the structure of blas, cblas,
> > lapack, and lapacke libraries, but it would be useful to look at
> > installing the man files as part of the appropriate devel package
> > instead of a separate -man package like we have now. The current status
> > is the cleanest from the packaging point of view, but not the most
> > end-user/dev friendly (lapack-man needs to be installed manually).
> 
> Wouldn't it work to make the man page package a dependency (ie Recommends:)
> of the -devel packages? This way, one can avoid installation if not required
> but it will be installed by default.

I think this would be better than the current status quo, but it would still entail recommending a bunch of lapack/cblas/lapacke man files for someone only interested in pure blas, for example. We could do this for now, as a stop-gap, until we can figure out a proper split of the man files sometime in the future.

> 
> Would you mind setting the package in your home to 'publish' so I'm actually
> able to install it?

Done, sorry for overlooking this earlier when I set up the repo.
Comment 20 Egbert Eich 2024-06-12 05:37:43 UTC
(In reply to Atri Bhattacharya from comment #19)
> > 
> > Would you mind setting the package in your home to 'publish' so I'm actually
> > able to install it?
> 
> Done, sorry for overlooking this earlier when I set up the repo.

Thanks! I will test later today.
BTW: I've created https://github.com/openSUSE/obs-build/issues/1010 as I believe it is still worthwhile to pursue.
Comment 21 Egbert Eich 2024-06-12 08:46:06 UTC
I've noticed you're doing:
%{_sysconfdir}/alternatives/libcblas.so.%{so_ver}_%{_arch}
we used to do:
%{_sysconfdir}/alternatives/libcblas.so.3%{?a_x}
defining %a_x as:
%if 0%{?suse_version} > 1500
%define a_x _%{_arch}
%endif
which was introduced with:
https://build.opensuse.org/package/rdiff/science/lapack?linkrev=base&rev=35

I don't remember why we made this conditional to disable it for SLE/Leap. 
These name extensions appear to be harmless and are needed to distinguish build targets.
This may have helped work around errors during staging for SLE-15 due to some faulty test.
@Richi - do you remember?
Comment 22 Atri Bhattacharya 2024-06-12 09:13:15 UTC
(In reply to Egbert Eich from comment #21)
> I've noticed you're doing:
> %{_sysconfdir}/alternatives/libcblas.so.%{so_ver}_%{_arch}
> we used to do:
> %{_sysconfdir}/alternatives/libcblas.so.3%{?a_x}
> defining %a_x as:
> %if 0%{?suse_version} > 1500
> %define a_x _%{_arch}
> %endif
> which was introduced with:
> https://build.opensuse.org/package/rdiff/science/lapack?linkrev=base&rev=35
> 

Yes, I intentionally removed the conditionals and enabled _arch dependent link names to test baselibs generation for Leap. This change is not strictly needed — Leap does not build for the i586 arch anyway — and I can simply restore the previous a_x conditional macro if that is preferred.
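
The effect of the conditional %a_x macro on the link names can be sketched as follows. This only mirrors the `%if 0%{?suse_version} > 1500` logic quoted above in plain Python; the function name is made up, and it assumes that %{_sysconfdir} expands to /etc and that SLE/Leap 15 report a suse_version of 1500 while Tumbleweed reports a higher value.

```python
def alternatives_link(lib, so_ver, arch, suse_version=None):
    """Build the /etc/alternatives link name as the spec macros would."""
    # An undefined %suse_version evaluates as 0 in "0%{?suse_version}".
    version = suse_version or 0
    # %a_x is only defined (as _%{_arch}) when suse_version > 1500.
    suffix = f"_{arch}" if version > 1500 else ""
    return f"/etc/alternatives/{lib}.so.{so_ver}{suffix}"


print(alternatives_link("libcblas", 3, "x86_64", 1600))
print(alternatives_link("libcblas", 3, "x86_64", 1500))
```

With the conditional in place, only the newer distribution gets arch-suffixed link names, so the names on SLE/Leap 15 stay exactly as they are today.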
Comment 23 Atri Bhattacharya 2024-06-12 09:24:25 UTC
Forgot to add that we will need the _arch dependence if we ever produce the x86-64-v3 enhanced libs using baselibs on Leap too.
Comment 24 Egbert Eich 2024-06-12 10:35:38 UTC
(In reply to Atri Bhattacharya from comment #23)
> Forgot to add that we will need the _arch dependence if we ever produce the
> x86-64-v3 enhanced libs using baselibs on Leap too.

Right, but I'm not sure if we need this for SLE15. 
Maybe, if you could define some macro:

%define somemacro _%{_arch}
and replace '_%{_arch}' with %{?somemacro}, we can add the SLE/Leap15 handling back quickly if we notice issues. I leave the final name for 'somemacro' up to you.

Regarding x86-64-v3, I'm not sure if we want to introduce this for 15, still.

Right now, I'm unsure if we will be able to use the baselibs heuristics to build both 32-bit and x86_64-v3 as both will require different paths in the update-alternatives postinstall scriptlets. Currently, I don't see a way to achieve this as we only have limited control over what happens when this "magic" happens.
It may be easier to mimic this using multibuild flavors.
We may not want to wait for this, though, and get the package to factory as it looks pretty good already.
Comment 25 Atri Bhattacharya 2024-06-12 11:32:56 UTC
(In reply to Egbert Eich from comment #24)
> Maybe, if you could define some macro:
> 
> %define somemacro _%{_arch}
> and replace '_%{_arch}' with %{?somemacro}, we can add the SLE/Leap15
> handling back quickly if we notice issues. I leave the final name for
> 'somemacro' up to you.

Makes sense. Restored the conditionally defined %{a_x} macro in rev 34 (will supersede the open sr after your tests and a thumbs-up signal).
Comment 26 Egbert Eich 2024-06-12 16:08:09 UTC
(In reply to Atri Bhattacharya from comment #25)

> Makes sense. Restored the conditionally defined %{a_x} macro in rev 34 (will
> supersede the open sr after your tests and a thumbs-up signal).

Thanks! 
It looks good, I've just done some very minimal tests but from these I believe we are Ok. Please create an SR. Let's see what staging has to say ;)
Comment 27 Atri Bhattacharya 2024-06-12 18:15:45 UTC
(In reply to Egbert Eich from comment #26)
> Thanks! 
> It looks good, I've just done some very minimal tests but from these I
> believe we are Ok. Please create an SR. Let's see what staging has to say ;)

Thanks for testing. Here it is: https://build.opensuse.org/request/show/1180259
Comment 28 Atri Bhattacharya 2024-06-12 18:17:13 UTC
Decouple boo#1223967 from the scope of this update.
Comment 29 Egbert Eich 2024-06-13 06:03:05 UTC
Thanks! I've pushed it to Factory:
https://build.opensuse.org/request/show/1180289
I'll watch out for fallout today; I will be unavailable from tomorrow for the rest of the week.
@Atri, could you keep an eye on this while I'm gone, please?
Comment 30 Atri Bhattacharya 2024-06-13 06:47:50 UTC
(In reply to Egbert Eich from comment #29)
> Thanks! I've pushed it to Factory:
> https://build.opensuse.org/request/show/1180289
> I'll watch out for fallout today; I will be unavailable from tomorrow for
> the rest of the week.
> @Atri, could you keep an eye on this while I'm gone, please?

I am on it, thanks for letting me know.
Comment 31 Atri Bhattacharya 2024-06-14 03:49:39 UTC
Superseded by https://build.opensuse.org/request/show/1180788 to fix man file conflict with libm's isnan (reported by Staging:E installcheck).
Comment 32 Atri Bhattacharya 2024-06-14 17:50:02 UTC
Done, lapack 3.12.0 is in Factory now <https://build.opensuse.org/request/show/1180788>