Bug 1221902

Summary: (Not) supported CPU features for s390x in numpy
Product: [openSUSE] openSUSE Tumbleweed Reporter: Sarah Kriesch <ada.lovelace>
Component: Python Assignee: Matej Cepl <mcepl>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P2 - High CC: fvogt, ihno, marcela.maslanova
Version: Current   
Target Milestone: ---   
Hardware: S/390-64   
OS: Other   
Whiteboard:
Found By: Community User Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Sarah Kriesch 2024-03-24 11:01:38 UTC
Numpy is a supported Python library on s390x and is required by several packages. However, since a recent rebuild it has been failing with the following error message.

[  978s] 
[  978s] self = <numpy.core.tests.test_cpu_features.TestEnvPrivation object at 0x3ff79374820>
[  978s] msg = 'You cannot enable CPU features \\(VXE\\), since they are not supported by your machine.'
[  978s] err_type = 'RuntimeError', no_error_msg = 'Failed to generate error'
[  978s] 
[  978s]     def _expect_error(
[  978s]         self,
[  978s]         msg,
[  978s]         err_type,
[  978s]         no_error_msg="Failed to generate error"
[  978s]     ):
[  978s]         try:
[  978s]             self._run()
[  978s]         except subprocess.CalledProcessError as e:
[  978s]             assertion_message = f"Expected: {msg}\nGot: {e.stderr}"
[  978s]             assert re.search(msg, e.stderr), assertion_message
[  978s]     
[  978s]             assertion_message = (
[  978s]                 f"Expected error of type: {err_type}; see full "
[  978s]                 f"error:\n{e.stderr}"
[  978s]             )
[  978s]             assert re.search(err_type, e.stderr), assertion_message
[  978s]         else:
[  978s] >           assert False, no_error_msg
[  978s] E           AssertionError: Failed to generate error
[  978s] E           assert False
[  978s] 
[  978s] err_type   = 'RuntimeError'
[  978s] msg        = 'You cannot enable CPU features \\(VXE\\), since they are not supported by your machine.'
[  978s] no_error_msg = 'Failed to generate error'
[  978s] self       = <numpy.core.tests.test_cpu_features.TestEnvPrivation object at 0x3ff79374820>

Is that a configuration issue?
Comment 1 Sarah Kriesch 2024-03-24 11:11:36 UTC
The Numpy documentation is referencing these features as supported on s390x: https://numpy.org/doc/stable/reference/simd/build-options.html#on-ibm-zsystem-s390x
Comment 2 Marcela Maslanova 2024-03-25 08:10:42 UTC
I'm getting errors for some packages:
Fatal glibc error: CPU lacks VXE support (z14 or later required)

You can fix the build failure by adding the following to the constraints file. That should give you a machine with VXE support at build time.
%ifarch s390x
Constraint: hardware:cpu:flag vxe
%endif
Comment 3 Sarah Kriesch 2024-03-25 13:13:00 UTC
I had a short discussion in the openSUSE Factory channel on Libera Chat.
Adding the constraint would declare TW as z15, and the idea was that TW remains z13+ for now, so packages must not rely on vxe.

That is a topic for the Package Maintainers.
 <mcepl> AdaLovelace: so is bsc#1221902 actually my problem or not?

Thank you to mcepl for picking up this task.
Comment 4 Marcela Maslanova 2024-03-25 13:17:49 UTC
I would hate to change the whole of Tumbleweed to z15 only. We would be short of computing capacity again.
Comment 5 Fabian Vogt 2024-03-25 15:47:35 UTC
Looking at the error message more closely reveals that the opposite is actually happening here. numpy is built for the s390x baseline, with vx+vxe+vxe2 as runtime-detected optional features:

[   80s]   CPU Optimization Options
[   80s]     baseline:
[   80s]       Requested : min
[   80s]       Enabled   :
[   80s]     dispatch:
[   80s]       Requested : max -xop -fma4
[   80s]       Enabled   : VX VXE VXE2

The tests try to check whether numpy correctly refuses to enable features not supported by the machine:

[ 1858s]     def test_impossible_feature_enable(self):
[ 1858s]         """
[ 1858s]         Test that a RuntimeError is thrown if an impossible feature-enabling
[ 1858s]         request is made. This includes enabling a feature not supported by the
[ 1858s]         machine, or disabling a baseline optimization.
[ 1858s]         """

It tries to force-enable VXE, which isn't supported by the build host, but it does not throw an exception:

[ 1858s] E           AssertionError: Failed to generate error
[ 1858s] E           assert False

So apparently numpy's runtime detection thinks VXE is supported?

The relevant code is here: https://github.com/numpy/numpy/blob/34fa608064fb2ff825d32f536ad420cf8cee3112/numpy/_core/src/common/npy_cpu_features.c#L636
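
To cross-check what the build host itself reports, independently of numpy's own detection, one could inspect the CPU flags exposed by the kernel. The following is a minimal sketch, assuming the s390x /proc/cpuinfo contains a "features" line with space-separated flags such as vx and vxe; the helper names and the sample text are hypothetical and not taken from numpy.

```python
def s390x_cpu_features(cpuinfo_text):
    """Return the flags from the first 'features' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            return set(value.split())
    # No 'features' line found (e.g. a different architecture).
    return set()


def host_supports(flag, path="/proc/cpuinfo"):
    """Check one flag on the running host; the path layout is an assumption."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return flag in s390x_cpu_features(handle.read())


# Abbreviated, illustrative sample of a machine that has VX but not VXE.
SAMPLE_WITHOUT_VXE = (
    "vendor_id       : IBM/S390\n"
    "features\t: esan3 zarch stfle msa ldisp eimm dfp edat highgprs te vx\n"
)

print(sorted(s390x_cpu_features(SAMPLE_WITHOUT_VXE)))
```

Calling host_supports("vxe") on the build worker would then show whether the kernel advertises VXE. If it does not, while numpy still accepts a forced VXE request, the problem lies in how the request is handled rather than in the hardware.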
Comment 7 Sarah Kriesch 2024-03-25 17:20:52 UTC
In that case, please note that SUSE has both a z13-based and a z15-based LinuxONE. I suggest keeping z13 support for at least one more year, because we expect to receive a new z17-based LinuxONE after the next release (which should be this year).
Comment 8 Sarah Kriesch 2024-03-25 21:04:43 UTC
(In reply to Fabian Vogt from comment #5)
> Looking at the error message more closely reveals that the opposite is
> actually happening here. numpy is built for the s390x baseline, with
> vx+vxe+vxe2 as runtime-detected optional features:
> 
> [   80s]   CPU Optimization Options
> [   80s]     baseline:
> [   80s]       Requested : min
> [   80s]       Enabled   :
> [   80s]     dispatch:
> [   80s]       Requested : max -xop -fma4
> [   80s]       Enabled   : VX VXE VXE2
> 
> So apparently numpy's runtime detection thinks VXE is supported?

It seems so. It now also depends on the SLE side, that is, on how important Python on z13 is for enterprise customers. Does it make sense to re-enable the old hardware?

Alternatively, it would be easier to configure the numpy package to use the latest z14+ features.
Comment 9 Sarah Kriesch 2024-03-26 08:33:31 UTC
(In reply to Marcela Maslanova from comment #2)
> I'm getting errors for some packages:
> Fatal glibc error: CPU lacks VXE support (z14 or later required)

Which other packages are also affected by this error message based on missing CPU support for z13?
Comment 10 Marcela Maslanova 2024-03-26 10:43:19 UTC
(In reply to Sarah Kriesch from comment #9)
> (In reply to Marcela Maslanova from comment #2)
> > I'm getting errors for some packages:
> > Fatal glibc error: CPU lacks VXE support (z14 or later required)
> 
> Which other packages are also affected by this error message based on
> missing CPU support for z13?

I can't really say exactly which package; it belongs to SUSE Manager.
Comment 11 Ihno Krumreich 2024-03-27 17:26:36 UTC
@ihno: provide z13 Instance. (LPAR preferred)
Comment 12 Fabian Vogt 2024-03-28 14:20:14 UTC
I found that the test failure here was also excluded on 32bit ARM and I dug a bit further: https://github.com/numpy/numpy/issues/24548 shows the same reason as here. That's when I also noticed the additional info at the bottom:
feats      = 'VXE, None'

At first I wasn't able to reproduce it on armv7 because cpu feature detection was broken, which nobody noticed... I debugged that and fixed it with https://github.com/numpy/meson/pull/12.

After fixing that, I was able to reproduce the test failure locally by using NPY_ENABLE_CPU_FEATURES="ASIMDHP, None". This wasn't meant to succeed but it did. Fixed with https://github.com/numpy/numpy/pull/26151.
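
The stray "None" in both strings is what Python produces when a missing value is interpolated into an f-string. The following is a minimal sketch of that effect and of one way to avoid it, assuming the request string is built from an unavailable feature plus an optional baseline feature. The function names are hypothetical, and this is not the code from the linked pull requests. NPY_ENABLE_CPU_FEATURES is the environment variable named above.

```python
import os
import subprocess
import sys


def naive_feature_request(unavailable_feat, baseline_feat):
    """Interpolates blindly, so a missing baseline becomes the text 'None'."""
    return f"{unavailable_feat}, {baseline_feat}"


def safe_feature_request(unavailable_feat, baseline_feat=None):
    """Only append the baseline feature when one actually exists."""
    feats = [unavailable_feat]
    if baseline_feat is not None:
        feats.append(baseline_feat)
    return ", ".join(feats)


def import_numpy_with(feature_request):
    """Import numpy in a child process with the given feature request set."""
    env = dict(os.environ, NPY_ENABLE_CPU_FEATURES=feature_request)
    return subprocess.run(
        [sys.executable, "-c", "import numpy"],
        env=env,
        capture_output=True,
        text=True,
    )


print(naive_feature_request("VXE", None))  # VXE, None
print(safe_feature_request("VXE", None))   # VXE
```

On a host without the requested feature, import_numpy_with(...) is expected to exit with a non-zero return code and a RuntimeError in its stderr. The failure discussed here is the case where it returned cleanly instead.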

Both submitted as https://build.opensuse.org/request/show/1163337

(In reply to Matej Cepl from comment #6)
> https://mail.python.org/archives/list/numpy-discussion@python.org/thread/
> G76RHLEF5MBDXVTAWOQJKABW6SQGDAST/

Feel free to mention the above PRs on the ML.