Hi folks,
I don't follow numpy development in much detail these days but I see
that there is a 2.0 release planned soon.
Would this be an opportunity to change the behaviour of 'reduceat'?
This issue has been open in some form since 2006!
https://github.com/numpy/numpy/issues/834
The current behaviour was originally inherited from Numeric, and makes
reduceat often unusable in practice, even where it should be the
perfect, concise, efficient solution. But it has been impossible to
change it without breaking compatibility with existing code.
As a result, horrible hacks are needed instead, e.g. my answer here:
https://stackoverflow.com/questions/57694003
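For anyone who has not hit this, here is a minimal illustration of the
documented quirk (when `indices[i] >= indices[i+1]`, the result is simply
`a[indices[i]]` rather than an empty reduction):
```python
import numpy as np

a = np.arange(8)
# Intent: sum the slices a[0:3], a[3:3] (empty) and a[3:8].
idx = [0, 3, 3]
print(np.add.reduceat(a, idx))
# -> [ 3  3 25]
# One might expect [ 3  0 25]: the empty slice yields a[3] instead of the
# additive identity, so code has to special-case repeated indices.
```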
Is this something that could finally be fixed in 2.0?
Martin
Hi all,
Our next Documentation Team meeting will happen on *Monday, December 4* at *11PM UTC*. If this time slot is inconvenient for you to join, please let me know in the replies or Slack and we will try to add another time slot.
All are welcome - you don't need to already be a contributor to join. If you have questions or are curious about what we're doing, we'll be happy to meet you!
If you wish to join on Zoom, use this (updated) link:
https://numfocus-org.zoom.us/j/85016474448?pwd=TWEvaWJ1SklyVEpwNXUrcHV1YmFJQ...
Here's the permanent hackmd document with the meeting notes (still being
updated):
https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg
Hope to see you around!
Best wishes,
Mukulika
`cumsum` computes the sum of the first k summands for every k from 1. Judging by my experience, it is more often useful to compute the sum of the first k summands for every k from 0, as `cumsum`'s behaviour leads to fencepost-like problems.
https://en.wikipedia.org/wiki/Off-by-one_error#Fencepost_error
For example, `cumsum` is not the inverse of `diff`. I propose adding a function to NumPy to compute cumulative sums beginning with 0, that is, an inverse of `diff`. It might be called `cumsum0`. The following code is probably not the best way to implement it, but it illustrates the desired behaviour.
```
import numpy as np


def cumsum0(a, axis=None, dtype=None, out=None):
    """
    Return the cumulative sum of the elements along a given axis,
    beginning with 0.

    cumsum0 does the same as cumsum except that cumsum computes the sum
    of the first k summands for every k from 1 and cumsum0, from 0.

    Parameters
    ----------
    a : array_like
        Input array.
    axis : int, optional
        Axis along which the cumulative sum is computed. The default
        (None) is to compute the cumulative sum over the flattened
        array.
    dtype : dtype, optional
        Type of the returned array and of the accumulator in which the
        elements are summed. If `dtype` is not specified, it defaults to
        the dtype of `a`, unless `a` has an integer dtype with a
        precision less than that of the default platform integer. In
        that case, the default platform integer is used.
    out : ndarray, optional
        Alternative output array in which to place the result. It must
        have the same shape and buffer length as the expected output but
        the type will be cast if necessary. See
        :ref:`ufuncs-output-type` for more details.

    Returns
    -------
    cumsum0_along_axis : ndarray
        A new array holding the result is returned unless `out` is
        specified, in which case a reference to `out` is returned. If
        `axis` is not None the result has the same shape as `a` except
        along `axis`, where the dimension is larger by 1.

    See Also
    --------
    cumsum : Cumulatively sum array elements, beginning with the first.
    sum : Sum array elements.
    trapz : Integration of array values using the composite trapezoidal rule.
    diff : Calculate the n-th discrete difference along given axis.

    Notes
    -----
    Arithmetic is modular when using integer types, and no error is
    raised on overflow.

    ``cumsum0(a)[-1]`` may not be equal to ``sum(a)`` for floating-point
    values since ``sum`` may use a pairwise summation routine, reducing
    the roundoff-error. See `sum` for more information.

    Examples
    --------
    >>> a = np.array([[1, 2, 3], [4, 5, 6]])
    >>> a
    array([[1, 2, 3],
           [4, 5, 6]])
    >>> np.cumsum0(a)
    array([ 0,  1,  3,  6, 10, 15, 21])
    >>> np.cumsum0(a, dtype=float)  # specifies type of output value(s)
    array([ 0.,  1.,  3.,  6., 10., 15., 21.])
    >>> np.cumsum0(a, axis=0)  # sum over rows for each of the 3 columns
    array([[0, 0, 0],
           [1, 2, 3],
           [5, 7, 9]])
    >>> np.cumsum0(a, axis=1)  # sum over columns for each of the 2 rows
    array([[ 0,  1,  3,  6],
           [ 0,  4,  9, 15]])

    ``cumsum0(b)[-1]`` may not be equal to ``sum(b)``

    >>> b = np.array([1, 2e-9, 3e-9] * 1000000)
    >>> np.cumsum0(b)[-1]
    1000000.0050045159
    >>> b.sum()
    1000000.0050000029
    """
    # Summing over an empty slice gives a zero with the right shape and dtype
    # to prepend to the ordinary cumulative sum.
    empty = a.take([], axis=axis)
    zero = empty.sum(axis, dtype=dtype, keepdims=True)
    later_cumsum = a.cumsum(axis, dtype=dtype)
    return np.concatenate([zero, later_cumsum], axis=axis, dtype=dtype, out=out)
```
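To illustrate the inverse relationship with `diff` that motivates the proposal (a small sketch using the `cumsum0` defined above):
```python
a = np.array([3, 1, 4, 1, 5])

# diff undoes cumsum0 exactly:
print(np.diff(cumsum0(a)))         # [3 1 4 1 5]

# and cumsum0 undoes diff up to the starting value, which plain cumsum cannot:
print(cumsum0(np.diff(a)) + a[0])  # [3 1 4 1 5]
```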
The next NumPy triage meeting will be held this Wednesday, January 11th at
5 pm UTC. (Please note the new time!) This is a meeting where we
synchronously triage prioritized PRs and issues.
Join us via Zoom:
https://numfocus-org.zoom.us/j/82096749952?pwd=MW9oUmtKQ1c3a2gydGk1RTdYUUVX…
Everyone is welcome to attend and contribute to a conversation.
Please notify us of issues or PRs that you’d like to have reviewed by
adding a GitHub link to them in the meeting agenda:
https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg.
--
Cheers,
Inessa
Inessa Pawson
Contributor Experience Lead | NumPy
https://numpy.org/
GitHub: inessapawson
Hello all,
I just opened a pull request to add NEP 55, see
https://github.com/numpy/numpy/pull/24483.
Per NEP 0, I've copied everything up to the "detailed description" section
below.
I'm looking forward to your feedback on this.
-Nathan Goldbaum
=========================================================
NEP 55 — Add a UTF-8 Variable-Width String DType to NumPy
=========================================================
:Author: Nathan Goldbaum <ngoldbaum(a)quansight.com>
:Status: Draft
:Type: Standards Track
:Created: 2023-06-29
Abstract
--------
We propose adding a new string data type to NumPy where each item in the
array
is an arbitrary length UTF-8 encoded string. This will enable performance,
memory usage, and usability improvements for NumPy users, including:
* Memory savings for workflows that currently use fixed-width strings and store
  primarily ASCII data or a mix of short and long strings in a single NumPy
  array.
* Downstream libraries and users will be able to move away from object arrays
  currently used as a substitute for variable-length string arrays, unlocking
  performance improvements by avoiding passes over the data outside of NumPy.
* A more intuitive user-facing API for working with arrays of Python strings,
  without a need to think about the in-memory array representation.
Motivation and Scope
--------------------
First, we will describe how the current state of support for string or
string-like data in NumPy arose. Next, we will summarize the last major
previous
discussion about this topic. Finally, we will describe the scope of the
proposed
changes to NumPy as well as changes that are explicitly out of scope of this
proposal.
History of String Support in NumPy
**********************************
Support in NumPy for textual data evolved organically in response to early
user
needs and then changes in the Python ecosystem.
Support for strings was added to NumPy to support users of the NumArray
``chararray`` type. Remnants of this are still visible in the NumPy API:
string-related functionality lives in ``np.char``, to support the obsolete
``np.char.chararray`` class, deprecated since NumPy 1.4 in favor of string
DTypes.
NumPy's ``bytes_`` DType was originally used to represent the Python 2 ``str``
type before Python 3 support was added to NumPy. The bytes DType makes the most
sense when it is used to represent Python 2 strings or other null-terminated
byte sequences. However, ignoring data after the first null character means
the
``bytes_`` DType is only suitable for bytestreams that do not contain
nulls, so
it is a poor match for generic bytestreams.
The ``unicode`` DType was added to support the Python 2 ``unicode`` type. It
stores data in 32-bit UCS-4 codepoints (e.g. a UTF-32 encoding), which
makes for
a straightforward implementation, but is inefficient for storing text that
can
be represented well using a one-byte ASCII or Latin-1 encoding. This was
not a
problem in Python 2, where ASCII or mostly-ASCII text could use the Python 2
``str`` DType (the current ``bytes_`` DType).
With the arrival of Python 3 support in NumPy, the string DTypes were
largely
left alone due to backward compatibility concerns, although the unicode
DType
became the default DType for ``str`` data and the old ``string`` DType was
renamed the ``bytes_`` DType. This change left NumPy with the sub-optimal
situation of shipping a data type originally intended for null-terminated
bytestrings as the data type for *all* Python ``bytes`` data, and a default
string type with an in-memory representation that consumes four times as
much
memory as needed for ASCII or mostly-ASCII data.
Problems with Fixed-Width Strings
*********************************
Both existing string DTypes represent fixed-width sequences, allowing
storage of
the string data in the array buffer. This avoids adding out-of-band storage to
NumPy; however, it makes for an awkward user interface. In particular, the
maximum string size must be inferred by NumPy or estimated by the user
before
loading the data into a NumPy array or selecting an output DType for string
operations. In the worst case, this requires an expensive pass over the full
dataset to calculate the maximum length of an array element. It also wastes
memory when array elements have varying lengths. Pathological cases where an
array stores many short strings and a few very long strings are
particularly bad
for wasting memory.
Downstream usage of string data in NumPy arrays has proven out the need for
a
variable-width string data type. In practice, most downstream users employ
``object`` arrays for this purpose. In particular, ``pandas`` has explicitly
deprecated support for NumPy fixed-width strings, coerces NumPy fixed-width
string arrays to ``object`` arrays, and in the future may switch to only
supporting string data via ``PyArrow``, which has native support for UTF-8
encoded variable-width string arrays [1]_. This is unfortunate, since ``object``
arrays have no type guarantees, necessitating expensive sanitization passes, and
operations using object arrays cannot release the GIL.
Previous Discussions
--------------------
The project last discussed this topic in depth in 2017, when Julian Taylor
proposed a fixed-width text data type parameterized by an encoding [2]_.
This
started a wide-ranging discussion about pain points for working with string
data
in NumPy and possible ways forward.
In the end, the discussion identified two use-cases that the current
support for
strings does a poor job of handling:
* Loading or memory-mapping scientific datasets with unknown encoding,
* Working with string data in a manner that allows transparent conversion
between NumPy arrays and Python strings, including support for missing
strings.
As a result of this discussion, improving support for string data was added
to
the NumPy project roadmap [3]_, with an explicit call-out to add a DType
better
suited to memory-mapping bytes with any or no encoding, and a variable-width
string DType that supports missing data to replace usages of object string
arrays.
Proposed work
-------------
This NEP proposes adding ``StringDType``, a DType that stores variable-width
heap-allocated strings in NumPy arrays, to replace downstream usages of the
``object`` DType for string data. This work will heavily leverage recent
improvements in NumPy to improve support for user-defined DTypes, so we will
also necessarily be working on the data type internals in NumPy. In
particular,
we propose to:
* Add a new variable-length string DType to NumPy, targeting NumPy 2.0.
* Work out issues related to adding a DType implemented using the experimental
  DType API to NumPy itself.
* Add support for a user-provided missing data sentinel.
* Clean up ``np.char``, moving the ufunc-like functions to a new namespace for
  functions and types related to string support.
* Update the ``npy`` and ``npz`` file formats to allow storage of
  arbitrary-length sidecar data.
The following is out of scope for this work:
* Changing DType inference for string data.
* Adding a DType for memory-mapping text in unknown encodings or a DType that
  attempts to fix issues with the ``bytes_`` DType.
* Fully agreeing on the semantics of missing data sentinels or adding a
  missing data sentinel to NumPy itself.
* Implementing fast ufuncs or SIMD optimizations for string operations.
While we're explicitly ruling out implementing these items as part of this
work,
adding a new string DType helps set up future work that does implement some
of
these items.
If implemented, this NEP will make it easier to add a new fixed-width text DType
in the future by moving string operations into a long-term supported
namespace. We are also proposing a memory layout that should be amenable to
writing fast ufuncs and SIMD optimization in some cases, increasing the
payoff
for writing string operations as SIMD-optimized ufuncs in the future.
While we are not proposing adding a missing data sentinel to NumPy, we are
proposing adding support for an optional, user-provided missing data
sentinel,
so this does move NumPy a little closer to officially supporting missing
data. We are attempting to avoid resolving the disagreement described in
:ref:`NEP 26<NEP26>` and this proposal does not require or preclude adding a
missing data sentinel or bitflag-based missing data support in the future.
Usage and Impact
----------------
The DType is intended as a drop-in replacement for object string arrays.
This
means that we intend to support as many downstream usages of object string
arrays as possible, including all supported NumPy functionality. Pandas is
the
obvious first user, and substantial work has already occurred to add
support in
a fork of Pandas. ``scikit-learn`` also uses object string arrays and will
be
able to migrate to a DType with guarantees that the arrays contain only
strings. Both h5py [4]_ and PyTables [5]_ will be able to add first-class
support for variable-width UTF-8 encoded string datasets in HDF5. String
data
are heavily used in machine-learning workflows and downstream machine
learning
libraries will be able to leverage this new DType.
Users who wish to load string data into NumPy and leverage NumPy features like
advanced indexing will have a natural choice that offers substantial memory
savings over fixed-width unicode strings, as well as better validation
guarantees and overall integration with NumPy than object string arrays. Moving to a
first-class string DType also removes the need to acquire the GIL during
string
operations, unlocking future optimizations that are impossible with object
string arrays.
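As a rough sketch of the intended usage pattern, here is how the prototype
implementation (the ``stringdtype`` package used in the benchmarks below) is
used today; the exact spelling of the DType in NumPy itself is part of the
detailed description and may differ::

    In [1]: import numpy as np

    In [2]: from stringdtype import StringDType

    In [3]: arr = np.array(["a", "much longer string", "αβγ"], dtype=StringDType())

    In [4]: np.char.capitalize(arr)  # existing string functions work via np.char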
Performance
***********
Here we briefly describe preliminary performance measurements of the
prototype
version of ``StringDType`` we have implemented outside of NumPy using the
experimental DType API. All benchmarks in this section were performed on a
Dell
XPS 13 9380 running Ubuntu 22.04 and Python 3.11.3 compiled using pyenv.
NumPy,
Pandas, and the ``StringDType`` prototype were all compiled with meson
release
builds.
Currently, the ``StringDType`` prototype has performance comparable to object
arrays and fixed-width string arrays. One exception is array creation from
Python strings, where performance is somewhat slower than object arrays and
comparable to fixed-width unicode arrays::
In [1]: from stringdtype import StringDType
In [2]: import numpy as np
In [3]: data = [str(i) * 10 for i in range(100_000)]
In [4]: %timeit arr_object = np.array(data, dtype=object)
3.55 ms ± 51.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [5]: %timeit arr_stringdtype = np.array(data, dtype=StringDType())
12.9 ms ± 277 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [6]: %timeit arr_strdtype = np.array(data, dtype=str)
11.7 ms ± 150 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In this example, object DTypes are substantially faster because the objects
in
the ``data`` list can be directly interned in the array, while ``StrDType``
and
``StringDType`` need to copy the string data and ``StringDType`` needs to
convert the data to UTF-8 and perform additional heap allocations outside
the
array buffer. In the future, if Python moves to a UTF-8 internal
representation
for strings, the string loading performance of ``StringDType`` should
improve.
String operations have similar performance::
In [7]: %timeit np.array([s.capitalize() for s in data], dtype=object)
30.2 ms ± 109 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [8]: %timeit np.char.capitalize(arr_stringdtype)
38.5 ms ± 3.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [9]: %timeit np.char.capitalize(arr_strdtype)
46.4 ms ± 1.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
The poor performance here is a reflection of the slow iterator-based
implementation of operations in ``np.char``. If we were to rewrite these
operations as ufuncs, we could unlock substantial performance
improvements. Using the example of the ``add`` ufunc, which we have
implemented
for the ``StringDType`` prototype::
In [10]: %timeit arr_object + arr_object
10 ms ± 308 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [11]: %timeit arr_stringdtype + arr_stringdtype
5.91 ms ± 18.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [12]: %timeit np.char.add(arr_strdtype, arr_strdtype)
65.9 ms ± 1.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
As described below, we have already updated a fork of Pandas to use a
prototype
version of ``StringDType``. This demonstrates the performance improvements
available when data are already loaded into a NumPy array and are passed to
a
third-party library. Currently Pandas attempts to coerce all ``str`` data to
``object`` DType by default, and has to check and sanitize existing ``object``
arrays that are passed in. This requires a copy or pass over the data made
unnecessary by first-class support for variable-width strings in both NumPy
and
Pandas::
In [13]: import pandas as pd
In [14]: %timeit pd.Series(arr_stringdtype)
20.9 µs ± 341 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [15]: %timeit pd.Series(arr_object)
1.08 ms ± 23.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
We have also implemented a Pandas extension DType that uses ``StringDType``
under the hood, which is also substantially faster for creating Pandas data
structures than the existing Pandas string DType that uses ``object``
arrays::
In [16]: %timeit pd.Series(arr_stringdtype, dtype='string[numpy]')
54.7 µs ± 1.38 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [17]: %timeit pd.Series(arr_object, dtype='string[python]')
1.39 ms ± 1.16 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Backward compatibility
----------------------
We are not proposing a change to DType inference for Python strings and do not
not
expect to see any impacts on existing usages of NumPy, besides warnings or
errors related to new deprecations or expiring deprecations in ``np.char``.
Hi All,
A long-standing, small wart in numpy has been that the definition of sign for complex numbers is really useless (`np.sign(z)` gives the sign of the real component, unless that is zero, in which case it gives the sign of the imaginary component, in both cases as a complex number with zero imaginary part). Useless enough, in fact, that in the Array API it is suggested [1] that sign should return `z / |z|` (a definition consistent with those of reals, giving the direction in the complex plane).
The question then becomes what to do. My suggestion - see https://github.com/numpy/numpy/pull/25441 - is to adapt the Array API definition for numpy 2.0, with the logic that if we don't change it in 2.0, when will we?
Implementing it, I found no real test failures except one for `np.geomspace`, and correcting that failure with the new definition actually simplified the implementation substantially.
Furthermore, with the redefinition, it has become possible to extend ``np.copysign(x1, x2)`` to complex numbers, since it can now generally return ``|x1| * sign(x2)`` with the sign as defined above (with no special treatment for zero).
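For concreteness, here is a minimal Python sketch of the two definitions (not the actual ufunc implementation in the PR):
```python
import numpy as np

def sign_proposed(z):
    # Array API definition: z / |z|, the unit vector in the direction of z
    # (and 0 for z == 0).
    return z / abs(z) if z != 0 else 0 * z

def copysign_proposed(x1, x2):
    # Proposed complex extension: |x1| * sign(x2).
    return abs(x1) * sign_proposed(x2)

z = 3 + 4j
print(sign_proposed(z))         # (0.6+0.8j)
print(np.sign(z))               # (1+0j) on NumPy < 2.0 (sign of the real part)
print(copysign_proposed(2, z))  # (1.2+1.6j)
```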
Anyway, to me the main question would be whether this would break any workflows (though it is hard to see how it could, given that the previous definition was really rather useless...).
Thanks,
Marten
[1] https://data-apis.org/array-api/latest/API_specification/generated/array_ap… (and https://github.com/data-apis/array-api/pull/556, which has links to previous discussion)
On Sat, Dec 30, 2023 at 1:57 PM Dr. Thomas Orgis <
thomas.orgis(a)uni-hamburg.de> wrote:
>
> Am Fri, 29 Dec 2023 11:34:04 +0100
> schrieb Ralf Gommers <ralf.gommers(a)gmail.com>:
>
> > If the library name is libcblas.so it will still be found. If it's also a
> > nonstandard name, then yes it's going to fail. I'd say though that (a)
> > this
> > isn't a real-world situation as far as we know,
>
> It can be even funnier. I just noticed on an Ubuntu system (following
> Debian for sure, here) that there are both
>
> /usr/lib/x86_64-linux-gnu/libblas.so.3
> /usr/lib/x86_64-linux-gnu/libcblas.so.3
>
> but those belong to different packages. The first contains BLAS and
> CBLAS API and is installed from netlib code.
>
> $ readelf -d -s /usr/lib/x86_64-linux-gnu/libblas.so.3 | grep cblas_ | wc
> -l
> 184
>
> The second is installed alongside ATLAS.
>
> $ readelf -d -s /usr/lib/x86_64-linux-gnu/libcblas.so.3 | grep cblas_ |
> wc -l
> 154
>
> The symbol lists differ: each contains functions that the other lacks.
>
> $ ldd /usr/lib/x86_64-linux-gnu/libcblas.so.3
> linux-vdso.so.1 (0x00007ffcb5720000)
> libatlas.so.3 => /lib/x86_64-linux-gnu/libatlas.so.3
> (0x00007fd9b27ee000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd9b25c6000)
> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd9b24df000)
> /lib64/ld-linux-x86-64.so.2 (0x00007fd9b2bae000)
>
> I _guess_ this situation would be mostly fine since libblas has enough
> of the CBLAS symbols to prevent location of the wrong libcblas next to
> it by the meson search.
>
> Quick followup regarding netlib splits. Debian only recently folded
> libcblas into libblas, as
>
> https://lists.debian.org/debian-devel/2019/10/msg00273.html
>
> notes. Not that long ago … esp. considering stable debian. Not sure
> when this appeared. And of course numpy is the point where things were
> broken:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913567
>
> I'm now looking into how Debian actually produces a combined BLAS+CBLAS
> from netlib, as we're using the CMake build system and I do not see an
> option to do that. The upstream build produces separate libraries, so I
> assumed that is a case that one should handle.
Yes, Debian made quite a mess there. We do have a CI job in NumPy for Netlib on
Debian though, and indeed it works fine because the CBLAS symbols are already
found inside libblas.so.
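A quick way to verify that on a given installation, in the same spirit as the
readelf commands quoted above (the soname below is an assumption; adjust it to
whatever the distribution actually ships):
```python
# Does the BLAS library a build would pick up also export the CBLAS interface?
import ctypes

blas = ctypes.CDLL("libblas.so.3")  # assumed soname; adjust for your system
print("CBLAS in libblas:", hasattr(blas, "cblas_ddot"))
```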
> But it is a demonstration that any guess that libcblas belongs to
> libblas just from the name may be wrong in real-world installations.
>
Letting this sink in some more, I realized the more fundamental reason for
treating them together: when we express dependencies, we do so for a
*package* (i.e., a packaged version of some project), not for a specific
build output like a shared library or a header file. In this case it's a
little obscured by BLAS being an interface and the libblas/libcblas mix,
but it's still the case that we're looking for multiple installed things
from a single package. So we want "MKL" or "Netlib BLAS", where MKL is not
only a shared library (or set of them), but for example also the
corresponding header file (mkl_cblas.h rather than cblas.h). The situation
you are worrying about is basically that of an unknown package with a set
of shared libraries and headers that have non-standard names. I'd say that
that's then simply a non-supported package, until someone comes to report
the situation and we can add support for it (or file a bug against that
package and convince the authors not to make such a mess).
I think this point is actually important, and I hope you can appreciate it
as a packager - we need to depend on packages (things that have URLs to
source repos, maintainers, etc.), not random library names.
>
> Here, it might be a strange installation remnant.
>
> $ dpkg -L libatlas3-base
> /.
> /usr
> /usr/lib
> /usr/lib/x86_64-linux-gnu
> /usr/lib/x86_64-linux-gnu/atlas
> /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
> /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
> /usr/lib/x86_64-linux-gnu/libatlas.so.3.10.3
> /usr/lib/x86_64-linux-gnu/libcblas.so.3.10.3
> /usr/lib/x86_64-linux-gnu/libf77blas.so.3.10.3
> /usr/lib/x86_64-linux-gnu/liblapack_atlas.so.3.10.3
> /usr/share
> /usr/share/doc
> /usr/share/doc/libatlas3-base
> /usr/share/doc/libatlas3-base/README.Debian
> /usr/share/doc/libatlas3-base/changelog.Debian.gz
> /usr/share/doc/libatlas3-base/copyright
> /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3
> /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3
> /usr/lib/x86_64-linux-gnu/libatlas.so.3
> /usr/lib/x86_64-linux-gnu/libcblas.so.3
> /usr/lib/x86_64-linux-gnu/libf77blas.so.3
> /usr/lib/x86_64-linux-gnu/liblapack_atlas.so.3
>
> An eclectic list of redundant libraries. But as it is, this is a case where
>
> 1. libatlas has BLAS, libcblas has corresponding CBLAS (only
> referencing ATLAS-specific ABI).
>
> 2. libcblas does _not_ work together with libblas (uses ATL_ symbols).
>
I'll note that the ATL_ symbols prefixes are an internal implementation
detail; we don't prefix our own symbols when calling BLAS APIs. And there
is no way Debian lets you install both of Netlib BLAS and ATLAS (it's one
or the other, through `update-alternatives`), so this can't really go wrong
in the way you imagine I think.
>
> You might consider this full of packaging errors/weirdness, but it's there.
>
> Also, I note that
>
> 1. Netlib libblas.so does contain BLAS+CBLAS symbols.
> 2. Netlib liblapack.so contains only LAPACK symbols.
> 3. Netlib liblapacke.so exists and provides LAPACKE symbols on top.
>
> So from my full scenario, only the separate libcblas is missing. This
> hybrid is somewhat unfortunate … but, well, this liblapacke, you
> actually could combine with different liblapack implementations.
>
> In any case, I would feel more comfortable if I knew that the build is
> not locating a library that is there but that it should not use.
>
We actually do have a CI job for ATLAS on Debian too:
https://github.com/numpy/numpy/blob/36eefeabadeb4ebe07b4be3704da93a4c126b6c….
Note that Debian in addition decided to use custom pkg-config names, so it
works with `-Dblas=blas-atlas -Dlapack=lapack-atlas`.
> > (b) just don't do this as a
> > packager, and (c) if you really must, you can still make it work by
> > providing a custom `cblas.pc`
>
> Well, we can also hack around in your meson build files;-) The idea is
> to prepare also custom .pc files for various cases and be able to tell
> builds to use them. It feels unnecessary to have to prepare
> another directory with custom cblas.pc for the build that pkg-config
> then can find there compared to just telling it to use cblas-foobarbaz
> as name for the file installed in the prefix alongside others.
>
> > (see
> >
> http://scipy.github.io/devdocs/building/blas_lapack.html#using-pkg-config-t…
> > ).
>
> I'm confused … this page doesn't mention blas vs cblas. You mean in
> addition to the blas.pc used there, I need to put a cblas.pc (or
> libblas.so) beside it. This background magic is bad. There should be no
> implicit guessing, ever, for a controlled packager build.
>
Yes, it's not ideal. But again, we're talking about a hypothetical
situation, where you already did something bad as a packager, and I'm
pointing out a workaround. If you're renaming build outputs from an
upstream package from their defaults (e.g. libcblas.so ->
libcblas-foobar.so), you are breaking things and get to keep the pieces.
The correct name for Netlib BLAS is `libcblas.so` and `cblas.pc`. So
provide `cblas.pc`, and you unbreak yourself.
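For reference, a minimal `cblas.pc` along those lines might look like the sketch
below (prefix, paths and version are placeholders for whatever the Netlib build
actually installs on your system):
```
# Hypothetical cblas.pc for a Netlib libblas that also exports the CBLAS
# symbols; adjust prefix, libdir and version to the actual installation.
prefix=/usr
libdir=${prefix}/lib
includedir=${prefix}/include

Name: cblas
Description: C interface to BLAS, provided by Netlib libblas
Version: 3.11.0
Libs: -L${libdir} -lblas
Cflags: -I${includedir}
```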
> > We don't use LAPACKE, so that one can be ignored.
>
> I had the impression that you are upstreaming something to meson about
> BLAS (and LAPACK) detection. So I am thinking about the full picture
> for all projects using meson with BLAS dependencies.
>
Sure, I was talking about NumPy/SciPy. In general, the LAPACKE situation is
pretty much identical to the CBLAS one.
> I see that combining any subset of BLAS, CBLAS, LAPACK, LAPACKE into a
> common library is a thing in practice. So I see how you'd want some
> special detection code to check if just loading a dependency called
> 'blas' gives you all APIs you need or if you need to add dependencies
> 'cblas', 'lapack', or 'lapacke', even.
>
> > > So scipy locates cblas based on the name blas, but doesn't really use
> > > cblas.
> >
> >
> > It does in a few places, like SuperLU.
>
> You are confusing me. I see no cblas usage anywhere in SciPy 1.11.4,
> did that come later?
>
...
>
> No cblas in sight. Also, the SuperLU part specifically was what
> prompted this whole renewed discussion. We noticed breakage since scipy
> with -Dblas=cblas failed to link _superlu.so to libblas. This only hit
> on a depending package trying to use it.
>
I'm sorry, I was just wrong here. Looking closer, the C code in SuperLU
that I thought was using CBLAS is actually manually handling Fortran
compiler symbol handling and calling the mangled Fortran symbols (it's
probably not very robust to unknown compilers):
https://github.com/scipy/scipy/blob/main/scipy/sparse/linalg/_dsolve/SuperL…
.
> The above command returns _no_ output when I build with -Dblas=cblas.
> The build works fine, but the resulting shared objects are _all_ left
> wanting an actual BLAS linkage. I provided a package that offers CBLAS
> and would pull in BLAS indirectly. Scipy build proceeds with that, but
> it tricks me with defunct algebra all around, including _superlu.so,
> where no object makes use of libcblas.
>
> You said scipy isn't using the custom meson stuff yet. I see two
> occasions of BLAS use in numpy.
>
> $ find
> /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/
> -name '*.so'|while read f; do readelf -d $f|grep blas && echo "^^^^ $f
> ^^^^"; done
> 0x0000000000000001 (NEEDED) Biblioteca compartida:
> [libcblas.so.3]
> ^^^^
> /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/core/_multiarray_umath.so
> ^^^^
> 0x0000000000000001 (NEEDED) Biblioteca compartida:
> [libblas.so.3]
> ^^^^
> /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/linalg/_umath_linalg.so
> ^^^^
>
> So it needs both CBLAS and BLAS. (Btw. I rather like how I can tell
> that apart just from looking at the used libs, not the symbols
> themselves …;-)
>
> Building again with -Dblas=cblas … FRICK!
>
> $ find
> /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/
> -name '*.so'|while read f; do readelf -d $f|grep blas && echo "^^^^ $f
> ^^^^"; done
> 0x0000000000000001 (NEEDED) Biblioteca compartida:
> [libcblas.so.3]
> ^^^^
> /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/core/_multiarray_umath.so
> ^^^^
>
> Now _umath_linalg.so is left with undefined BLAS symbols, just like
> scipy before. I need to hot-fix that in pkgsrc, just after releasing
> the stable branch. Bad, bad, bad. I really took the message from you
> before that -Dblas=cblas is what I need to do. Now we ship broken numpy.
>
> Got to fix that right away.
>
> (update from second email:)
>
> Luckily, it isn't. It links to liblapack, which brings in libblas. So
> that's fine, I suppose. The test output looks the same for both after
> installing and then testing. So this was some interference.
>
> I hope
>
> =========================== short test summary info
> ============================
> FAILED
> .destdir/data/pkg/lib/python3.11/site-packages/numpy/distutils/tests/test_log.py::test_log_prefix[info]
> FAILED
> .destdir/data/pkg/lib/python3.11/site-packages/numpy/distutils/tests/test_log.py::test_log_prefix[debug]
> 2 failed, 36807 passed, 1603 skipped, 1303 deselected, 33 xfailed, 1
> xpassed, 62 warnings in 410.43s (0:06:50)
>
>
> is OK. Those failures do sound like something basic or easy to fix,
> though. Log prefixing shouldn't be so complicated.
Yep, that should be okay - the two test failures should be pretty harmless.
> > > Numpy is happy with libcblas bringing libblas in and calls it
> > > blas, but really uses the cblas interface. This looks a bit confusing.
> > >
>
> It is happy, but not for a good result. The build just claims that it
> works, but produces defunct _umath_linalg. Or is this hidden in
> practical use by loading _multiarray_umath.so before? I'm no python
> user myself. What would be a minimal test for this?
>
Running `python -c "import numpy.linalg; numpy.test()"` should do it and is
pretty fast (needs `pytest` and `hypothesis` installed). If you need to
avoid those test packages, maybe a simple `python -c "import numpy"` would
already be enough, since that does call at least one linear algebra routine.
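If you want something in between, here is a small targeted sketch that exercises
`_umath_linalg` directly (and so fails loudly if the BLAS/LAPACK linkage is
broken); the exact matrix used is of course arbitrary:
```python
# Minimal smoke test (a sketch): np.linalg.solve goes through _umath_linalg,
# so it will fail to import or produce wrong results if linkage is broken.
import numpy as np

a = np.random.rand(50, 50) + 50 * np.eye(50)  # well-conditioned matrix
b = np.ones(50)
x = np.linalg.solve(a, b)
assert np.allclose(a @ x, b)
print("linalg OK")
```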
> I may be able to add something to the docs, but there should be no
> > confusion. We need "BLAS with CBLAS symbols". CBLAS should simply not be
> > considered as a completely separate dependency (it's one library with two
> > interfaces). I can't think of a reason to do so, nor of a build system
> that
> > does it like that. There is no FindCBLAS in CMake for example, it handles
> > it transparently within FindBLAS.
>
> And it's unknown to me right now how CMake really handles CBLAS, as I
> cannot recall a CMake-using project right now that relies on CBLAS.
>
> The asymmetry with LAPACKE in Debian confuses me, too. What do you
> consider standard packaging? BLAS+CBLAS together in libblas, but
> LAPACKE as a separate thing? Inconsistent. I guess I could reduce
> pkgsrc-specifc breakage by merging libcblas into libblas. But keep
> liblapacke separate? Would that confuse more or less? Should it be all
> in one or all separate?
>
Ideally, each packager for any BLAS library should run the default upstream
build commands, install into a prefix and bundle up the result in a
package. So what is standard == whatever configure/make produces for Netlib
BLAS or OpenBLAS, without renaming or deleting anything.
> Do you consider BLAS+CBLAS to be one thing, LAPACK+LAPACKE the other?
> All or nothing?
>
Yes indeed. At least, I cannot think of any package/project that doesn't
provide CBLAS with BLAS. BLAS and LAPACK, on the other hand, clearly are
separate from each other, and there are projects that provide only one (e.g.,
BLIS is BLAS-only).
> Wouldn't the world be much simpler with
>
> dependency(blas)
> dependency(cblas)
> dependency(lapack)
> dependency(lapacke)
>
No, I don't think so.
> as requests for the specific API mentioned there, coming as separate
> pkg-config modules by meson default behaviour if they're present, but
> some logic that checks first if whatever provided blas (-Dblas=foobar)
> also provides lapack and hence dependency(lapack) is fulfilled already?
> Otherwise, look for lapack or whatever -Dlapack=baz was set to. Or is
> this already what is happening? I didn't have time to read up on
> everything, perhaps.
>
Yes, pretty much. E.g., for MKL and Accelerate we know for sure that they
contain LAPACK symbols, so we don't even check. For OpenBLAS we know it's
the default to include LAPACK but it can be built without, so we need to
check symbols, and if they're missing we look for another library.
> If you do dependency(blas with cblas) with modified/future meson … can
> I override what you try if my -Dblas does not contain cblas, but
> another library I can provide would?
Yes, the search order and which packages are searched for is fully
customizable. But you should be good with the default `-Dblas=auto`. If
not, I'd like to understand why.
> > > And I need to ponder if I leave it at -Dblas=$CBLAS_PC for pkgsrc now.
> > > It's somewhat wrong, but also more correct, as NumPy _really_ means to
> > > use CBLAS API, not BLAS.
> > >
> >
> > My opinion is that this is not right. It may work with pkg-config in your
> > case, but I don't think it'll be robust for the fallback detection that
> is
> > used when pkg-config is not installed.
>
> See above. It's broken.
I assume that it actually was fine, given your follow-up email about the
test suite having only 2 failures. So I'll skip replying to this part.
> [...] Meson
> using CMake behind the scenes is also rather surprising. CMake configs
> are a nuisance when they take the focus away from pkg-config files …
> and a package that doesn't actually use cmake for building is normally
> not getting cmake in the build sandbox in pkgsrc. Maybe that has to be
> investigated if some packages really require meson to use cmake.
>
It's a fallback detection method to use CMake; it can be disabled indeed.
> The attached output of numpy.test() has 2 failures. I don't know if
> they are related to BLAS. Re-doing the test with a properly linked
> numpy now… if they persist — how bad is that? Are those known failures?
>
They're not a problem, it's a minor thing that's not used at runtime.
> OK … lots of errors regarding f2py in the test environment. Summary:
>
> 8 failed, 36022 passed, 1603 skipped, 1303 deselected, 33 xfailed, 1
> xpassed, 62 warnings, 779 errors in 441.09s (0:07:21)
>
> With the bad linkage, it is:
>
> 2 failed, 36807 passed, 1603 skipped, 1303 deselected, 33 xfailed, 1
> xpassed, 62 warnings in 404.95s (0:06:44)
>
> Maybe there is interference with the installed numpy. I cannot say if
> there is anything related to the missing libblas linkage. I'd presume
> lots of actual tests would fail if that had an effect? How many tests
> need working _umath_linalg.so?
>
If something was really wrong, you'd get 100s to 1000s of failures/errors.
I think you're fine here.
Cheers,
Ralf