Towards lapack / lapack64 packaging

Hi science team,

I'm trying to add multi-flavor support to the openblas
package, as a part of the ongoing BLAS64 + LAPACK64 work.
However, there are some problems that need to be discussed.

Two problems will be discussed in this email:
(1) building problem about OpenBLAS's liblapack64.so
(2) confirming details for our standard of BLAS/LAPACK virtual packages

To any other developers: if you maintain a (recursive)
reverse dependency of libblas.so or liblapack.so, please
at least read point 1 in section (2)
for a pitfall warning about performance.

(1) building problem about OpenBLAS's liblapack64.so

For those who are not sure what the "64" suffix in BLAS64
and LAPACK64 means:

   BLAS and LAPACK are very important numerical
   linear algebra libraries that operate on contiguous
   numerical arrays.

   libblas.so and liblapack.so provide functions
   with 32-bit array indexing, e.g.

       float cblas_sasum(const int N, const float *X, const int incX);

   which calculates

       sum_{i=1}^N abs(x_i)

   However, "int" is 32 bits wide on amd64.
   This simply doesn't work with arrays containing
   more than 2^31 - 1 elements. Hence we need a 64-bit
   indexing variant, for example:

       float cblas_sasum(const int64_t N, const float *X, const int64_t incX);

   Note, as pointed out by Ben a long time ago, the
   correct type for a pointer offset should be size_t
   or ptrdiff_t, IIRC.

   The 64-bit variants are needed by some scientific
   computing users, and by packages including the Julia language.
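
To make the limit concrete, here is a minimal sketch in C. The names
fits_in_int32 and ref_sasum64 are made up for this email and are not part
of any BLAS; the loop is a plain reference version, not an optimized kernel,
and I assume the same "return 0 for non-positive N or incX" behaviour as
the reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: can this element count be passed through a
 * 32-bit "int" argument at all? */
static int fits_in_int32(int64_t n)
{
    return n >= 0 && n <= INT32_MAX;
}

/* Hypothetical reference asum with 64-bit indexing: sums the absolute
 * values of x[0], x[incx], ..., x[(n-1)*incx]. */
static float ref_sasum64(int64_t n, const float *x, int64_t incx)
{
    assert(x != NULL || n <= 0);
    if (n <= 0 || incx <= 0)
        return 0.0f;
    float sum = 0.0f;
    for (int64_t i = 0; i < n; i++) {
        float v = x[i * incx];
        sum += (v < 0.0f) ? -v : v;
    }
    return sum;
}
```

An element count of 2147483648 (2^31) is rejected by fits_in_int32, which
is exactly the case the 32-bit interface cannot express, while the 64-bit
signature accepts it without any change to the loop.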

Sébastien pointed out that the `liblapack64.so` library
in my implementation[1] mixes 32-bit indexing code
and 64-bit indexing code. The reason:

   liblapack64.so is linked from objects from two sources:
     (1) bin:liblapack-pic (32-bit indexing static lib)
     (2) openblas's optimized lapack subset

   When I turn on the INTERFACE64=1 flag in order to
   build a 64-bit variant, the linker just mixes
   symbols from the 32-bit indexing bin:liblapack-pic
   with symbols from the 64-bit indexing openblas code,
   yielding a quite problematic liblapack64.so.
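
Why the mix is dangerous can be sketched in C. Fortran routines receive
their integer arguments by reference, so a routine compiled with 32-bit
integers reads only 4 bytes from an address where a 64-bit caller stored
8 bytes. The function name below is hypothetical and only stands in for
such a routine; memcpy is used so the sketch itself has no aliasing issue.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a routine built with 32-bit INTEGER:
 * it is handed the address of the caller's N and reads 4 bytes. */
static int32_t n_as_seen_by_32bit_routine(const void *n_by_reference)
{
    int32_t n;
    assert(n_by_reference != NULL);
    memcpy(&n, n_by_reference, sizeof n);
    return n;
}
```

If a 64-bit caller stores N = 2^32 + 5 in an int64_t and passes its address,
the stand-in sees only one half of the value (the low half on a
little-endian machine such as amd64), so the callee silently works on the
wrong array length instead of failing loudly.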

Sébastien provided some possible solutions:

  1. build a 64-bit indexing variant of src:lapack
  2. provide a liblapack64-pic (Sébastien prefers this)

Yes, solution *2 poses very little workload because
we just need to rebuild lapack with the Fortran flag "-i8"
(spelled "-fdefault-integer-8" for gfortran).

However, I'm thinking about a third solution:

  3. disentangle src:lapack and src:openblas and just
     use src:openblas's embedded copy of src:lapack.
     (currently that embedded copy is removed from debian

This (maybe) poses even less workload for me than *2.

[1] https://salsa.debian.org/science-team/openblas/tree/lumin/

(2) confirming details for our standard of BLAS/LAPACK virtual packages

Disambiguation is very important before starting this section;
everything will turn into a mess without it.
In this section, I'll use the following notation:

  * Uppercased "BLAS" means the standard BLAS API and ABI,
    Fortran-based. Debian's virtual packages libblas.so and
    libblas.so.3 provide the BLAS API and ABI. A typical BLAS
    symbol looks like "sasum_" (suffixed by an underscore).

  * Uppercased "CBLAS" means the C version of the standard
    BLAS API and ABI. A typical CBLAS symbol looks like
    "cblas_sasum" (prefixed by "cblas_"). The CBLAS ABI
    has been squashed into libblas.so{,.3}. It's not
    recommended to link against libcblas.so if you
    find one in the Atlas package -- which split
    the BLAS and CBLAS ABI into different shared objects.

  * Uppercased "LAPACK" means the standard LAPACK API and ABI,
    also Fortran-based. Debian's liblapack.so and liblapack.so.3
    provide the ABI.

  * Uppercased "LAPACKE" means the C version of the LAPACK
    API and ABI. On Debian it is shipped by bin:liblapacke
    instead of being squashed into liblapack.so (sounds a bit messy).

  * Uppercased BLAS64, CBLAS64, LAPACK64, LAPACKE64 are
    the corresponding 64-bit indexing variants.
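
As a rough illustration of the difference between the two symbol styles,
here is a sketch in C. Both functions are local stand-ins with made-up
"demo_" names, so they cannot clash with the real sasum_ or cblas_sasum in
libblas.so. I assume the gfortran defaults, where INTEGER maps to int and
REAL maps to float.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in with a Fortran-style BLAS signature: trailing underscore,
 * every argument passed by reference. */
static float demo_sasum_(const int *n, const float *x, const int *incx)
{
    assert(n != NULL && incx != NULL);
    if (*n <= 0 || *incx <= 0)
        return 0.0f;
    float sum = 0.0f;
    for (int i = 0; i < *n; i++) {
        float v = x[(size_t)i * (size_t)*incx];
        sum += (v < 0.0f) ? -v : v;
    }
    return sum;
}

/* Stand-in with a CBLAS-style signature: "cblas_" prefix, scalars
 * passed by value. Here it simply forwards to the Fortran-style one. */
static float demo_cblas_sasum(int n, const float *x, int incx)
{
    return demo_sasum_(&n, x, &incx);
}
```

A caller of the Fortran-style symbol has to keep N and incX in variables
and pass their addresses, while a caller of the CBLAS-style symbol passes
them directly, e.g. demo_cblas_sasum(3, v, 1).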

It's important to differentiate Fortran stuff from C stuff
because Fortran stores arrays in column-major order, while C uses
row-major order. Now let me point out some messy stuff:

1. BLAS/CBLAS packages look relatively tidy, except Atlas,
   which split CBLAS into a separate libcblas.so.
   That's a pitfall, and numpy once fell into it: #913567
   Debian's Atlas is terribly slow due to the ISA baseline.[2]

   Should we squash Atlas's libcblas.so back into its
   libblas.so, like all the other alternative libraries did? [3]

2. LAPACK and LAPACKE are well separated into different
   shared libraries. Sometimes LAPACKE is simply not
   built. LAPACK has been registered in the alternatives
   system: "liblapack.so", "liblapack.so.3".

   Can we confirm that it's fine to provide only LAPACK
   via liblapack.so and not register LAPACKE in the
   alternatives system?

   If most reverse dependencies only require the Fortran
   ABI (LAPACK) instead of the C ABI (LAPACKE), then I
   think it's fine to keep Debian's LAPACKE packages
   as they are for now.
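
For reference, the column-major versus row-major distinction mentioned at
the start of this section comes down to how the offset of element (i, j)
is computed. The two helper names below are hypothetical and only sketch
the arithmetic; LAPACKE hides the same choice behind its matrix_layout
argument.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j), zero-based, in a column-major array whose
 * leading dimension ld is at least the number of rows (Fortran layout). */
static size_t col_major_offset(size_t i, size_t j, size_t ld)
{
    assert(i < ld);
    return i + j * ld;
}

/* Offset of element (i, j), zero-based, in a row-major array whose
 * leading dimension ld is at least the number of columns (C layout). */
static size_t row_major_offset(size_t i, size_t j, size_t ld)
{
    assert(j < ld);
    return i * ld + j;
}
```

For a 2x3 matrix, element (0, 1) sits at offset 2 in column-major storage
(ld = 2) but at offset 1 in row-major storage (ld = 3), which is why a
buffer prepared for one ABI cannot be handed to the other unchanged.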

[2] That's fine. We have well-optimized implementations
    as alternatives: src:blis, src:intel-mkl, src:openblas
[3] That said, I don't think I'm going to do it, because
    I have lost interest in Atlas: it is not fast enough and
    not easy to tune.