Razvan Carbunescu
EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2014-224
December 18, 2014
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-224.pdf
Calling LAPACK, BLAS, and CBLAS Routines from C/C++ Language Environments. Function domains support both C and Fortran environments. To use Fortran-style functions in C/C++ environments, you should observe certain conventions, which are discussed for LAPACK and BLAS in the subsections below. Avoid calling BLAS 95/LAPACK 95 from C/C++. Lapack v2.5.2 API Documentation. LAPACK is a library for high-performance linear algebra computations. This version includes support for solving linear systems using LU, Cholesky, and QR matrix factorizations, and for symmetric eigenvalue problems.
Many applications call linear algebra libraries to achieve better performance and reliability. LAPACK (Linear Algebra Package) is a standard software library for numerical linear algebra that is widely used in the industrial and scientific communities. To take advantage of the fastest codes, LAPACK functions require the user to know the sparsity and other mathematical structure of their inputs: General Matrix (GE), General Band (GB), Positive Definite (PO), etc. If users are unsure of their matrix structure or cannot easily express it in the available formats (profile matrices, arrow matrices, etc.), they are forced to use a more general structure that includes their input, and so run less efficiently than possible. The goal of this thesis is to allow for automatic sparsity detection (ASD) within LAPACK that is completely hidden from the user and causes no slowdown for users running fully dense matrices. This work adds modular support for the detection of blocked sparsity within the LAPACK LU and Cholesky functions. It also creates the infrastructure and the algorithms to potentially expand sparsity detection to other factorizations and more input matrix structures, or to provide further timing and memory improvements via integration directly into the solver routines. Two general approaches are implemented, named 'Profile' (ASD1) and 'Sparse block' (ASD2); a third, more complicated method named 'Full sparsity' (ASD3) is described more abstractly, at the algorithm level only. With these algorithms we obtain benefits of up to an order of magnitude (35.10x faster than the same LAPACK function) for matrices displaying 'blocked sparsity' patterns, and large benefits over the best LAPACK algorithms for patterns that do not fit into LAPACK categories (4.85x faster than the best LAPACK function). For matrices exhibiting no sparsity, these implementations incur either a negligible penalty (an overhead of 1%) or a small overhead (10-30%) that quickly decreases with the matrix size n or band b (less than 5% for n,b > 500).
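To make the idea of detecting structure in a dense input concrete, here is a minimal sketch of a scan that could precede the choice between a general (GE) and a band (GB) routine. It is not the thesis's ASD1 or ASD2 algorithm, whose details the abstract does not give. The function names `row_profile` and `detect_bandwidth`, the `tol` parameter, and the list-of-lists matrix layout are all assumptions made for illustration.

```python
def row_profile(a, tol=0.0):
    """For each row, return the column index of its first entry with
    |value| > tol, or len(row) if the row has no such entry.
    (Hypothetical helper, not part of LAPACK.)"""
    profile = []
    for row in a:
        first = len(row)
        for j, value in enumerate(row):
            if abs(value) > tol:
                first = j
                break
        profile.append(first)
    return profile


def detect_bandwidth(a, tol=0.0):
    """Return (kl, ku): the smallest lower and upper bandwidths that
    contain every entry of the square matrix `a` with |value| > tol.
    (Hypothetical helper, not part of LAPACK.)"""
    kl = ku = 0
    for i, row in enumerate(a):
        for j, value in enumerate(row):
            if abs(value) > tol:
                if i > j:
                    kl = max(kl, i - j)  # entry below the diagonal
                else:
                    ku = max(ku, j - i)  # entry on or above the diagonal
    return kl, ku


# A 4x4 tridiagonal matrix stored as a fully dense array.
tridiagonal = [
    [4.0, 1.0, 0.0, 0.0],
    [1.0, 4.0, 1.0, 0.0],
    [0.0, 1.0, 4.0, 1.0],
    [0.0, 0.0, 1.0, 4.0],
]
kl, ku = detect_bandwidth(tridiagonal)
first_nonzero = row_profile(tridiagonal)
```

For the tridiagonal input above, `detect_bandwidth` returns `(1, 1)`, which would indicate that a band format could hold the matrix. A scan of this kind reads each entry once, so its cost is O(n^2), while a dense LU or Cholesky factorization costs O(n^3); this is one plausible reason why detection overhead would shrink as n grows.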
Advisor: James Demmel
Paul R. Willems, Bruno Lang and Christof Voemel
EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-05-1376
2005
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2005/CSD-05-1376.pdf
We describe the design and implementation of a new algorithm for computing the singular value decomposition of a real bidiagonal matrix. This algorithm uses ideas developed by Grosser and Lang that extend Parlett's and Dhillon's MRRR algorithm for the tridiagonal symmetric eigenproblem. One key feature of our new implementation is that k singular triplets can be computed using only O(nk) storage units and floating-point operations, where n is the dimension of the matrix. The algorithm will be made available as routine xBDSCR in the upcoming release of the LAPACK library.