NAME
PSGESVD - compute the singular value decomposition (SVD) of an M-by-N matrix A,
optionally computing the left and/or right singular vectors
SYNOPSIS
- SUBROUTINE PSGESVD(
- JOBU, JOBVT, M, N, A, IA, JA, DESCA, S, U, IU, JU, DESCU,
VT, IVT, JVT, DESCVT, WORK, LWORK, INFO )
- CHARACTER
- JOBU, JOBVT
- INTEGER
- IA, INFO, IU, IVT, JA, JU, JVT, LWORK, M, N
- INTEGER
- DESCA( * ), DESCU( * ), DESCVT( * )
- REAL
- A( * ), S( * ), U( * ), VT( * ), WORK( * )
PURPOSE
PSGESVD computes the singular value decomposition (SVD) of an M-by-N matrix A,
optionally computing the left and/or right singular vectors. The SVD is
written as
A = U * SIGMA * transpose(V)
where SIGMA is an M-by-N matrix which is zero except for its min(M,N) diagonal
elements, U is an M-by-M orthogonal matrix, and V is an N-by-N orthogonal
matrix. The diagonal elements of SIGMA are the singular values of A and the
columns of U and V are the corresponding left and right singular vectors,
respectively. The singular values are returned in array S in decreasing order
and only the first min(M,N) columns of U and rows of VT = V**T are computed.
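As a small illustration of the factorization and of the shapes that are returned, the following Python sketch multiplies out U * diag(S) * VT for a hand-constructed 3-by-2 example. The matrix values, the helper name matmul, and the use of plain Python lists are assumptions made for illustration; nothing here calls PSGESVD or distributes any data.

```python
# Hand-built economy-size SVD of a 3-by-2 matrix (illustrative values only).
M, N = 3, 2
SIZE = min(M, N)

A = [[0.0, 2.0],
     [1.0, 0.0],
     [0.0, 0.0]]

U = [[1.0, 0.0],      # M-by-SIZE: the first SIZE left singular vectors (columns)
     [0.0, 1.0],
     [0.0, 0.0]]
S = [2.0, 1.0]        # SIZE singular values, in decreasing order
VT = [[0.0, 1.0],     # SIZE-by-N: the first SIZE right singular vectors (rows)
      [1.0, 0.0]]


def matmul(x, y):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


# Scale row k of VT by S[k], i.e. form diag(S) * VT, then apply U.
SVT = [[S[k] * VT[k][j] for j in range(N)] for k in range(SIZE)]
A_rebuilt = matmul(U, SVT)
```

Only SIZE = min(M,N) columns of U and rows of VT appear, which matches the storage described for the U and VT arguments.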
Notes
=====
Each global data object is described by an associated description vector. This
vector stores the information required to establish the mapping between an
object element and its corresponding process and memory location.
Let A be a generic term for any 2D block-cyclically distributed array. Such a
global array has an associated description vector DESCA. In the following
comments, the character _ should be read as "of the global array".
NOTATION        STORED IN       EXPLANATION
--------------- --------------- --------------------------------------
DTYPE_A(global) DESCA( DTYPE_ ) The descriptor type. In this case,
                                DTYPE_A = 1.
CTXT_A (global) DESCA( CTXT_ )  The BLACS context handle, indicating
                                the BLACS process grid A is distributed
                                over. The context itself is global, but
                                the handle (the integer value) may vary.
M_A    (global) DESCA( M_ )     The number of rows in the global
                                array A.
N_A    (global) DESCA( N_ )     The number of columns in the global
                                array A.
MB_A   (global) DESCA( MB_ )    The blocking factor used to distribute
                                the rows of the array.
NB_A   (global) DESCA( NB_ )    The blocking factor used to distribute
                                the columns of the array.
RSRC_A (global) DESCA( RSRC_ )  The process row over which the first
                                row of the array A is distributed.
CSRC_A (global) DESCA( CSRC_ )  The process column over which the
                                first column of the array A is
                                distributed.
LLD_A  (local)  DESCA( LLD_ )   The leading dimension of the local
                                array. LLD_A >= MAX(1,LOCr(M_A)).
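The descriptor layout in the table can be sketched in Python as a nine-entry list. This is an illustration only: the function name make_desc and the example sizes are hypothetical, the positions are written 0-based (the Fortran parameters DTYPE_ through LLD_ are the same positions counted from 1), and in a Fortran program the ScaLAPACK tool routine DESCINIT is normally used to fill the array instead.

```python
# Positions of the descriptor entries, 0-based for Python.
DTYPE_, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_ = range(9)
DLEN_ = 9


def make_desc(ctxt, m, n, mb, nb, rsrc, csrc, lld):
    """Return a DLEN_-entry descriptor for a dense block-cyclic matrix.

    Hypothetical helper for illustration; it performs only basic checks.
    """
    if m < 0 or n < 0:
        raise ValueError("M_ and N_ must be non-negative")
    if mb < 1 or nb < 1:
        raise ValueError("blocking factors MB_ and NB_ must be at least 1")
    if lld < 1:
        raise ValueError("LLD_ must be at least 1")
    desc = [0] * DLEN_
    desc[DTYPE_] = 1      # descriptor type, 1 in this case
    desc[CTXT_] = ctxt    # BLACS context handle
    desc[M_] = m          # global rows
    desc[N_] = n          # global columns
    desc[MB_] = mb        # row blocking factor
    desc[NB_] = nb        # column blocking factor
    desc[RSRC_] = rsrc    # process row holding the first global row
    desc[CSRC_] = csrc    # process column holding the first global column
    desc[LLD_] = lld      # local leading dimension, >= MAX(1, LOCr(M_))
    return desc


# Assumed example: a 1000-by-600 matrix in 64-by-64 blocks on a grid with
# 2 process rows, so no process holds more than 512 local rows.
desca = make_desc(ctxt=0, m=1000, n=600, mb=64, nb=64, rsrc=0, csrc=0, lld=512)
```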
Let K be the number of rows or columns of a distributed matrix, and assume that
its process grid has dimension p x q. LOCr( K ) denotes the number of elements
of K that a process would receive if K were distributed over the p processes
of its process column. Similarly, LOCc( K ) denotes the number of elements of
K that a process would receive if K were distributed over the q processes of
its process row. The values of LOCr() and LOCc() may be determined via a call
to the ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
An upper bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
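The counting rule behind LOCr() and LOCc() can be sketched in Python as follows. This is an illustrative re-implementation of the block-cyclic counting performed by NUMROC, not the ScaLAPACK routine itself; the helper name upper_bound and the example sizes are assumptions.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of an n-long dimension, split into blocks
    of nb, that land on process iproc when the first block is on isrcproc."""
    # Distance of this process from the one that owns the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                      # number of complete blocks
    count = (nblocks // nprocs) * nb       # every process gets at least this
    extrablks = nblocks % nprocs           # leftover complete blocks
    if mydist < extrablks:
        count += nb                        # one extra complete block
    elif mydist == extrablks:
        count += n % nb                    # the trailing partial block, if any
    return count


def upper_bound(n, nb, nprocs):
    """ceil( ceil(n/nb)/nprocs )*nb, written with integer arithmetic."""
    nblk = -(-n // nb)
    return -(-nblk // nprocs) * nb


# Assumed example: M = 10 rows in blocks of MB_A = 2 over NPROW = 2 process rows.
M, MB_A, RSRC_A, NPROW = 10, 2, 0, 2
locr = [numroc(M, MB_A, myrow, RSRC_A, NPROW) for myrow in range(NPROW)]
print(locr)                          # [6, 4]
print(upper_bound(M, MB_A, NPROW))   # 6
```

The local counts sum to M, and each is no larger than the upper bound given above.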
ARGUMENTS
MP = number of local rows in A and U
NQ = number of local columns in A and VT
SIZE = min( M, N )
SIZEQ = number of local columns in U
SIZEP = number of local rows in VT
- JOBU (global input) CHARACTER*1
- Specifies options for computing all or part of the matrix
U:
= 'V': the first SIZE columns of U (the left singular vectors) are returned
in the array U;
= 'N': no columns of U (no left singular vectors) are computed.
- JOBVT (global input) CHARACTER*1
- Specifies options for computing all or part of the matrix
V**T:
= 'V': the first SIZE rows of V**T (the right singular vectors) are returned
in the array VT;
= 'N': no rows of V**T (no right singular vectors) are computed.
- M (global input) INTEGER
- The number of rows of the input matrix A. M >= 0.
- N (global input) INTEGER
- The number of columns of the input matrix A. N >=
0.
- A (local input/workspace) block cyclic REAL array,
- global dimension (M, N), local dimension (MP, NQ). On exit,
the contents of A are destroyed.
- IA (global input) INTEGER
- The row index in the global array A indicating the first
row of sub( A ).
- JA (global input) INTEGER
- The column index in the global array A indicating the first
column of sub( A ).
- DESCA (global input) INTEGER array of dimension DLEN_
- The array descriptor for the distributed matrix A.
- S (global output) REAL array, dimension SIZE
- The singular values of A, sorted so that S(i) >=
S(i+1).
- U (local output) REAL array, local dimension
- (MP, SIZEQ), global dimension (M, SIZE). If JOBU = 'V', U
contains the first SIZE columns of U. If JOBU = 'N', U is not
referenced.
- IU (global input) INTEGER
- The row index in the global array U indicating the first
row of sub( U ).
- JU (global input) INTEGER
- The column index in the global array U indicating the first
column of sub( U ).
- DESCU (global input) INTEGER array of dimension DLEN_
- The array descriptor for the distributed matrix U.
- VT (local output) REAL array, local dimension
- (SIZEP, NQ), global dimension (SIZE, N). If JOBVT = 'V', VT
contains the first SIZE rows of V**T. If JOBVT = 'N', VT is not
referenced.
- IVT (global input) INTEGER
- The row index in the global array VT indicating the first
row of sub( VT ).
- JVT (global input) INTEGER
- The column index in the global array VT indicating the
first column of sub( VT ).
- DESCVT (global input) INTEGER array of dimension DLEN_
- The array descriptor for the distributed matrix VT.
- WORK (local workspace/output) REAL array, dimension
- (LWORK). On exit, if INFO = 0, WORK(1) returns the optimal
LWORK.
- LWORK (local input) INTEGER
- The dimension of the array WORK.
LWORK > 2 + 6*SIZEB + MAX(WATOBD, WBDTOSVD),
where SIZEB = MAX(M,N), and WATOBD and WBDTOSVD refer, respectively, to the
workspace required to bidiagonalize the matrix A and to go from the
bidiagonal matrix to the singular value decomposition U*S*VT.
For WATOBD, the following holds:
WATOBD = MAX(MAX(WPSLANGE,WPSGEBRD), MAX(WPSLARED2D,WPSLARED1D)),
where WPSLANGE, WPSLARED1D, WPSLARED2D, WPSGEBRD are the workspaces required
respectively for the subprograms PSLANGE, PSLARED1D, PSLARED2D, PSGEBRD.
Using the standard notation
MP = NUMROC( M, MB, MYROW, DESCA( RSRC_ ), NPROW),
NQ = NUMROC( N, NB, MYCOL, DESCA( CSRC_ ), NPCOL),
the workspaces required for the above subprograms are
WPSLANGE = MP,
WPSLARED1D = NQ0,
WPSLARED2D = MP0,
WPSGEBRD = NB*(MP + NQ + 1) + NQ,
where NQ0 and MP0 refer, respectively, to the values obtained at MYCOL = 0
and MYROW = 0. In general, the upper limit for the workspace is given by a
workspace required on processor (0,0):
WATOBD <= NB*(MP0 + NQ0 + 1) + NQ0.
In case of a homogeneous process grid this upper limit can be used as an
estimate of the minimum workspace for every processor.
For WBDTOSVD, the following holds:
WBDTOSVD = SIZE*(WANTU*NRU + WANTVT*NCVT) + MAX(WDBDSQR,
MAX(WANTU*WPSORMBRQLN, WANTVT*WPSORMBRPRT)),
where
WANTU(WANTVT) = 1, if left(right) singular vectors are wanted,
WANTU(WANTVT) = 0, otherwise,
and WDBDSQR, WPSORMBRQLN and WPSORMBRPRT refer respectively to the workspace
required for the subprograms SBDSQR, PSORMBR(QLN), and PSORMBR(PRT), where
QLN and PRT are the values of the arguments VECT, SIDE, and TRANS in the
call to PSORMBR. NRU is equal to the local number of rows of the matrix U
when distributed across a 1-dimensional "column" of processes.
Analogously, NCVT is equal to the local number of columns of the matrix VT
when distributed across a 1-dimensional "row" of processes.
Calling the LAPACK procedure SBDSQR requires
WDBDSQR = MAX(1, 2*SIZE + (2*SIZE - 4)*MAX(WANTU, WANTVT))
on every processor. Finally,
WPSORMBRQLN = MAX( (NB*(NB-1))/2, (SIZEQ+MP)*NB ) + NB*NB,
WPSORMBRPRT = MAX( (MB*(MB-1))/2, (SIZEP+NQ)*MB ) + MB*MB.
If LWORK = -1, then LWORK is global input and a workspace query is
assumed; the routine only calculates the minimum size for the work array.
The required workspace is returned as the first element of WORK and no
error message is issued by PXERBLA.
- INFO (output) INTEGER
- = 0: successful exit.
< 0: if INFO = -i, the i-th argument had an illegal value.
> 0: if SBDSQR did not converge. If INFO = MIN(M,N) + 1, then PSGESVD has
detected heterogeneity by finding that singular values were not identical
across the process grid. In this case, the accuracy of the results from
PSGESVD cannot be guaranteed.
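To make the LWORK bound and the INFO codes described in the argument list concrete, the following Python sketch evaluates the workspace formulas and turns an INFO value into a short message. The function names lwork_bound and describe_info are hypothetical, the local sizes (MP, NQ, MP0, NQ0, SIZEP, SIZEQ, NRU, NCVT) are passed in by the caller rather than derived from a process grid, and the example values assume a single process. In practice a workspace query with LWORK = -1 is the more reliable way to obtain the size.

```python
def lwork_bound(m, n, mb, nb, mp, nq, mp0, nq0, sizep, sizeq, nru, ncvt,
                wantu, wantvt):
    """Evaluate 2 + 6*SIZEB + MAX(WATOBD, WBDTOSVD) from the LWORK entry.

    wantu and wantvt are 1 if the left/right vectors are wanted, else 0.
    LWORK is documented as having to exceed the returned value.
    """
    size = min(m, n)
    sizeb = max(m, n)

    # Workspace for the reduction of A to bidiagonal form.
    wpslange = mp
    wpslared1d = nq0
    wpslared2d = mp0
    wpsgebrd = nb * (mp + nq + 1) + nq
    watobd = max(max(wpslange, wpsgebrd), max(wpslared2d, wpslared1d))

    # Workspace for going from the bidiagonal matrix to the SVD.
    wdbdsqr = max(1, 2 * size + (2 * size - 4) * max(wantu, wantvt))
    wpsormbrqln = max((nb * (nb - 1)) // 2, (sizeq + mp) * nb) + nb * nb
    wpsormbrprt = max((mb * (mb - 1)) // 2, (sizep + nq) * mb) + mb * mb
    wbdtosvd = (size * (wantu * nru + wantvt * ncvt)
                + max(wdbdsqr,
                      max(wantu * wpsormbrqln, wantvt * wpsormbrprt)))

    return 2 + 6 * sizeb + max(watobd, wbdtosvd)


def describe_info(info, m, n):
    """Map an INFO value to a short message, following the INFO entry."""
    if info == 0:
        return "successful exit"
    if info < 0:
        return f"argument {-info} had an illegal value"
    if info == min(m, n) + 1:
        return "heterogeneity detected across the process grid"
    return "SBDSQR did not converge"


# Assumed example: a 4-by-4 matrix in 2-by-2 blocks on a 1-by-1 process grid,
# so every local size equals the global size.
bound = lwork_bound(m=4, n=4, mb=2, nb=2, mp=4, nq=4, mp0=4, nq0=4,
                    sizep=4, sizeq=4, nru=4, ncvt=4, wantu=1, wantvt=1)
print(bound)                      # 78
print(describe_info(-5, 4, 4))    # argument 5 had an illegal value
```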