.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "MatrixOps 3pm"
.TH MatrixOps 3pm 2024-05-17 "perl v5.38.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
PDL::CCS::MatrixOps \- Low\-level matrix operations for compressed storage sparse PDLs
.SH SYNOPSIS
.IX Header "SYNOPSIS"
.Vb 2
\& use PDL;
\& use PDL::CCS::MatrixOps;
\&
\& ##\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\& ## ... stuff happens
.Ve
.SH FUNCTIONS
.IX Header "FUNCTIONS"
.SS ccs_matmult2d_sdd
.IX Subsection "ccs_matmult2d_sdd"
.Vb 6
\& Signature: (
\&   indx ixa(NdimsA,NnzA); nza(NnzA); missinga();
\&   b(O,M);
\&   zc(O);
\&   [o]c(O,N)
\&   )
.Ve
.PP
Two-dimensional matrix multiplication of a sparse index-encoded PDL
$a() with a dense pdl $b(), with output to a dense pdl $c().
.PP
The sparse input PDL $a() should be passed here with 0th
dimension "M" and 1st dimension "N", just as for the
built-in \fBPDL::Primitive::matmult()\fR.
.PP
"Missing" values in $a() are treated as $\fBmissinga()\fR, which shouldn't
be BAD or infinite, but otherwise ought to be handled correctly.
The input pdl $\fBzc()\fR is used to pass the cached contribution of
a $\fBmissinga()\fR\-row ("M") to an output column ("O"), i.e.
.PP
.Vb 1
\& $zc = ((zeroes($M,1)+$missinga) x $b)\->flat;
.Ve
.PP
\&\f(CW$SIZE(Ndimsa)\fR is assumed to be 2.
.PP
ccs_matmult2d_sdd does not process bad values.
It will set the bad-value flag of all output ndarrays if the flag is set for any of the input ndarrays.
.SS ccs_matmult2d_zdd
.IX Subsection "ccs_matmult2d_zdd"
.Vb 5
\& Signature: (
\&   indx ixa(Ndimsa,NnzA); nza(NnzA);
\&   b(O,M);
\&   [o]c(O,N)
\&   )
.Ve
.PP
Two-dimensional matrix multiplication of a sparse index-encoded PDL
$a() with a dense pdl $b(), with output to a dense pdl $c().
.PP
The sparse input PDL $a() should be passed here with 0th
dimension "M" and 1st dimension "N", just as for the
built-in \fBPDL::Primitive::matmult()\fR.
.PP
"Missing" values in $a() are treated as zero.
\&\f(CW$SIZE(Ndimsa)\fR is assumed to be 2.
.PP
ccs_matmult2d_zdd does not process bad values.
It will set the bad-value flag of all output ndarrays if the flag is set for any of the input ndarrays.
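.PP
A minimal usage sketch for \fBccs_matmult2d_zdd()\fR, assuming that the index pdl
can be built with \fBwhichND()\fR from a small dense matrix and that the caller
pre-allocates the output pdl (the size "N" appears only in the output signature).
The names \f(CW$a_dense\fR, \f(CW$b_dense\fR and \f(CW$c\fR are illustrative and not part of the module:
.PP
.Vb 17
\& use PDL;
\& use PDL::CCS::MatrixOps;
\&
\& ## illustrative data: $a_dense has dims (M=3,N=2), $b_dense has dims (O=4,M=3)
\& my $a_dense = pdl([[1,0,2],[0,3,0]]);
\& my $b_dense = sequence(4,3);
\&
\& ## index\-encode $a_dense: $ixa is (2,NnzA), $nza is (NnzA)
\& my $ixa = whichND($a_dense);
\& my $nza = $a_dense\->indexND($ixa);
\&
\& ## assumption: the caller pre\-allocates the (O,N) output pdl
\& my $c = zeroes($b_dense\->dim(0), $a_dense\->dim(1));
\& ccs_matmult2d_zdd($ixa, $nza, $b_dense, $c);
\&
\& ## the sparse result should agree with the dense product
\& print "ok\en" if all(approx($c, $a_dense x $b_dense));
.Ve
.PP
For a non-zero "missing" value, ccs_matmult2d_sdd additionally expects
the $\fBmissinga()\fR and $\fBzc()\fR arguments described above.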
.SS ccs_vnorm
.IX Subsection "ccs_vnorm"
.Vb 4
\& Signature: (
\&   indx acols(NnzA); avals(NnzA);
\&   float+ [o]vnorm(M);
\&   ; int sizeM=>M)
.Ve
.PP
Computes the Euclidean length of each column-vector $a(i,*) of a sparse index-encoded pdl $a()
of logical dimensions (M,N), with output to a dense piddle $\fBvnorm()\fR.
"Missing" values in $a() are treated as zero,
and $\fBacols()\fR specifies the (unsorted) indices along the logical dimension M of the corresponding
non-missing values in $\fBavals()\fR.
This is basically the same thing as:
.PP
.Vb 1
\& $vnorm = ($a**2)\->xchg(0,1)\->sumover\->sqrt;
.Ve
.PP
\&... but should be much faster to compute for sparse index-encoded piddles.
.PP
\&\fBccs_vnorm()\fR always clears the bad-status flag on $\fBvnorm()\fR.
.SS ccs_vcos_zdd
.IX Subsection "ccs_vcos_zdd"
.Vb 7
\& Signature: (
\&   indx ixa(2,NnzA); nza(NnzA);
\&   b(N);
\&   float+ [o]vcos(M);
\&   float+ [t]anorm(M);
\&   int sizeM=>M;
\& )
.Ve
.PP
Computes the vector cosine similarity of a dense row-vector $b(N) with respect to each column $a(i,*)
of a sparse index-encoded PDL $a() of logical dimensions (M,N), with output to a dense piddle
\&\f(CW$vcos(M)\fR.
"Missing" values in $a() are treated as zero,
and magnitudes for $a() are passed in the optional parameter $\fBanorm()\fR, which will be implicitly
computed using ccs_vnorm if the $\fBanorm()\fR parameter is omitted or empty.
This is basically the same thing as:
.PP
.Vb 2
\& $anorm //= ($a**2)\->xchg(0,1)\->sumover\->sqrt;
\& $vcos = ($a * $b\->slice("*1,"))\->xchg(0,1)\->sumover / ($anorm * ($b**2)\->sumover\->sqrt);
.Ve
.PP
\&... but should be much faster to compute.
.PP
Output values in $\fBvcos()\fR are cosine similarities in the range [\-1,1],
except for zero-magnitude vectors which will result in NaN values in $\fBvcos()\fR.
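.PP
As a hedged illustration, the following sketch compares \fBccs_vcos_zdd()\fR with the
dense formula above on a small matrix that has no zero-magnitude columns.
It assumes the calling convention used elsewhere in this section (inputs followed by the size M);
the names \f(CW$a_dense\fR, \f(CW$b_vec\fR and \f(CW$dense\fR are illustrative only:
.PP
.Vb 18
\& use PDL;
\& use PDL::CCS::MatrixOps;
\&
\& ## illustrative data: $a_dense has dims (M=3,N=2), $b_vec has dims (N=2)
\& my $a_dense = pdl([[1,0,2],[0,3,0]]);
\& my $b_vec   = pdl(1,0);
\& my $M       = $a_dense\->dim(0);
\&
\& ## index\-encode $a_dense, then compute cosines from the sparse encoding
\& my $ixa  = whichND($a_dense);
\& my $nza  = $a_dense\->indexND($ixa);
\& my $vcos = ccs_vcos_zdd($ixa, $nza, $b_vec, $M);
\&
\& ## dense reference, following the formula above
\& my $anorm = ($a_dense**2)\->xchg(0,1)\->sumover\->sqrt;
\& my $dense = ($a_dense * $b_vec\->slice("*1,"))\->xchg(0,1)\->sumover
\&             / ($anorm * ($b_vec**2)\->sumover\->sqrt);
\& print "ok\en" if all(approx($vcos, $dense, 1e\-6));
.Ve
.PP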
If you need non-negative distances, follow this up with:
.PP
.Vb 2
\& $vcos\->minus(1,$vcos,1);
\& $vcos\->inplace\->setnantobad\->inplace\->setbadtoval(0); ##\-\- minimum distance for NaN values
.Ve
.PP
to get distance values in the range [0,2]. You can use PDL threading to batch-compute distances for
multiple $b() vectors simultaneously:
.PP
.Vb 2
\& $bx = random($N, $NB); ##\-\- get $NB random vectors of size $N
\& $vcos = ccs_vcos_zdd($ixa,$nza, $bx, $M); ##\-\- $vcos is now ($M,$NB)
.Ve
.PP
\&\fBccs_vcos_zdd()\fR always clears the bad status flag on the output piddle \f(CW$vcos\fR.
.SS _ccs_vcos_zdd
.IX Subsection "_ccs_vcos_zdd"
.Vb 5
\& Signature: (
\&   indx ixa(Two,NnzA); nza(NnzA);
\&   b(N);
\&   float+ anorm(M);
\&   float+ [o]vcos(M);)
.Ve
.PP
Guts for \fBccs_vcos_zdd()\fR, with slightly different calling conventions.
.PP
Always clears the bad status flag on the output piddle \f(CW$vcos\fR.
.SS ccs_vcos_pzd
.IX Subsection "ccs_vcos_pzd"
.Vb 5
\& Signature: (
\&   indx aptr(Nplus1); indx acols(NnzA); avals(NnzA);
\&   indx brows(NnzB); bvals(NnzB);
\&   anorm(M);
\&   float+ [o]vcos(M);)
.Ve
.PP
Computes the vector cosine similarity of a sparse index-encoded row-vector $b() of logical dimension (N)
with respect to each column $a(i,*) of a sparse Harwell-Boeing row-encoded PDL $a() of logical dimensions (M,N),
with output to a dense piddle \f(CW$vcos(M)\fR.
"Missing" values in $a() are treated as zero,
and magnitudes for $a() are passed in the obligatory parameter $\fBanorm()\fR.
Usually much faster than \fBccs_vcos_zdd()\fR if a CRS pointer over logical dimension (N) is available
for $a().
.PP
\&\fBccs_vcos_pzd()\fR always clears the bad status flag on the output piddle \f(CW$vcos\fR.
.SH ACKNOWLEDGEMENTS
.IX Header "ACKNOWLEDGEMENTS"
Perl by Larry Wall.
.PP
PDL by Karl Glazebrook, Tuomas J. Lukka, Christian Soeller, and others.
.SH "KNOWN BUGS"
.IX Header "KNOWN BUGS"
We should really implement matrix multiplication in terms of
inner product, and have a good sparse-matrix only implementation
of the former.
.SH AUTHOR
.IX Header "AUTHOR"
Bryan Jurish
.SS "Copyright Policy"
.IX Subsection "Copyright Policy"
All other parts Copyright (C) 2009\-2024, Bryan Jurish. All rights reserved.
.PP
This package is free software, and entirely without warranty.
You may redistribute it and/or modify it under the same terms
as Perl itself.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
\&\fBperl\fR\|(1), \fBPDL\fR\|(3perl)