User Commands

NAME

mlpack_nmf - non-negative matrix factorization

SYNOPSIS

 mlpack_nmf -i string -r int [-q string] [-p string] [-m int] [-e double] [-s int] [-u string] [-V bool] [-H string] [-W string] [-h -v]

DESCRIPTION

This program performs non-negative matrix factorization on the given dataset, storing the resulting decomposed matrices in the specified files. For an input dataset V, NMF decomposes V into two matrices W and H such that

V = W * H

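where all elements in W and H are non-negative. If V is of size (n x m), then W will be of size (n x r) and H will be of size (r x m), where r is the rank of the factorization (specified by the '--rank (-r)' parameter). In general the product W * H only approximates V; the update rules iteratively reduce the difference between the two.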
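The shape relationship above can be checked with a small sketch. It uses plain Python lists rather than mlpack, and the matrix values and the helper names 'matmul' and 'shape' are made up for illustration only.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def shape(mat):
    """Return (rows, columns) of a list-of-rows matrix."""
    return (len(mat), len(mat[0]))


# Illustrative sizes: n = 4, m = 3, rank r = 2.
W = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.5, 0.0]]  # n x r, non-negative
H = [[2.0, 0.0, 1.0], [0.0, 3.0, 1.0]]                # r x m, non-negative
V = matmul(W, H)                                      # n x m

print(shape(W), shape(H), shape(V))  # (4, 2) (2, 3) (4, 3)
```

Here V is built from W and H, so the factorization is exact; for an arbitrary input dataset the rank-r product is usually only an approximation.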

Optionally, the desired update rules for each NMF iteration can be chosen from the following list:

multdist: multiplicative distance-based update rules (Lee and Seung 1999)
multdiv: multiplicative divergence-based update rules (Lee and Seung 1999)
als: alternating least squares update rules (Paatero and Tapper 1994)
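As an illustration of the first rule in the list, the following is a minimal sketch of the multiplicative distance-based updates described by Lee and Seung, written in plain Python. It is not mlpack's implementation: the order of the two updates, the small constant EPS added to the denominators, the random test data, and all function names are assumptions of this sketch.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


EPS = 1e-12  # assumption: guards against division by zero


def scale(x, num, den):
    """Elementwise x * num / den."""
    return [[x[i][j] * num[i][j] / (den[i][j] + EPS)
             for j in range(len(x[0]))]
            for i in range(len(x))]


def multdist_step(V, W, H):
    """One pass of the multiplicative updates for the squared distance."""
    Ht = transpose(H)
    W = scale(W, matmul(V, Ht), matmul(W, matmul(H, Ht)))   # W update
    Wt = transpose(W)
    H = scale(H, matmul(Wt, V), matmul(matmul(Wt, W), H))   # H update
    return W, H


def frobenius_error(V, W, H):
    """Frobenius norm of V - W * H."""
    WH = matmul(W, H)
    return sum((V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0]))) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n, m, r = 5, 4, 2
V = [[rng.random() for _ in range(m)] for _ in range(n)]
W = [[rng.random() for _ in range(r)] for _ in range(n)]
H = [[rng.random() for _ in range(m)] for _ in range(r)]

errors = [frobenius_error(V, W, H)]
for _ in range(50):
    W, H = multdist_step(V, W, H)
    errors.append(frobenius_error(V, W, H))

print("error before: %.4f, after: %.4f" % (errors[0], errors[-1]))
```

Because each update only multiplies by a ratio of non-negative quantities, W and H stay non-negative as long as they start that way.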

The maximum number of iterations is specified with '--max_iterations (-m)', and the minimum residue required for algorithm termination is specified with the '--min_residue (-e)' parameter.
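The interplay of the two stopping parameters can be sketched as a generic loop. This is only an illustration of the control flow: the function 'iterate', the toy 'step' and 'residue' callables, and the halving example are hypothetical, and the way mlpack actually computes its residue is not reproduced here.

```python
def iterate(step, residue, state, max_iterations=10000, min_residue=1e-5):
    """Apply `step` until the residue drops below `min_residue` or the
    iteration limit is reached. A limit of 0 means no limit."""
    iterations = 0
    while max_iterations == 0 or iterations < max_iterations:
        state = step(state)
        iterations += 1
        if residue(state) < min_residue:
            break
    return state, iterations


# Toy problem standing in for an NMF iteration: halve a number each step
# and treat its absolute value as the residue.
halve = lambda x: x / 2.0

# No iteration limit: the loop stops once the residue is below 1e-5.
_, converged_after = iterate(halve, abs, 1.0, max_iterations=0)

# A limit of 5 iterations is hit before the residue threshold is reached.
capped_state, capped_after = iterate(halve, abs, 1.0, max_iterations=5)

print(converged_after, capped_after, capped_state)  # 17 5 0.03125
```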

For example, to run NMF on the input matrix 'V.csv' using the 'multdist' update rules with a rank-10 decomposition, and to store the decomposed matrices in 'W.csv' and 'H.csv', the following command could be used:

$ mlpack_nmf --input_file V.csv --w_file W.csv --h_file H.csv --rank 10 --update_rules multdist
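When the program is driven from a script, the same invocation can be assembled as an argument list. The sketch below uses only option names documented on this page; the helper 'build_nmf_command' is hypothetical and the file names are the ones from the example. It only builds and prints the command and does not run it.

```python
VALID_RULES = ("multdist", "multdiv", "als")


def build_nmf_command(input_file, rank, w_file, h_file,
                      update_rules="multdist",
                      max_iterations=None, min_residue=None, seed=None):
    """Build an argument list for mlpack_nmf from documented options."""
    if update_rules not in VALID_RULES:
        raise ValueError("unknown update rules: %r" % update_rules)
    cmd = ["mlpack_nmf",
           "--input_file", input_file,
           "--rank", str(rank),
           "--w_file", w_file,
           "--h_file", h_file,
           "--update_rules", update_rules]
    # Optional parameters are passed only when explicitly requested, so the
    # program's own defaults apply otherwise.
    if max_iterations is not None:
        cmd += ["--max_iterations", str(max_iterations)]
    if min_residue is not None:
        cmd += ["--min_residue", str(min_residue)]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


cmd = build_nmf_command("V.csv", 10, "W.csv", "H.csv")
print(" ".join(cmd))
# To execute it, the list could be passed to subprocess.run(cmd, check=True),
# assuming mlpack_nmf is installed and V.csv exists.
```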

REQUIRED INPUT OPTIONS

--input_file (-i) [string]: Input dataset to perform NMF on.
--rank (-r) [int]: Rank of the factorization.

OPTIONAL INPUT OPTIONS

--help (-h) [bool]: Default help info.
--info [string]: Get help on a specific module or option. Default value ''.
--initial_h_file (-q) [string]: Initial H matrix. Default value ''.
--initial_w_file (-p) [string]: Initial W matrix. Default value ''.
--max_iterations (-m) [int]: Number of iterations before NMF terminates (0 runs until convergence). Default value 10000.
--min_residue (-e) [double]: The minimum root mean square residue allowed for each iteration, below which the program terminates. Default value 1e-05.
--seed (-s) [int]: Random seed. If 0, 'std::time(NULL)' is used. Default value 0.
--update_rules (-u) [string]: Update rules for each iteration; ( multdist | multdiv | als ). Default value 'multdist'.
--verbose (-v) [bool]: Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V) [bool]: Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

--h_file (-H) [string]: Matrix to save the calculated H to. Default value ''.
--w_file (-W) [string]: Matrix to save the calculated W to. Default value ''.

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.

18 November 2018

mlpack-3.0.4
