mlpack_preprocess_binarize(26 December 2016)
NAME
mlpack_preprocess_binarize - binarize data
SYNOPSIS
mlpack_preprocess_binarize [-h] [-v]
DESCRIPTION
This utility takes a dataset and binarizes its variables to either 0 or 1 according to a given threshold. Binarization can be applied to a single dimension or to the whole dataset. A dimension can be specified with the --dimension (-d) option. The threshold can be specified with the --threshold (-t) option; the default is 0.0. The program does not modify the original file; instead, it saves the binarized data to a separate file, whose name you must specify with --output_file (-o).
For example, to set every variable in dataset.csv that is greater than 5 to 1 and every variable that is less than or equal to 5 to 0, and save the result to result.csv, we could run
$ mlpack_preprocess_binarize -i dataset.csv -t 5 -o result.csv
To apply this to only the first dimension (dimension 0) of the dataset, we could run
$ mlpack_preprocess_binarize -i dataset.csv -t 5 -d 0 -o result.csv
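The thresholding rule described above can be sketched in a few lines of Python. This is an illustration of the semantics only, not mlpack's implementation. The function name binarize and its parameters are hypothetical. The sketch assumes that each inner list is one data point and that a dimension is an index within a point. The use of 1.0 and 0.0 as output values is also an assumption.

```python
def binarize(rows, threshold=0.0, dimension=None):
    """Return a new dataset; the input is left unmodified.

    Values strictly greater than `threshold` become 1.0 and all other
    values become 0.0. If `dimension` is None, every dimension is
    binarized; otherwise only that dimension is, and the remaining
    values are copied unchanged.
    """
    result = []
    for row in rows:
        new_row = []
        for d, value in enumerate(row):
            if dimension is None or d == dimension:
                new_row.append(1.0 if value > threshold else 0.0)
            else:
                new_row.append(value)
        result.append(new_row)
    return result


data = [[6.0, 2.0], [5.0, 7.0]]

# Whole dataset, threshold 5 (mirrors the first command above).
print(binarize(data, threshold=5.0))               # [[1.0, 0.0], [0.0, 1.0]]

# Only dimension 0, threshold 5 (mirrors the second command above).
print(binarize(data, threshold=5.0, dimension=0))  # [[1.0, 2.0], [0.0, 7.0]]
```

Note that a value equal to the threshold (5.0 here) maps to 0, matching the "less than or equal to" wording in the description.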
REQUIRED INPUT OPTIONS
- --input_file (-i) [string]
- File containing data.
OPTIONAL INPUT OPTIONS
- --dimension (-d) [int]
- Dimension to apply the binarization to. If not set, the program binarizes every dimension. Default value 0.
- --help (-h)
- Default help info.
- --info [string]
- Get help on a specific module or option. Default value ''.
- --threshold (-t) [double]
- Threshold to apply for binarization. Default value 0.0.
- --verbose (-v)
- Display informational messages and the full list of parameters and timers at the end of execution.
- --version (-V)
- Display the version of mlpack.
OPTIONAL OUTPUT OPTIONS
- --output_file (-o) [string]
- File to save the output. Default value ''.