'\" -*- coding: us-ascii -*- .if \n(.g .ds T< \\FC .if \n(.g .ds T> \\F[\n[.fam]] .de URL \\$2 \(la\\$1\(ra\\$3 .. .if \n(.g .mso www.tmac .TH pkfssvm 1 "14 June 2016" "" "" .SH NAME pkfssvm \- feature selection for nn classifier .SH SYNOPSIS 'nh .fi .ad l \fBpkfssvm\fR \kx .if (\nx>(\n(.l/2)) .nr x (\n(.l/5) 'in \n(.iu+\nxu \fB-t\fR \fItraining\fR \fB-n\fR \fInumber\fR [\fIoptions\fR] [\fIadvanced options\fR] 'in \n(.iu-\nxu .ad b 'hy .SH DESCRIPTION Classification problems dealing with high dimensional input data can be challenging due to the Hughes phenomenon. Hyperspectral data, for instance, can have hundreds of spectral bands and require special attention when being classified. In particular when limited training data are available, the classification of such data can be problematic without reducing the dimension. .PP The SVM classifier has been shown to be more robust to this type of problem than others. Nevertheless, classification accuracy can often be improved with feature selection methods. The utility pkfssvm implements a number of feature selection techniques, among which a sequential floating forward search (SFFS). .SH OPTIONS .TP \*(T<\fB\-t\fR\*(T> \fIfilename\fR, \*(T<\fB\-\-training\fR\*(T> \fIfilename\fR training vector file. A single vector file contains all training features (must be set as: B0, B1, B2,...) for all classes (class numbers identified by label option). Use multiple training files for bootstrap aggregation (alternative to the bag and bsize options, where a random subset is taken from a single training file) .TP \*(T<\fB\-n\fR\*(T> \fInumber\fR, \*(T<\fB\-\-nf\fR\*(T> \fInumber\fR number of features to select (0 to select optimal number, see also \*(T<\fB\-\-ecost\fR\*(T> option) .TP \*(T<\fB\-i\fR\*(T> \fIfilename\fR, \*(T<\fB\-\-input\fR\*(T> \fIfilename\fR input test set (leave empty to perform a cross validation based on training only) .TP \*(T<\fB\-v\fR\*(T> \fIlevel\fR, \*(T<\fB\-\-verbose\fR\*(T> \fIlevel\fR set to: 0 (results only), 1 (confusion matrix), 2 (debug) .PP Advanced options .TP \*(T<\fB\-tln\fR\*(T> \fIlayer\fR, \*(T<\fB\-\-tln\fR\*(T> \fIlayer\fR training layer name(s) .TP \*(T<\fB\-label\fR\*(T> \fIattribute\fR, \*(T<\fB\-\-label\fR\*(T> \fIattribute\fR identifier for class label in training vector file. 
(default: label)
.TP 
\*(T<\fB\-bal\fR\*(T> \fIsize\fR, \*(T<\fB\-\-balance\fR\*(T> \fIsize\fR
balance the input data to this number of samples for each class (default: 0)
.TP 
\*(T<\fB\-random\fR\*(T>, \*(T<\fB\-\-random\fR\*(T>
randomize the input data when balancing
.TP 
\*(T<\fB\-min\fR\*(T> \fInumber\fR, \*(T<\fB\-\-min\fR\*(T> \fInumber\fR
if the number of training pixels is less than min, do not take this class into account
.TP 
\*(T<\fB\-b\fR\*(T> \fIband\fR, \*(T<\fB\-\-band\fR\*(T> \fIband\fR
band index (starting from 0); either use the band option or use startband to endband
.TP 
\*(T<\fB\-sband\fR\*(T> \fIband\fR, \*(T<\fB\-\-startband\fR\*(T> \fIband\fR
start band sequence number
.TP 
\*(T<\fB\-eband\fR\*(T> \fIband\fR, \*(T<\fB\-\-endband\fR\*(T> \fIband\fR
end band sequence number
.TP 
\*(T<\fB\-offset\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-offset\fR\*(T> \fIvalue\fR
offset value applied to each spectral band of the input features
.TP 
\*(T<\fB\-scale\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-scale\fR\*(T> \fIvalue\fR
scale value applied to each spectral band of the input features (use \*(T<0\*(T> to scale the minimum and maximum in each band to \*(T<\-1.0\*(T> and \*(T<1.0\*(T>)
.TP 
\*(T<\fB\-svmt\fR\*(T> \fItype\fR, \*(T<\fB\-\-svmtype\fR\*(T> \fItype\fR
type of SVM (C_SVC, nu_SVC, one_class, epsilon_SVR, nu_SVR)
.TP 
\*(T<\fB\-kt\fR\*(T> \fItype\fR, \*(T<\fB\-\-kerneltype\fR\*(T> \fItype\fR
type of kernel function (linear, polynomial, radial, sigmoid)
.TP 
\*(T<\fB\-kd\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-kd\fR\*(T> \fIvalue\fR
degree in kernel function
.TP 
\*(T<\fB\-g\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-gamma\fR\*(T> \fIvalue\fR
gamma in kernel function
.TP 
\*(T<\fB\-c0\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-coef0\fR\*(T> \fIvalue\fR
coef0 in kernel function
.TP 
\*(T<\fB\-cc\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-ccost\fR\*(T> \fIvalue\fR
the parameter C of C-SVC, epsilon-SVR, and nu-SVR
.TP 
\*(T<\fB\-nu\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-nu\fR\*(T> \fIvalue\fR
the parameter nu of nu-SVC, one-class SVM, and nu-SVR
.TP 
\*(T<\fB\-eloss\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-eloss\fR\*(T> \fIvalue\fR
the epsilon in the loss function of epsilon-SVR
.TP 
\*(T<\fB\-cache\fR\*(T> \fInumber\fR, \*(T<\fB\-\-cache\fR\*(T> \fInumber\fR
cache memory size in MB (default: 100)
.TP 
\*(T<\fB\-etol\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-etol\fR\*(T> \fIvalue\fR
the tolerance of the termination criterion (default: 0.001)
.TP 
\*(T<\fB\-shrink\fR\*(T>, \*(T<\fB\-\-shrink\fR\*(T>
whether to use the shrinking heuristics
.TP 
\*(T<\fB\-sm\fR\*(T> \fImethod\fR, \*(T<\fB\-\-sm\fR\*(T> \fImethod\fR
feature selection method (sffs=sequential floating forward search, sfs=sequential forward search, sbs=sequential backward search, bfs=brute force search)
.TP 
\*(T<\fB\-ecost\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-ecost\fR\*(T> \fIvalue\fR
epsilon for the stopping criterion in the cost function to determine the optimal number of features
.TP 
\*(T<\fB\-cv\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-cv\fR\*(T> \fIvalue\fR
n-fold cross validation mode (default: 0)
.TP 
\*(T<\fB\-c\fR\*(T> \fIname\fR, \*(T<\fB\-\-class\fR\*(T> \fIname\fR
list of class names.
.TP 
\*(T<\fB\-r\fR\*(T> \fIvalue\fR, \*(T<\fB\-\-reclass\fR\*(T> \fIvalue\fR
list of class values (use the same order as in the \*(T<\fB\-\-class\fR\*(T> option).
.SH "SEE ALSO"
\fBpkfsann\fR(1)
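.SH EXAMPLE
The invocation below is only an illustrative sketch built from the options documented above; the training file name \*(T<training.sqlite\*(T> and the chosen parameter values are placeholders, not defaults shipped with the tool.
.PP
.nf
\*(T<pkfssvm \-t training.sqlite \-n 5 \-sm sffs \-kt radial \-cv 2 \-v 1\*(T>
.fi
.PP
This asks for five features to be selected with the sequential floating forward search, evaluating each candidate subset with a 2\-fold cross validation of an SVM with a radial kernel; verbose level 1 additionally reports the confusion matrix.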