pksvm(1) pksvm(1)

NAME

pksvm - classify raster image using Support Vector Machine

SYNOPSIS


pksvm
-t training [-i input] [-o output] [-cv value] [options] [advanced options]

DESCRIPTION

pksvm implements a support vector machine (SVM) to solve a supervised classification problem. The implementation is based on the open source C++ library libSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm). Both raster and vector files are supported as input. The output contains the classification result in raster or vector format, matching the format of the input. A training sample must be provided as an OGR vector dataset that contains the class labels and the features for each training point. The point locations are not considered in the training step. You can use the same training sample to classify different images, provided the images all have the same number of bands. Use the utility pkextract to create a suitable training sample, based on a sample of points or polygons. For raster output maps you can attach a color table using the option -ct.

OPTIONS

Training vector file. A single vector file contains all training features (stored in fields named b0, b1, b2, ...) for all classes (class numbers are identified by the label option). Use multiple training files for bootstrap aggregation (an alternative to the --bag and --bagsize options, where a random subset is taken from a single training file).
Input image
Output classification image
N-fold cross validation mode (default: 0)
Training layer name(s)
List of class names.
List of class values (use same order as in --class option).
Output image format (see also gdal_translate(1)).
Output ogr format for active training sample
Creation option for output file. Multiple options can be specified.
Color table in ASCII format having 5 columns: id R G B ALPHA (0: transparent, 255: solid)
Identifier for class label in training vector file. (default: label)
Prior probabilities for each class (e.g., -prior 0.3 -prior 0.3 -prior 0.2). Used for input only (ignored for cross validation).
Gamma in kernel function
The parameter C of C_SVC, epsilon_SVR, and nu_SVR
Only classify within specified mask (vector or raster). For raster mask, set nodata values with the option --msknodata.
Mask value(s) not to consider for classification. Values will be taken over in classification image.
Nodata value to put where image is masked as nodata
Verbose level

Advanced options

Band index (starting from 0, either use --band option or use --startband to --endband)
Start band sequence number
End band sequence number
Balance the input data to this number of samples for each class
If the number of training pixels is less than min, do not take this class into account (0: consider all classes)
Number of bootstrap aggregations (default is no bagging: 1)
Percentage of features used from available training features for each bootstrap aggregation (one size for all classes, or a different size for each class respectively)
How to combine bootstrap aggregation classifiers (0: sum rule, 1: product rule, 2: max rule). Also used to aggregate classes with rc option.
Output for each individual bootstrap aggregation
Probability image.
Offset value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band]
Scale value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band] (use 0 to scale the minimum and maximum in each band to -1.0 and 1.0)
Type of SVM (C_SVC, nu_SVC, one_class, epsilon_SVR, nu_SVR)
Type of kernel function (linear,polynomial,radial,sigmoid)
Degree in kernel function
Coef0 in kernel function
The parameter nu of nu-SVC, one-class SVM, and nu-SVR
The epsilon in loss function of epsilon-SVR
Cache memory size in MB (default: 100)
The tolerance of termination criterion (default: 0.001)
Whether to use the shrinking heuristics
Number of active training points
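
The offset and scale options above describe a per-band linear transform of the input features. A minimal Python sketch of that formula follows. The function names scale_features and minmax_params are hypothetical, and the handling of a scale of 0 (mapping each band's minimum and maximum to -1.0 and 1.0) is this sketch's reading of the option text, not a copy of the pksvm source.

```python
def scale_features(dn, offset, scale):
    """Apply refl[band] = (DN[band] - offset[band]) / scale[band] to one pixel.

    dn, offset and scale are equal-length sequences, one entry per band.
    """
    if not (len(dn) == len(offset) == len(scale)):
        raise ValueError("dn, offset and scale must have one entry per band")
    return [(d - o) / s for d, o, s in zip(dn, offset, scale)]


def minmax_params(band_min, band_max):
    """Return (offset, scale) that map band_min to -1.0 and band_max to 1.0.

    Assumption: this mirrors what a scale value of 0 requests.
    """
    offset = (band_max + band_min) / 2.0
    scale = (band_max - band_min) / 2.0
    return offset, scale
```

With a band whose digital numbers range from 0 to 200, minmax_params(0, 200) gives an offset and scale of 100, so a DN of 100 maps to 0.0.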

EXAMPLE

Classify input image input.tif with a support vector machine. The training sample is provided as an OGR vector dataset that contains all features (same dimensionality as input.tif) in its fields (see pkextract(1) on how to obtain such a file from a "clean" vector file containing locations only). A two-fold cross validation (cv) is performed (output on screen). The parameters cost and gamma of the support vector machine are set to 1000 and 0.1 respectively. A colourtable (a five column text file: image value, RED, GREEN, BLUE, ALPHA) has also been provided.

pksvm -i input.tif -t training.sqlite -o output.tif -cv 2 -ct colourtable.txt -cc 1000 -g 0.1
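
The colour table passed with -ct is a plain text file with five columns. The Python sketch below writes such a file. The class values and colours are made up for illustration, and separating the columns with single spaces is an assumption, since the description only states the column order.

```python
def write_colour_table(path, entries):
    """Write one 'id R G B ALPHA' row per class to a text file."""
    with open(path, "w") as f:
        for class_id, (r, g, b, alpha) in sorted(entries.items()):
            for channel in (r, g, b, alpha):
                if not 0 <= channel <= 255:
                    raise ValueError("channel values must be in 0..255")
            f.write(f"{class_id} {r} {g} {b} {alpha}\n")


# Hypothetical classes: 0 is fully transparent, 1 and 2 are solid colours.
example = {0: (0, 0, 0, 0), 1: (0, 128, 0, 255), 2: (0, 0, 255, 255)}
write_colour_table("colourtable.txt", example)
```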

Classification using bootstrap aggregation. The training sample is randomly split into three subsamples (33% of the original sample each).

pksvm -i input.tif -t training.sqlite -o output.tif -bs 33 -bag 3
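
To illustrate the idea behind bootstrap aggregation, the Python sketch below draws random subsamples and then combines the per-class probabilities of several classifiers with the sum, product and max rules listed under the advanced options. The names draw_bags and combine are hypothetical, and sampling without replacement from the whole sample is an assumption; pksvm may sample differently (for example per class).

```python
import math
import random


def draw_bags(samples, n_bags, percent, seed=0):
    """Draw n_bags random subsets, each holding percent % of the samples."""
    rng = random.Random(seed)
    size = max(1, round(len(samples) * percent / 100))
    return [rng.sample(samples, size) for _ in range(n_bags)]


def combine(prob_sets, rule=0):
    """Combine per-class probabilities from several classifiers.

    prob_sets: one probability vector (list) per classifier.
    rule: 0 = sum rule, 1 = product rule, 2 = max rule.
    Returns the index of the winning class.
    """
    per_class = list(zip(*prob_sets))
    if rule == 0:
        scores = [sum(p) for p in per_class]
    elif rule == 1:
        scores = [math.prod(p) for p in per_class]
    elif rule == 2:
        scores = [max(p) for p in per_class]
    else:
        raise ValueError("rule must be 0, 1 or 2")
    return scores.index(max(scores))
```

The rules can disagree: with two classifiers voting [0.9, 0.1] and a third voting [0.01, 0.99], the sum rule picks the first class while the product and max rules pick the second.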

Classification using prior probabilities for each class. The priors are automatically normalized. The order in which the -p options are provided must follow the alphanumeric order of the class names (class 10 comes before 2...).

pksvm -i input.tif -t training.sqlite -o output.tif -p 1 -p 1 -p 1 -p 1 -p 1 -p 1 -p 1 -p 1 -p 1 -p 1 -p 1 -p 0.2 -p 1 -p 1 -p 1
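
The ordering rule is easy to get wrong, so the Python sketch below shows how a list of priors would be paired with class names and normalized. The function name normalise_priors is hypothetical, and treating "alphanumeric order" as a plain string sort is this sketch's interpretation of the note above (it reproduces "10 comes before 2").

```python
def normalise_priors(class_names, priors):
    """Pair priors with class names in string-sorted order and normalise them.

    Returns a dict mapping class name (as a string) to its normalised prior.
    """
    ordered = sorted(str(name) for name in class_names)
    if len(ordered) != len(priors):
        raise ValueError("need exactly one prior per class")
    total = sum(priors)
    return {name: p / total for name, p in zip(ordered, priors)}


# Classes 1, 2 and 10 sort as "1", "10", "2", so the second prior
# belongs to class 10, not to class 2.
result = normalise_priors([1, 2, 10], [1, 1, 2])
```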

06 December 2020