NAME
i.gensigset - Generates statistics for i.smap from raster map.
KEYWORDS
imagery, classification, supervised, SMAP
SYNOPSIS
i.gensigset
i.gensigset help
i.gensigset trainingmap=name group=name subgroup=name signaturefile=name [maxsig=integer] [--verbose] [--quiet]
Parameters:
- trainingmap=name
Ground truth training map
- group=name
Name of input imagery group
- subgroup=name
Name of input imagery subgroup
- signaturefile=name
Name for output file containing result signatures
- maxsig=integer
Maximum number of sub-signatures in any class
Default: 10
DESCRIPTION
i.gensigset is a non-interactive method for generating input for i.smap.
It is used as the first pass in a two-pass classification process. It reads
a raster map layer, called the training map, in which some of the pixels or
regions are already classified. i.gensigset then extracts spectral signatures
from an image based on the classification of the pixels in the training map
and makes these signatures available to i.smap.
The user then runs the GRASS program i.smap to create the final classified
map.
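As a rough illustration of the two-pass workflow, the sketch below only assembles the two command lines; it does not run GRASS. The map, group, subgroup and file names are hypothetical, and the i.smap output= parameter is taken from the i.smap manual rather than from this page, so it should be checked against the installed GRASS version.

```python
import shlex


def two_pass_commands(trainingmap, group, subgroup, signaturefile, output, maxsig=10):
    """Return argument lists for the i.gensigset pass and the i.smap pass."""
    first_pass = [
        "i.gensigset",
        f"trainingmap={trainingmap}",
        f"group={group}",
        f"subgroup={subgroup}",
        f"signaturefile={signaturefile}",
        f"maxsig={maxsig}",
    ]
    # Assumption: i.smap accepts group=, subgroup=, signaturefile= and output=.
    second_pass = [
        "i.smap",
        f"group={group}",
        f"subgroup={subgroup}",
        f"signaturefile={signaturefile}",
        f"output={output}",
    ]
    return first_pass, second_pass


# Hypothetical names, used only to show the shape of the two commands.
first, second = two_pass_commands("training", "landsat", "visible", "sigset", "classified")
print(shlex.join(first))
print(shlex.join(second))
```

Both passes must name the same group, subgroup and signature file, which is why the sketch derives them from a single set of arguments.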
OPTIONS
Parameters
- trainingmap=name
ground truth training map
This raster layer, supplied as input by the user, has some of its pixels already
classified, and the rest (probably most) of the pixels unclassified.
Classified means that the pixel has a non-zero value and unclassified means
that the pixel has a zero value.
This map must be prepared by the user in advance. The user must use
r.digit, a combination of v.digit and v.to.rast, or some other
import/development process (e.g., v.in.transects) to define the
areas representative of the classes in the image.
At present, there is no fully interactive tool specifically designed for
producing this layer.
- group=name
imagery group
This is the name of the group that contains the band files which make up the
image to be analyzed. The i.group command is used to construct groups
of raster layers which make up an image.
- subgroup=name
subgroup containing image files
This names the subgroup within the group that selects a subset of the bands to
be analyzed. The i.group command is also used to prepare this subgroup.
The subgroup mechanism allows the user to select a subset of all the band
files that form an image.
- signaturefile=name
resultant signature file
This is the resultant signature file. It contains the means and covariance
matrices for each class in the training map, computed from the band files in
the selected subgroup.
- maxsig=value
maximum number of sub-signatures in any class
default: 10
The spectral signatures which are produced by this program are "mixed"
signatures (see NOTES). Each signature contains one or more subsignatures
(representing subclasses). The algorithm in this program starts with a maximum
number of subclasses and reduces this number to a minimal number of subclasses
which are spectrally distinct. The user can set this starting value with this
option.
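The relationship between the training map and the bands of the subgroup can be sketched as follows. This is only an illustration of the convention described above (zero means unclassified, any non-zero value is a class label); nested lists stand in for raster layers, and the function names are hypothetical rather than part of GRASS.

```python
from collections import defaultdict


def collect_training_samples(training_map, bands):
    """Group per-pixel band vectors by class label, skipping label 0 (unclassified)."""
    samples = defaultdict(list)
    for row_idx, row in enumerate(training_map):
        for col_idx, label in enumerate(row):
            if label == 0:
                continue  # unclassified pixel: contributes to no signature
            samples[label].append(tuple(band[row_idx][col_idx] for band in bands))
    return dict(samples)


def class_means(samples):
    """Per-class mean vector, one entry per band."""
    means = {}
    for label, vectors in samples.items():
        n_bands = len(vectors[0])
        means[label] = tuple(
            sum(vector[b] for vector in vectors) / len(vectors) for b in range(n_bands)
        )
    return means


# Tiny 2x2 example with two bands and two classes (values are made up).
training_map = [[0, 1], [2, 1]]
bands = [[[5, 10], [20, 30]], [[1, 2], [3, 4]]]
samples = collect_training_samples(training_map, bands)
print(class_means(samples))
```

Only the bands passed in take part in the statistics, which mirrors the role of the subgroup in selecting a subset of the band files.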
INTERACTIVE MODE
If none of the arguments are specified on the command line, i.gensigset
will interactively prompt for the names of these maps and files.
It should be noted that interactive mode here only means interactive prompting
for maps and files. It does not mean visualization of the signatures that
result from the process.
NOTES
The algorithm in i.gensigset determines the parameters of a spectral
class model known as a Gaussian mixture distribution. The parameters are
estimated using multispectral image data and a training map which labels the
class of a subset of the image pixels. The mixture class parameters are stored
as a class signature which can be used for subsequent segmentation (i.e.,
classification) of the multispectral image.
The Gaussian mixture class is a useful model because it can be used to describe
the behavior of an information class which contains pixels with a variety of
distinct spectral characteristics. Forest, grasslands and urban
areas are examples of information classes that a user may wish to separate in
an image. However, each of these information classes may contain subclasses
each with its own distinctive spectral characteristic. For example, a forest
may contain a variety of different tree species each with its own spectral
behavior.
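For reference, a Gaussian mixture density for one information class is commonly written as shown below. The symbols are notation assumed here and do not appear elsewhere on this page: x is the vector of band values for a pixel, M is the number of bands in the subgroup, K is the number of subclasses, and \pi_k, \mu_k and R_k are the weight, mean vector and covariance matrix of subclass k.

$$
p(x \mid \text{class}) = \sum_{k=1}^{K} \pi_k \, \frac{1}{(2\pi)^{M/2} \, |R_k|^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\mathsf{T}} R_k^{-1} (x - \mu_k) \right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
$$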
The objective of mixture classes is to improve segmentation performance by
modeling each information class as a probabilistic mixture with a variety of
subclasses. The mixture class model also removes the need to perform an
initial unsupervised segmentation for the purposes of identifying these
subclasses. However, if misclassified samples are used in the training
process, these erroneous samples may be grouped as a separate undesired
subclass. Therefore, care should be taken to provide accurate training data.
This clustering algorithm estimates both the number of distinct subclasses in
each class, and the spectral mean and covariance for each subclass. The number
of subclasses is estimated using Rissanen's minimum description length (MDL)
criterion [1]. This criterion attempts to determine the number of subclasses
which "best" describe the data. The approximate maximum likelihood
estimates of the mean and covariance of the subclasses are computed using the
expectation maximization (EM) algorithm [3].
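The idea of estimating subclass parameters with EM and choosing the number of subclasses by MDL can be sketched for a single band as follows. This is a simplified illustration, not the i.gensigset implementation: it fits each candidate number of subclasses independently instead of starting at the maximum and reducing it, it handles one band only, and the penalty term is the commonly stated MDL form, assumed here rather than taken from this page. All function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture_em(samples, k, iterations=50):
    """EM for a 1-D Gaussian mixture; returns (weights, means, variances, log-likelihood)."""
    data = sorted(samples)
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    var_floor = 1e-6 * grand_var + 1e-12  # keeps a component from becoming singular
    # Deterministic start: means spread over the sorted data, shared variance.
    means = [data[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, var_floor)] * k
    weights = [1.0 / k] * k

    def joint(x):
        return [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]

    for _ in range(iterations):
        # E-step: responsibility of each subclass for each sample.
        resp = []
        for x in data:
            dens = joint(x)
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each subclass.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_floor, spread)
    log_likelihood = sum(math.log(sum(joint(x)) or 1e-300) for x in data)
    return weights, means, variances, log_likelihood


def mdl_score(log_likelihood, k, n_samples, n_bands=1):
    """Assumed MDL form: -log-likelihood + 0.5 * (free parameters) * log(N * M)."""
    per_component = 1 + n_bands + n_bands * (n_bands + 1) // 2
    n_params = k * per_component - 1
    return -log_likelihood + 0.5 * n_params * math.log(n_samples * n_bands)


def choose_subclasses(samples, maxsig=10):
    """Return the subclass count with the lowest MDL score, plus all scores."""
    scores = {}
    for k in range(1, maxsig + 1):
        log_likelihood = fit_mixture_em(samples, k)[3]
        scores[k] = mdl_score(log_likelihood, k, len(samples))
    return min(scores, key=scores.get), scores
```

For example, `choose_subclasses(values, maxsig=4)` on the band values of one training class returns the selected number of subclasses together with the MDL score of each candidate. The variance floor plays a role loosely comparable to the removal of singular subsignatures reported in the warnings below.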
WARNINGS
If warnings like the following occur and the number of remaining subclasses
drops to 0:
WARNING: Removed a singular subsignature number 1 (4 remain)
WARNING: Removed a singular subsignature number 1 (3 remain)
WARNING: Removed a singular subsignature number 1 (2 remain)
WARNING: Removed a singular subsignature number 1 (1 remain)
WARNING: Unreliable clustering. Try a smaller initial number of clusters
WARNING: Removed a singular subsignature number 1 (-1 remain)
WARNING: Unreliable clustering. Try a smaller initial number of clusters
Number of subclasses is 0
then the user should check the following:
- the range of the input data should be between 0 and 100 or between 0 and
255, but not between 0.0 and 1.0 (r.info and r.univar show the range)
- the training areas need to contain a sufficient number of pixels
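The two checks above could be scripted along the following lines once the band values and per-class pixel counts have been exported (for example from r.univar or r.stats output). The function name is hypothetical, and the threshold min_pixels=30 is an arbitrary placeholder, since this page does not state how many pixels are sufficient.

```python
def check_training_inputs(band_values, class_pixel_counts, min_pixels=30):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    lowest, highest = min(band_values), max(band_values)
    # Data scaled to 0.0-1.0 is the case the warning text singles out.
    if lowest >= 0.0 and highest <= 1.0:
        problems.append(
            f"band range {lowest}..{highest} lies within 0.0-1.0; rescale to 0-100 or 0-255"
        )
    # min_pixels is an assumed threshold, not a value documented by GRASS.
    for label, count in sorted(class_pixel_counts.items()):
        if count < min_pixels:
            problems.append(f"class {label} has only {count} training pixels")
    return problems


# Made-up values: reflectance-style data and one very small training class.
for problem in check_training_inputs([0.1, 0.5, 0.9], {1: 5, 2: 100}):
    print(problem)
```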
REFERENCES
[1] J. Rissanen, "A Universal Prior for Integers and Estimation by
Minimum Description Length," Annals of Statistics, vol. 11,
no. 2, pp. 417-431, 1983.
[2] A. Dempster, N. Laird and D. Rubin, "Maximum Likelihood from
Incomplete Data via the EM Algorithm," J. Roy. Statist. Soc.
B, vol. 39, no. 1, pp. 1-38, 1977.
[3] E. Redner and H. Walker, "Mixture Densities, Maximum Likelihood and
the EM Algorithm," SIAM Review, vol. 26, no. 2, April
1984.
SEE ALSO
i.group, i.smap, r.info, r.univar, v.digit
AUTHORS
Charles Bouman, School of Electrical Engineering, Purdue University
Michael Shapiro, U.S. Army Construction Engineering Research Laboratory
Last changed: $Date: 2014-06-25 14:20:14 +0200 (Wed, 25 Jun 2014) $
© 2003-2014 GRASS Development Team