NAME¶
i.cluster - Generates spectral signatures for land cover types in
an image using a clustering algorithm.
The resulting signature file is used as input for i.maxlik, to generate an
unsupervised image classification.
KEYWORDS¶
imagery, classification, signatures
SYNOPSIS¶
i.cluster
i.cluster help
i.cluster [-
q]
group=
name
subgroup=
name sigfile=
name
classes=
integer [
seed=
name]
[
sample=
row_interval,col_interval] [
iterations=
integer] [
convergence=
float]
[
separation=
float] [
min_size=
integer]
[
reportfile=
name] [--
verbose] [--
quiet]
Flags:¶
- -q
-
Quiet
- --verbose
-
Verbose module output
- --quiet
-
Quiet module output
Parameters:¶
- group=name
-
Name of input imagery group
- subgroup=name
-
Name of input imagery subgroup
- sigfile=name
-
Name for output file containing result signatures
- classes=integer
-
Initial number of classes
Options: 1-255
- seed=name
-
Name of file containing initial signatures
- sample=row_interval,col_interval
-
Sampling intervals (by row and col); default: ~10,000 pixels
- iterations=integer
-
Maximum number of iterations
Default: 30
- convergence=float
-
Percent convergence
Options: 0-100
Default: 98.0
- separation=float
-
Cluster separation
Default: 0.0
- min_size=integer
-
Minimum number of pixels in a class
Default: 17
- reportfile=name
-
Name for output file containing final report
DESCRIPTION¶
i.cluster performs the first pass in the GRASS two-pass unsupervised
classification of imagery, while the GRASS program
i.maxlik
executes the second pass. Both programs must be run to complete the
unsupervised classification.
i.cluster is a clustering algorithm that reads through the (raster)
imagery data and builds pixel clusters based on the spectral reflectances of
the pixels (see Figure). The pixel clusters are imagery categories that can be
related to land cover types on the ground. The spectral distributions of the
clusters (which will be the land cover spectral signatures) are influenced by
six parameters set by the user. The first parameter set by the user is the
initial number of clusters to be discriminated.
|
Fig.: Land use/land cover clustering of LANDSAT scene (simplified)
i.cluster starts by generating spectral signatures for this number of
clusters and "attempts" to end up with this number of clusters
during the clustering process. The resulting number of clusters and their
spectral distributions, however, are also influenced by the range of the
spectral values (category values) in the image files and the other parameters
set by the user. These parameters are: the minimum cluster size, minimum
cluster separation, the percent convergence, the maximum number of iterations,
and the row and column sampling intervals.
The cluster spectral signatures that result are composed of cluster means and
covariance matrices. These cluster means and covariance matrices are used in
the second pass (
i.maxlik) to classify the image. The clusters or
spectral classes result can be related to land cover types on the ground. The
user has to specify the name of group file, the name of subgroup file, the
name of a file to contain result signatures, the initial number of clusters to
be discriminated, and optionally other parameters (see below) where the
group should contain the imagery files that the user wishes to
classify. The
subgroup is a subset of this group. The user must create
a group and subgroup by running the GRASS program
i.group before
running
i.cluster. The subgroup should contain only the imagery band
files that the user wishes to classify. Note that this subgroup must contain
more than one band file. The purpose of the group and subgroup is to collect
map layers for classification or analysis. The
sigfile is the file to
contain result signatures which can be used as input for
i.maxlik. The
classes value is the initial number of clusters to be discriminated; any
parameter values left unspecified are set to their default values.
Flags:¶
- -q
-
Run quietly. Suppresses output of program percent-complete messages and the
time elapsed from the beginning of the program. If this flag is not used,
these messages are printed out.
Parameters:¶
- group=name
-
The name of the group file which contains the imagery files that the user
wishes to classify.
- subgroup=name
-
The name of the subset of the group specified in group option, which must
contain only imagery band files and more than one band file. The user must
create a group and a subgroup by running the GRASS program i.group
before running i.cluster.
- sigfile=name
-
The name assigned to output signature file which contains signatures of
classes and can be used as the input file for the GRASS program
i.maxlik for an unsupervised classification.
- classes=value
-
The number of clusters that will initially be identified in the clustering
process before the iterations begin.
- seed=name
-
The name of a seed signature file is optional. The seed signatures are
signatures that contain cluster means and covariance matrices which were
calculated prior to the current run of i.cluster. They may be
acquired from a previously run of i.cluster or from a supervised
classification signature training site section (e.g., using the signature
file output by i.class). The purpose of seed signatures is to
optimize the cluster decision boundaries (means) for the number of
clusters specified.
- sample=row_interval,col_interval
-
These numbers are optional with default values based on the size of the data
set such that the total pixels to be processed is approximately 10,000
(consider round up).
- iterations=value
-
This parameter determines the maximum number of iterations which is greater
than the number of iterations predicted to achieve the optimum percent
convergence. The default value is 30. If the number of iterations reaches
the maximum designated by the user; the user may want to rerun
i.cluster with a higher number of iterations (see
reportfile).
Default: 30
- convergence=value
-
A high percent convergence is the point at which cluster means become stable
during the iteration process. The default value is 98.0 percent. When
clusters are being created, their means constantly change as pixels are
assigned to them and the means are recalculated to include the new pixel.
After all clusters have been created, i.cluster begins iterations
that change cluster means by maximizing the distances between them. As
these means shift, a higher and higher convergence is approached. Because
means will never become totally static, a percent convergence and a
maximum number of iterations are supplied to stop the iterative process.
The percent convergence should be reached before the maximum number of
iterations. If the maximum number of iterations is reached, it is probable
that the desired percent convergence was not reached. The number of
iterations is reported in the cluster statistics in the report file (see
reportfile).
Default: 98.0
- separation=value
-
This is the minimum separation below which clusters will be merged in the
iteration process. The default value is 0.0. This is an image-specific
number (a "magic" number) that depends on the image data being
classified and the number of final clusters that are acceptable. Its
determination requires experimentation. Note that as the minimum class (or
cluster) separation is increased, the maximum number of iterations should
also be increased to achieve this separation with a high percentage of
convergence (see convergence).
Default: 0.0
- min_size=value
-
This is the minimum number of pixels that will be used to define a cluster,
and is therefore the minimum number of pixels for which means and
covariance matrices will be calculated.
Default: 17
- reportfile=name
-
The reportfile is an optional parameter which contains the result, i.e., the
statistics for each cluster. Also included are the resulting percent
convergence for the clusters, the number of iterations that was required
to achieve the convergence, and the separability matrix.
NOTES¶
Running in command line mode,
i.cluster will overwrite the output
signature file and reportfile (if required by the user) without prompting if
the files existed.
EXAMPLE¶
Preparing the statistics for unsupervised classification of a LANDSAT subscene
in North Carolina:
g.region rast=lsat7_2002_10 -p
# store VIZ, NIR, MIR into group/subgroup
i.group group=my_lsat7_2002 subgroup=my_lsat7_2002 \
input=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_70
i.cluster group=my_lsat7_2002 subgroup=my_lsat7_2002 sigfile=sig_clust_lsat2002
\
classes=10 report=rep_clust_lsat2002.txt
To complete the unsupervised classification,
i.maxlik is subsequently
used.
SEE ALSO¶
The GRASS 4
Image Processing manual
i.class, i.group, i.gensig, i.maxlik
AUTHORS¶
Michael Shapiro, U.S.Army Construction Engineering Research Laboratory
Tao Wen, University of Illinois at Urbana-Champaign, Illinois
Last changed: $Date: 2012-12-16 13:47:36 +0100 (Sun, 16 Dec 2012) $
Full index
© 2003-2014 GRASS Development Team