.TH v.kcv 1grass "" "GRASS 6.4.4" "Grass User's Manual"
.SH NAME
\fI\fBv.kcv\fR\fR - Randomly partition points into test/train sets.
.SH KEYWORDS
vector, statistics
.SH SYNOPSIS
\fBv.kcv\fR
.br
\fBv.kcv help\fR
.br
\fBv.kcv\fR [\-\fBdq\fR] \fBinput\fR=\fIname\fR \fBoutput\fR=\fIname\fR \fBk\fR=\fIinteger\fR \fBcolumn\fR=\fIstring\fR [\-\-\fBoverwrite\fR] [\-\-\fBverbose\fR] [\-\-\fBquiet\fR]
.SS Flags:
.IP "\fB\-d\fR" 4m
.br
Use drand48() for random number generation
.IP "\fB\-q\fR" 4m
.br
Quiet
.IP "\fB\-\-overwrite\fR" 4m
.br
Allow output files to overwrite existing files
.IP "\fB\-\-verbose\fR" 4m
.br
Verbose module output
.IP "\fB\-\-quiet\fR" 4m
.br
Quiet module output
.PP
.SS Parameters:
.IP "\fBinput\fR=\fIname\fR" 4m
.br
Name of input vector map
.IP "\fBoutput\fR=\fIname\fR" 4m
.br
Name for output vector map
.IP "\fBk\fR=\fIinteger\fR" 4m
.br
Number of partitions
.br
Options: \fI1-32767\fR
.IP "\fBcolumn\fR=\fIstring\fR" 4m
.br
Name for new column to which partition number is written
.br
Default: \fIpart\fR
.PP
.SH DESCRIPTION
\fIv.kcv\fR randomly divides a list of points into \fIk\fR sets of test/train data (for \fBk\fR-fold \fBc\fRross \fBv\fRalidation). Test partitions are mutually exclusive: each point appears in exactly one test partition and in \fIk-1\fR training partitions.
.PP
The module generates a random location using the selected random number generator and then finds the input point closest to it. That point is removed from the candidate list (so it will not be selected for any other test set) and assigned to the first test partition. This is repeated until enough points have been selected for the test partition. The number of points in each test partition depends on the number of points available and on the number of partitions requested; partition sizes are kept as equal as possible while ensuring that every point is chosen for testing. This process of filling a test partition is carried out \fIk\fR times.
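.PP
The selection loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the module's actual implementation: the function name \fIkfold_partition\fR, the \fI(west, south, east, north)\fR region tuple, the \fIseed\fR argument and the rule that the first folds absorb any remainder are all hypothetical, and Python's \fIrandom\fR module stands in for the module's own random number generator.
.PP
.nf
```python
import random


def kfold_partition(points, k, region, seed=None):
    """Return a partition number (1..k) for every (x, y) point.

    Hypothetical helper that mimics the procedure in DESCRIPTION.
    """
    rng = random.Random(seed)
    west, south, east, north = region  # assumed layout of the region bounds
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    remaining = set(range(n))  # indices not yet assigned to a test partition
    part = [0] * n
    for fold in range(1, k + 1):
        # Assumption: sizes differ by at most one, extras go to the first folds.
        size = n // k + (1 if fold <= n % k else 0)
        for _ in range(size):
            # Draw a random location inside the region ...
            x = rng.uniform(west, east)
            y = rng.uniform(south, north)
            # ... and take the closest point that is still a candidate.
            nearest = min(
                remaining,
                key=lambda i: (points[i][0] - x) ** 2 + (points[i][1] - y) ** 2,
            )
            part[nearest] = fold
            remaining.remove(nearest)
    return part


if __name__ == "__main__":
    pts = [(float(i % 5), float(i // 5)) for i in range(23)]
    labels = kfold_partition(pts, 4, (0.0, 0.0, 5.0, 5.0), seed=1)
    print([labels.count(f) for f in range(1, 5)])
```
.fi
.PP
Because every point is removed from the candidate set once it has been chosen, each point ends up with exactly one partition number. The linear nearest-point search keeps the sketch short; a spatial index would be the natural replacement for large inputs.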
.SH NOTES
An ideal random sites generator would follow a Poisson distribution. This module simply divides the existing vector points up in a random manner, so the result is only as random as the original points.
.PP
Be aware that the random locations are generated within the intervals defined by the current region (see \fIg.region\fR).
.PP
This program may not work properly with latitude-longitude data.
.SH EXAMPLES
All examples are based on the North Carolina sample dataset.
\fC
.DS
.br
g.copy vect=geonames_wake,my_geonames_wake
.br
v.kcv input=my_geonames_wake output=my_geonames_wake_kvc column=part k=10
.br
.DE
\fR
.PP
\fC
.DS
.br
g.copy vect=geodetic_pts,my_geodetic_pts
.br
v.kcv input=my_geodetic_pts output=my_geodetic_pts_kvc column=part k=10
.br
.DE
\fR
.SH SEE ALSO
\fI
v.random,
g.region
\fR
.SH AUTHOR
James Darrell McCauley,
.br
when he was at: Agricultural Engineering, Purdue University
.PP
27 Jan 1994: fixed RAND_MAX for Solaris 2.3
.br
13 Sep 2000: released under GPL
.br
Updated to 5.7 by Radim Blazek (10/2004)
.br
OGR support by Martin Landa (2009)
.br
Speed-up by Jan Vandrol and Jan Ruzicka (2013)
.PP
\fILast changed: $Date: 2014-05-01 22:00:52 +0200 (Thu, 01 May 2014) $\fR
.PP
© 2003-2014 GRASS Development Team