.TH "mlpack::kmeans::RefinedStart" 3 "Tue Sep 9 2014" "Version 1.0.10" "MLPACK" \" -*- nroff -*-
.ad l
.nh
.SH NAME
mlpack::kmeans::RefinedStart \- A refined approach for choosing initial points for k-means clustering\&.
.SH SYNOPSIS
.br
.PP
.SS "Public Member Functions"
.in +1c
.ti -1c
.RI "\fBRefinedStart\fP (const size_t \fBsamplings\fP=100, const double \fBpercentage\fP=0\&.02)"
.br
.RI "\fICreate the \fBRefinedStart\fP object, optionally specifying parameters for the number of samplings to perform and the percentage of the dataset to use in each sampling\&. \fP"
.ti -1c
.RI "template<typename MatType> void \fBCluster\fP (const MatType &data, const size_t clusters, arma::Col< size_t > &assignments) const "
.br
.RI "\fIPartition the given dataset into the given number of clusters according to the random sampling scheme outlined in Bradley and Fayyad's paper\&. \fP"
.ti -1c
.RI "double \fBPercentage\fP () const "
.br
.RI "\fIGet the percentage of the data used by each subsampling\&. \fP"
.ti -1c
.RI "double & \fBPercentage\fP ()"
.br
.RI "\fIModify the percentage of the data used by each subsampling\&. \fP"
.ti -1c
.RI "size_t \fBSamplings\fP () const "
.br
.RI "\fIGet the number of samplings that will be performed\&. \fP"
.ti -1c
.RI "size_t & \fBSamplings\fP ()"
.br
.RI "\fIModify the number of samplings that will be performed\&. \fP"
.in -1c
.SS "Private Attributes"
.in +1c
.ti -1c
.RI "double \fBpercentage\fP"
.br
.RI "\fIThe percentage of the data to use for each subsampling\&. \fP"
.ti -1c
.RI "size_t \fBsamplings\fP"
.br
.RI "\fIThe number of samplings to perform\&. \fP"
.in -1c
.SH "Detailed Description"
.PP
A refined approach for choosing initial points for k-means clustering\&. This approach runs k-means several times on random subsets of the data, and then clusters the resulting solutions to select refined initial cluster assignments\&.
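.PP
The sampling scheme described above can be sketched in plain C++ without mlpack or Armadillo\&. This is a simplified illustration and not the mlpack implementation: it works on one-dimensional points, seeds every k-means run with the first points of the shuffled subsample, and treats \fIpercentage\fP as a fraction of the dataset\&. The names Nearest, Lloyd, and RefinedStartSketch, and the \fIseed\fP parameter, are hypothetical\&.
.PP
.nf
```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Index of the centroid nearest to x (ties go to the lower index).
std::size_t Nearest(double x, const std::vector<double>& centroids) {
  std::size_t best = 0;
  for (std::size_t j = 1; j < centroids.size(); ++j)
    if (std::abs(x - centroids[j]) < std::abs(x - centroids[best])) best = j;
  return best;
}

// Plain Lloyd iterations on 1-D points, starting from the given centroids.
std::vector<double> Lloyd(const std::vector<double>& points,
                          std::vector<double> centroids,
                          std::size_t iterations = 50) {
  for (std::size_t it = 0; it < iterations; ++it) {
    std::vector<double> sum(centroids.size(), 0.0);
    std::vector<std::size_t> count(centroids.size(), 0);
    for (double x : points) {
      const std::size_t j = Nearest(x, centroids);
      sum[j] += x;
      ++count[j];
    }
    // An empty cluster keeps its previous centroid.
    for (std::size_t j = 0; j < centroids.size(); ++j)
      if (count[j] > 0) centroids[j] = sum[j] / count[j];
  }
  return centroids;
}

// Hypothetical sketch: run k-means on several random subsamples, cluster the
// pooled centroids, then label every point by its nearest refined centroid.
std::vector<std::size_t> RefinedStartSketch(const std::vector<double>& data,
                                            std::size_t clusters,
                                            std::size_t samplings,
                                            double percentage,
                                            unsigned seed) {
  assert(clusters >= 1 && clusters <= data.size() && samplings >= 1);
  std::mt19937 rng(seed);
  // Points per subsample: the requested fraction, but at least one per cluster.
  const std::size_t m = std::min(data.size(),
      std::max(clusters, static_cast<std::size_t>(percentage * data.size())));

  std::vector<double> shuffled = data;
  std::vector<double> pool;  // centroids collected from every subsample
  for (std::size_t s = 0; s < samplings; ++s) {
    std::shuffle(shuffled.begin(), shuffled.end(), rng);
    const std::vector<double> sample(shuffled.begin(), shuffled.begin() + m);
    const std::vector<double> init(sample.begin(), sample.begin() + clusters);
    const std::vector<double> found = Lloyd(sample, init);
    pool.insert(pool.end(), found.begin(), found.end());
  }

  // Cluster the pooled centroids to obtain the refined centroids.
  const std::vector<double> init(pool.begin(), pool.begin() + clusters);
  const std::vector<double> refined = Lloyd(pool, init);

  std::vector<std::size_t> assignments(data.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    assignments[i] = Nearest(data[i], refined);
  return assignments;
}
```
.fi
.PP
For example, with twenty points in two well-separated groups (0 to 9 and 100 to 109), clusters = 2, and percentage = 0\&.8, every subsample contains points from both groups, so the sketch places each group in its own cluster\&. Which group receives label 0 depends on the seed\&. The sketch only approximates the idea behind \fBRefinedStart\fP\&.
.PP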
.PP
This class implements the method described in the following paper:
.PP
.nf
@inproceedings{bradley1998refining,
  title={Refining initial points for k-means clustering},
  author={Bradley, Paul S and Fayyad, Usama M},
  booktitle={Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998)},
  volume={66},
  year={1998}
}
.fi
.PP
Definition at line 47 of file refined_start\&.hpp\&.
.SH "Constructor & Destructor Documentation"
.PP
.SS "mlpack::kmeans::RefinedStart::RefinedStart (const size_t samplings = \fC100\fP, const double percentage = \fC0\&.02\fP)\fC [inline]\fP"
.PP
Create the \fBRefinedStart\fP object, optionally specifying parameters for the number of samplings to perform and the percentage of the dataset to use in each sampling\&.
.PP
Definition at line 55 of file refined_start\&.hpp\&.
.SH "Member Function Documentation"
.PP
.SS "template<typename MatType> void mlpack::kmeans::RefinedStart::Cluster (const MatType &data, const size_t clusters, arma::Col< size_t > &assignments) const"
.PP
Partition the given dataset into the given number of clusters according to the random sampling scheme outlined in Bradley and Fayyad's paper\&.
.PP
\fBTemplate Parameters:\fP
.RS 4
\fIMatType\fP Type of data (arma::mat or arma::sp_mat)\&.
.RE
.PP
\fBParameters:\fP
.RS 4
\fIdata\fP Dataset to partition\&.
.br
\fIclusters\fP Number of clusters to split the dataset into\&.
.br
\fIassignments\fP Vector in which to store the cluster assignments\&. Values will be between 0 and (clusters - 1)\&.
.RE
.PP
.SS "double mlpack::kmeans::RefinedStart::Percentage () const\fC [inline]\fP"
.PP
Get the percentage of the data used by each subsampling\&.
.PP
Definition at line 80 of file refined_start\&.hpp\&.
.PP
References percentage\&.
.SS "double& mlpack::kmeans::RefinedStart::Percentage ()\fC [inline]\fP"
.PP
Modify the percentage of the data used by each subsampling\&.
.PP
Definition at line 82 of file refined_start\&.hpp\&.
.PP
References percentage\&.
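.PP
The paired accessors (a const getter and a non-const overload that returns a reference) can be illustrated with a small stand-in class\&. SamplingParams and SubsampleSize are hypothetical names and not part of mlpack\&. The sketch assumes that \fIpercentage\fP is a fraction, so the default of 0\&.02 means 2% of the points, and that the subsample size is the truncated product of that fraction and the number of points\&.
.PP
.nf
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two parameters; not the mlpack class itself.
class SamplingParams {
 public:
  SamplingParams(const std::size_t samplings = 100,
                 const double percentage = 0.02)
      : samplings(samplings), percentage(percentage) {}

  // Const overloads read the values.
  double Percentage() const { return percentage; }
  std::size_t Samplings() const { return samplings; }

  // Non-const overloads return references, so callers can assign through them.
  double& Percentage() { return percentage; }
  std::size_t& Samplings() { return samplings; }

  // Assumed rule: points per subsample = percentage * points, truncated.
  std::size_t SubsampleSize(const std::size_t points) const {
    assert(percentage >= 0.0 && percentage <= 1.0);
    return static_cast<std::size_t>(percentage * points);
  }

 private:
  std::size_t samplings;  // number of samplings to perform
  double percentage;      // fraction of the data used by each subsampling
};
```
.fi
.PP
With this stand-in, writing p\&.Percentage() = 0\&.5 on a non-const object changes the stored value, while a const reference to the same object can only read it\&.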
.SS "size_t mlpack::kmeans::RefinedStart::Samplings () const\fC [inline]\fP"
.PP
Get the number of samplings that will be performed\&.
.PP
Definition at line 75 of file refined_start\&.hpp\&.
.PP
References samplings\&.
.SS "size_t& mlpack::kmeans::RefinedStart::Samplings ()\fC [inline]\fP"
.PP
Modify the number of samplings that will be performed\&.
.PP
Definition at line 77 of file refined_start\&.hpp\&.
.PP
References samplings\&.
.SH "Member Data Documentation"
.PP
.SS "double mlpack::kmeans::RefinedStart::percentage\fC [private]\fP"
.PP
The percentage of the data to use for each subsampling\&.
.PP
Definition at line 88 of file refined_start\&.hpp\&.
.PP
Referenced by Percentage()\&.
.SS "size_t mlpack::kmeans::RefinedStart::samplings\fC [private]\fP"
.PP
The number of samplings to perform\&.
.PP
Definition at line 86 of file refined_start\&.hpp\&.
.PP
Referenced by Samplings()\&.
.SH "Author"
.PP
Generated automatically by Doxygen for MLPACK from the source code\&.