Scroll to navigation

Math::Vector::Real::kdTree(3pm) User Contributed Perl Documentation Math::Vector::Real::kdTree(3pm)
 

NAME

Math::Vector::Real::kdTree - kd-Tree implementation on top of Math::Vector::Real

SYNOPSIS

  use Math::Vector::Real::kdTree;
  use Math::Vector::Real;
  use Math::Vector::Real::Random;
  my @v = map Math::Vector::Real->random_normal(4), 1..1000;
  my $tree = Math::Vector::Real::kdTree->new(@v);
  my $ix = $tree->find_nearest_vector(V(0, 0, 0, 0));
  say "nearest vector is $ix, $v[$ix]";

DESCRIPTION

This module implements a kd-Tree data structure in Perl and common algorithms on top of it.

Methods

The following methods are provided:
$t = Math::Vector::Real::kdTree->new(@points)
Creates a new kd-Tree containing the given points.
$t2 = $t->clone
Creates a duplicate of the tree. The two trees will share internal read only data so this method is more efficient in terms of memory usage than others performing a deep copy.
my $ix = $t->insert($p0, $p1, ...)
Inserts the given points into the kd-Tree.
Returns the index assigned to the first point inserted.
$s = $t->size
Returns the number of points inside the tree.
$p = $t->at($ix)
Returns the point at the given index inside the tree.
$t->move($ix, $p)
Moves the point at index $ix to the new given position readjusting the tree structure accordingly.
($ix, $d) = $t->find_nearest_vector($p, $max_d, @but_ix)
($ix, $d) = $t->find_nearest_vector($p, $max_d, \%but_ix)
Find the nearest vector for the given point $p and returns its index and the distance between the two points (in scalar context the index is returned).
If $max_d is defined, the search is limited to the points within that distance
Optionally, a list of point indexes to be excluded from the search can be passed or, alternatively, a reference to a hash containing the indexes of the points to be excluded.
@ix = $t->find_nearest_vector_all_internal
Returns the index of the nearest vector from the tree.
It is equivalent to the following code (though, it uses a better algorithm):
  @ix = map {
            scalar $t->nearest_vector($t->at($_), undef, $_)
        } 0..($t->size - 1);
    
$ix = $t->find_farthest_vector($p, $min_d, @but_ix)
Find the point from the tree farthest from the given $p.
The optional argument $min_d specifies a minimal distance. Undef is returned when not point farthest that it is found.
@but_ix specifies points that should not be considered when looking for the farthest point.
$ix = $t->find_farthest_vector_internal($ix, $min_d, @but_ix)
Given the index of a point on the tree this method returns the index of the farthest vector also from the tree.
@k = $t->k_means_start($n)
This method uses the internal tree structure to generate a set of point that can be used as seeds for other "k_means" methods.
There isn't any guarantee on the quality of the generated seeds, but the used algorithm seems to perform well in practice.
@k = $t->k_means_step(@k)
Performs a step of the Lloyd's algorithm <http://en.wikipedia.org/wiki/Lloyd%27s_algorithm> for k-means calculation.
@k = $t->k_means_loop(@k)
Iterates until the Lloyd's algorithm converges and returns the final means.
@ix = $t->k_means_assign(@k)
Returns for every point in the three the index of the cluster it belongs to.
@ix = $t->find_in_ball($z, $d, $but)
$n = $t->find_in_ball($z, $d, $but)
Finds the points inside the tree contained in the hypersphere with center $z and radius $d.
In scalar context returns the number of points found. In list context returns the indexes of the points.
If the extra argument $but is provided. The point with that index is ignored.
@ix = $t->ordered_by_proximity
Returns the indexes of the points in an ordered where is likely that the indexes of near vectors are also in near positions in the list.

k-means

The module can be used to calculate the k-means of a set of vectors as follows:
  # inputs
  my @v = ...; my $k = ...;
  
  # k-mean calculation
  my $t = Math::Vector::Real::kdTree->new(@v);
  my @means = $t->k_means_start($k);
  @means = $t->k_means_loop(@means);
  @assign = $t->k_means_assign(@means);
  my @cluster = map [], 1..$k;
  for (0..$#assign) {
    my $cluster_ix = $assign[$_];
    my $cluster = $cluster[$cluster_ix];
    push @$cluster, $t->at($_);
  }
  
  use Data::Dumper;
  print Dumper \@cluster;

SEE ALSO

Wikipedia k-d Tree entry <http://en.wikipedia.org/wiki/K-d_tree>.
K-means filtering algorithm <https://www.cs.umd.edu/~mount/Projects/KMeans/pami02.pdf>.
Math::Vector::Real

COPYRIGHT AND LICENSE

Copyright (C) 2011-2014 by Salvador Fandin~o <sfandino@yahoo.com>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.3 or, at your option, any later version of Perl 5 you may have available.
2014-10-07 perl v5.20.1