NAME¶

Math::Vector::Real::kdTree - kd-Tree implementation on top of Math::Vector::Real

SYNOPSIS¶

  use Math::Vector::Real::kdTree;
  use Math::Vector::Real;
  use Math::Vector::Real::Random;
  my @v = map Math::Vector::Real->random_normal(4), 1..1000;
  my $tree = Math::Vector::Real::kdTree->new(@v);
  my $ix = $tree->find_nearest_vector(V(0, 0, 0, 0));
  say "nearest vector is $ix, $v[$ix]";

DESCRIPTION¶

This module implements a kd-Tree data structure in Perl and common algorithms on top of it.

Methods¶

The following methods are provided:

$t = Math::Vector::Real::kdTree->new(@points): Creates a new kd-Tree containing the given points.

$t2 = $t->clone: Creates a duplicate of the tree. The two trees will share internal read only data so this method is more efficient in terms of memory usage than others performing a deep copy.

my $ix = $t->insert($p0, $p1, ...): Inserts the given points into the kd-Tree.
Returns the index assigned to the first point inserted.

$s = $t->size: Returns the number of points inside the tree.

$p = $t->at($ix): Returns the point at the given index inside the tree.

$t->move($ix, $p): Moves the point at index $ix to the new given position readjusting the tree structure accordingly.

($ix, $d) = $t->find_nearest_vector($p, $max_d, @but_ix)

($ix, $d) = $t->find_nearest_vector($p, $max_d, \%but_ix): Find the nearest vector for the given point $p and returns its index and the distance between the two points (in scalar context the index is returned).
If $max_d is defined, the search is limited to the points within that distance
Optionally, a list of point indexes to be excluded from the search can be passed or, alternatively, a reference to a hash containing the indexes of the points to be excluded.

@ix = $t->find_nearest_vector_all_internal

Returns the index of the nearest vector from the tree.

It is equivalent to the following code (though, it uses a better algorithm):

  @ix = map {
            scalar $t->nearest_vector($t->at($_), undef, $_)
        } 0..($t->size - 1);

$ix = $t->find_farthest_vector($p, $min_d, @but_ix): Find the point from the tree farthest from the given $p.
The optional argument $min_d specifies a minimal distance. Undef is returned when not point farthest that it is found.
@but_ix specifies points that should not be considered when looking for the farthest point.

$ix = $t->find_farthest_vector_internal($ix, $min_d, @but_ix): Given the index of a point on the tree this method returns the index of the farthest vector also from the tree.

@k = $t->k_means_start($n): This method uses the internal tree structure to generate a set of point that can be used as seeds for other "k_means" methods.
There isn't any guarantee on the quality of the generated seeds, but the used algorithm seems to perform well in practice.

@k = $t->k_means_step(@k): Performs a step of the Lloyd's algorithm <http://en.wikipedia.org/wiki/Lloyd%27s_algorithm> for k-means calculation.

@k = $t->k_means_loop(@k): Iterates until the Lloyd's algorithm converges and returns the final means.

@ix = $t->k_means_assign(@k): Returns for every point in the three the index of the cluster it belongs to.

@ix = $t->find_in_ball($z, $d, $but)

$n = $t->find_in_ball($z, $d, $but): Finds the points inside the tree contained in the hypersphere with center $z and radius $d.
In scalar context returns the number of points found. In list context returns the indexes of the points.
If the extra argument $but is provided. The point with that index is ignored.

@ix = $t->ordered_by_proximity: Returns the indexes of the points in an ordered where is likely that the indexes of near vectors are also in near positions in the list.

k-means¶

The module can be used to calculate the k-means of a set of vectors as follows:

  # inputs
  my @v = ...; my $k = ...;
  
  # k-mean calculation
  my $t = Math::Vector::Real::kdTree->new(@v);
  my @means = $t->k_means_start($k);
  @means = $t->k_means_loop(@means);
  @assign = $t->k_means_assign(@means);
  my @cluster = map [], 1..$k;
  for (0..$#assign) {
    my $cluster_ix = $assign[$_];
    my $cluster = $cluster[$cluster_ix];
    push @$cluster, $t->at($_);
  }
  
  use Data::Dumper;
  print Dumper \@cluster;

COPYRIGHT AND LICENSE¶

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.3 or, at your option, any later version of Perl 5 you may have available.

2014-10-07

perl v5.20.1

Source file:	Math::Vector::Real::kdTree.3pm.en.gz (from libmath-vector-real-kdtree-perl 0.11-1)
Source last updated:	2014-10-09T19:02:40Z
Converted to HTML:	2018-08-09T23:07:25Z