NAME¶
Math::Vector::Real::kdTree - kd-Tree implementation on top of Math::Vector::Real
SYNOPSIS¶
use Math::Vector::Real::kdTree;
use Math::Vector::Real;
use Math::Vector::Real::Random;
my @v = map Math::Vector::Real->random_normal(4), 1..1000;
my $tree = Math::Vector::Real::kdTree->new(@v);
my $ix = $tree->find_nearest_vector(V(0, 0, 0, 0));
say "nearest vector is $ix, $v[$ix]";
DESCRIPTION¶
This module implements a kd-Tree data structure in Perl and common algorithms on
top of it.
Methods¶
The following methods are provided:
- $t = Math::Vector::Real::kdTree->new(@points)
- Creates a new kd-Tree containing the given points.
- $t2 = $t->clone
- Creates a duplicate of the tree. The two trees will share internal read
only data so this method is more efficient in terms of memory usage than
others performing a deep copy.
- my $ix = $t->insert($p0, $p1, ...)
- Inserts the given points into the kd-Tree.
Returns the index assigned to the first point inserted.
- $s = $t->size
- Returns the number of points inside the tree.
- $p = $t->at($ix)
- Returns the point at the given index inside the tree.
- $t->move($ix, $p)
- Moves the point at index $ix to the new given position readjusting the
tree structure accordingly.
- ($ix, $d) = $t->find_nearest_vector($p, $max_d, @but_ix)
- ($ix, $d) = $t->find_nearest_vector($p, $max_d, \%but_ix)
- Find the nearest vector for the given point $p and returns its index and
the distance between the two points (in scalar context the index is
returned).
If $max_d is defined, the search is limited to the points within that
distance
Optionally, a list of point indexes to be excluded from the search can be
passed or, alternatively, a reference to a hash containing the indexes of
the points to be excluded.
- @ix = $t->find_nearest_vector_all_internal
- Returns the index of the nearest vector from the tree.
It is equivalent to the following code (though, it uses a better algorithm):
@ix = map {
scalar $t->nearest_vector($t->at($_), undef, $_)
} 0..($t->size - 1);
- $ix = $t->find_farthest_vector($p, $min_d, @but_ix)
- Find the point from the tree farthest from the given $p.
The optional argument $min_d specifies a minimal distance. Undef is returned
when not point farthest that it is found.
@but_ix specifies points that should not be considered when looking for the
farthest point.
- $ix = $t->find_farthest_vector_internal($ix, $min_d, @but_ix)
- Given the index of a point on the tree this method returns the index of
the farthest vector also from the tree.
- @k = $t->k_means_start($n)
- This method uses the internal tree structure to generate a set of point
that can be used as seeds for other "k_means" methods.
There isn't any guarantee on the quality of the generated seeds, but the
used algorithm seems to perform well in practice.
- @k = $t->k_means_step(@k)
- Performs a step of the Lloyd's algorithm
<http://en.wikipedia.org/wiki/Lloyd%27s_algorithm> for k-means
calculation.
- @k = $t->k_means_loop(@k)
- Iterates until the Lloyd's algorithm converges and returns the final
means.
- @ix = $t->k_means_assign(@k)
- Returns for every point in the three the index of the cluster it belongs
to.
- @ix = $t->find_in_ball($z, $d, $but)
- $n = $t->find_in_ball($z, $d, $but)
- Finds the points inside the tree contained in the hypersphere with center
$z and radius $d.
In scalar context returns the number of points found. In list context
returns the indexes of the points.
If the extra argument $but is provided. The point with that index is
ignored.
- @ix = $t->ordered_by_proximity
- Returns the indexes of the points in an ordered where is likely that the
indexes of near vectors are also in near positions in the list.
k-means¶
The module can be used to calculate the k-means of a set of vectors as follows:
# inputs
my @v = ...; my $k = ...;
# k-mean calculation
my $t = Math::Vector::Real::kdTree->new(@v);
my @means = $t->k_means_start($k);
@means = $t->k_means_loop(@means);
@assign = $t->k_means_assign(@means);
my @cluster = map [], 1..$k;
for (0..$#assign) {
my $cluster_ix = $assign[$_];
my $cluster = $cluster[$cluster_ix];
push @$cluster, $t->at($_);
}
use Data::Dumper;
print Dumper \@cluster;
SEE ALSO¶
Wikipedia k-d Tree entry <
http://en.wikipedia.org/wiki/K-d_tree>.
K-means filtering algorithm
<
https://www.cs.umd.edu/~mount/Projects/KMeans/pami02.pdf>.
Math::Vector::Real
COPYRIGHT AND LICENSE¶
Copyright (C) 2011-2014 by Salvador Fandin~o <sfandino@yahoo.com>
This library is free software; you can redistribute it and/or modify it under
the same terms as Perl itself, either Perl version 5.12.3 or, at your option,
any later version of Perl 5 you may have available.