Raku package with implementations of the K-Dimensional Tree (K-D Tree) algorithm.
Remark: This package should not be confused with "Algorithm::KdTree", [ITp1], which provides Raku bindings to a C-implementation of the K-D Tree algorithm. (A primary motivation for making this package, "Algorithm::KDimensionalTree", was to have a pure-Raku implementation.)
Remark: The versions "ver<0.1.0+>" with the API "api<1>" of this package are better utilized via the package "Math::Nearest".
Compared to the simple scanning algorithm this implementation of the K-D Tree algorithm is often 2÷15 times faster for randomly generated low dimensional data. (Say. less than 6D.) It can be 200+ times faster for "real life" data, like, Geo-locations of USA cities.
The implementation is tested for correctness against Mathematica's Nearest
.
(See the resource files.)
- Finds both top-k Nearest Neighbors (NNs).
- See the method
k-nearest
.
- See the method
- Finds NNs within a ball with a given radius.
- See the method
nearest-within-ball
.
- See the method
- Can return indexes and distances for the found NNs.
- The result shape is controlled with the adverb
:values
.
- The result shape is controlled with the adverb
- Works with arrays of numbers, arrays of arrays of numbers, and arrays of pairs.
- Utilizes distance functions from "Math::DistanceFunctions".
- Which can be specified by their string names, like, "bray-curtis" or "cosine-distance".
- Allows custom distance functions to be used.
From Zef ecosystem:
zef install Algorithm::KDimensionalTree
From GitHub:
zef install https://github.com/antononcube/Raku-Algorithm-KDimensionalTree.git
use Algorithm::KDimensionalTree;
use Data::TypeSystem;
use Text::Plot;
Make a random set of points:
my @points = ([(^100).rand, (^100).rand] xx 30).unique;
deduce-type(@points);
my $kdTree = Algorithm::KDimensionalTree.new(@points);
say $kdTree;
Use as a search point one from the points set:
my @searchPoint = |@points.head;
Find 6 nearest neighbors:
my @res = $kdTree.k-nearest(@searchPoint, 6);
.say for @res;
Plot the points, the found nearest neighbors, and the search point:
my @point-char = <* ⏺ ▲>;
say <data nns search> Z=> @point-char;
say text-list-plot(
[@points, @res, [@searchPoint,]],
:@point-char,
x-limit => (0, 100),
y-limit => (0, 100),
width => 60,
height => 20);
- TODO Implementation
- DONE Using distance functions from an "universal" package
- E.g. "Math::DistanceFunctions"
- DONE Using distance functions other than Euclidean distance
- DONE Returning properties
- DONE Points
- DONE Indexes
- DONE Distances
- DONE Labels
- DONE Combinations of those
- This is implemented by should be removed.
- There is another package -- "Math::Nearest" -- that handles all nearest neighbors finders.
- Version "0.1.0 with "api<1>" is without the
.nearest
method.
- DONE Having an umbrella function
nearest
- Instead of creating a KDTree object etc.
- This might require making a functor
nearest-function
- This is better done in a different package
- See "Math::Nearest"
- TODO Make the nearest methods work with strings
- For example, using Hamming distance over a collection of words.
- Requires using the distance function as a comparator for the splitting hyperplanes.
- This means, any objects can be used as long as they provide a distance function.
- DONE Using distance functions from an "universal" package
- DONE Extensive correctness tests
- Derived with Mathematica / WL (see the resources)
- TODO Documentation
- DONE Basic usage examples with text plots
- TODO More extensive documentation with a Jupyter notebook
- Using "JavaScript::D3".
- TODO Corresponding blog post
- MAYBE Corresponding video
[AAp1] Anton Antonov, Math::DistanceFunctions Raku package, (2024), GitHub/antononcube.
[AAp2] Anton Antonov, Math::Nearest Raku package, (2024), GitHub/antononcube.
[AAp3] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.
[AAp4] Anton Antonov, Text::Plot Raku package, (2022), GitHub/antononcube.
[ITp1] Itsuki Toyota, Algorithm::KdTree Raku package, (2016-2024), GitHub/titsuki.