Skip to content

Latest commit

 

History

History
157 lines (116 loc) · 5.18 KB

README-work.md

File metadata and controls

157 lines (116 loc) · 5.18 KB

Algorithm::KDimensionalTree

Raku package with implementations of the K-Dimensional Tree (K-D Tree) algorithm.

Remark: This package should not be confused with "Algorithm::KdTree", [ITp1], which provides Raku bindings to a C-implementation of the K-D Tree algorithm. (A primary motivation for making this package, "Algorithm::KDimensionalTree", was to have a pure-Raku implementation.)

Remark: The versions "ver<0.1.0+>" with the API "api<1>" of this package are better utilized via the package "Math::Nearest".

Compared to the simple scanning algorithm this implementation of the K-D Tree algorithm is often 2÷15 times faster for randomly generated low dimensional data. (Say. less than 6D.) It can be 200+ times faster for "real life" data, like, Geo-locations of USA cities.

The implementation is tested for correctness against Mathematica's Nearest. (See the resource files.)

Features

  • Finds both top-k Nearest Neighbors (NNs).
    • See the method k-nearest.
  • Finds NNs within a ball with a given radius.
    • See the method nearest-within-ball.
  • Can return indexes and distances for the found NNs.
    • The result shape is controlled with the adverb :values.
  • Works with arrays of numbers, arrays of arrays of numbers, and arrays of pairs.
  • Utilizes distance functions from "Math::DistanceFunctions".
    • Which can be specified by their string names, like, "bray-curtis" or "cosine-distance".
  • Allows custom distance functions to be used.

Installation

From Zef ecosystem:

zef install Algorithm::KDimensionalTree

From GitHub:

zef install https://github.com/antononcube/Raku-Algorithm-KDimensionalTree.git

Usage examples

Setup

use Algorithm::KDimensionalTree;
use Data::TypeSystem;
use Text::Plot;

Set of points

Make a random set of points:

my @points = ([(^100).rand, (^100).rand] xx 30).unique;
deduce-type(@points);

Create the K-dimensional tree object

my $kdTree = Algorithm::KDimensionalTree.new(@points);

say $kdTree;

Nearest k-neighbors

Use as a search point one from the points set:

my @searchPoint = |@points.head;

Find 6 nearest neighbors:

my @res = $kdTree.k-nearest(@searchPoint, 6);
.say for @res;

Plot

Plot the points, the found nearest neighbors, and the search point:

my @point-char =  <* ⏺ ▲>;
say <data nns search> Z=> @point-char;
say text-list-plot(
[@points, @res, [@searchPoint,]],
:@point-char,
x-limit => (0, 100),
y-limit => (0, 100),
width => 60,
height => 20);

TODO

  • TODO Implementation
    • DONE Using distance functions from an "universal" package
      • E.g. "Math::DistanceFunctions"
    • DONE Using distance functions other than Euclidean distance
    • DONE Returning properties
      • DONE Points
      • DONE Indexes
      • DONE Distances
      • DONE Labels
      • DONE Combinations of those
      • This is implemented by should be removed.
        • There is another package -- "Math::Nearest" -- that handles all nearest neighbors finders.
        • Version "0.1.0 with "api<1>" is without the .nearest method.
    • DONE Having an umbrella function nearest
      • Instead of creating a KDTree object etc.
      • This might require making a functor nearest-function
      • This is better done in a different package
    • TODO Make the nearest methods work with strings
      • For example, using Hamming distance over a collection of words.
      • Requires using the distance function as a comparator for the splitting hyperplanes.
      • This means, any objects can be used as long as they provide a distance function.
  • DONE Extensive correctness tests
    • Derived with Mathematica / WL (see the resources)
  • TODO Documentation
    • DONE Basic usage examples with text plots
    • TODO More extensive documentation with a Jupyter notebook
      • Using "JavaScript::D3".
    • TODO Corresponding blog post
    • MAYBE Corresponding video

References

[AAp1] Anton Antonov, Math::DistanceFunctions Raku package, (2024), GitHub/antononcube.

[AAp2] Anton Antonov, Math::Nearest Raku package, (2024), GitHub/antononcube.

[AAp3] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.

[AAp4] Anton Antonov, Text::Plot Raku package, (2022), GitHub/antononcube.

[ITp1] Itsuki Toyota, Algorithm::KdTree Raku package, (2016-2024), GitHub/titsuki.