CLASSIX is a fast and explainable clustering algorithm based on sorting. Here are a few highlights:

Ability to cluster low and high-dimensional data of arbitrary shape efficiently.
Ability to detect and deal with outliers in the data.
Ability to provide textual explanations for the generated clusters.
Full reproducibility of all tests in the accompanying paper.
Support of Cython compilation.

CLASSIX is a contrived acronym of CLustering by Aggregation with Sorting-based Indexing and the letter X for explainability. CLASSIX clustering consists of two phases: a greedy aggregation phase, in which the sorted data is collected into groups of nearby data points, followed by a merging phase, in which groups are combined into clusters. The algorithm is controlled by two parameters: the distance parameter radius, used for the group aggregation, and the parameter minPts, which sets the minimal cluster size.
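
The two phases described above can be illustrated with a small pure-Python sketch. This is not the CLASSIX implementation: the function names are hypothetical, the data is sorted by its first coordinate as a stand-in for a PCA-based sort key, groups are merged when their starting points lie within scale * radius of each other (the value scale=1.5 is an assumption), and clusters smaller than min_pts are simply labelled -1.

```python
import math
from collections import Counter


def aggregate(points, radius):
    """Greedy aggregation of sorted points into groups (illustrative only)."""
    # Assumption: sort by the first coordinate instead of a PCA score.
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    group_of = [-1] * len(points)
    starts = []  # index of the starting point of each group
    for pos, i in enumerate(order):
        if group_of[i] != -1:
            continue  # already absorbed by an earlier group
        group = len(starts)
        starts.append(i)
        group_of[i] = group
        for k in range(pos + 1, len(order)):
            j = order[k]
            # The sort key bounds the distance from below, so once the key
            # gap exceeds radius no later point can be within radius.
            if points[j][0] - points[i][0] > radius:
                break
            if group_of[j] == -1 and math.dist(points[i], points[j]) <= radius:
                group_of[j] = group
    return group_of, starts


def merge(points, group_of, starts, radius, min_pts, scale=1.5):
    """Merge groups with nearby starting points; flag small clusters as -1."""
    parent = list(range(len(starts)))

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for a in range(len(starts)):
        for b in range(a + 1, len(starts)):
            gap = math.dist(points[starts[a]], points[starts[b]])
            if gap <= scale * radius:
                parent[find(a)] = find(b)

    roots = [find(g) for g in group_of]
    sizes = Counter(roots)
    label_of = {}
    labels = []
    for r in roots:
        if sizes[r] < min_pts:
            labels.append(-1)  # assumption: small clusters become outliers
        else:
            labels.append(label_of.setdefault(r, len(label_of)))
    return labels


def cluster(points, radius, min_pts):
    """Run both phases and return one label per input point."""
    group_of, starts = aggregate(points, radius)
    return merge(points, group_of, starts, radius, min_pts)


pts = [(0, 0), (0.1, 0), (0.2, 0.1), (5, 5), (5.1, 5), (5.2, 5.1), (20, 20)]
print(cluster(pts, radius=0.5, min_pts=2))
```

In this toy data set the two tight triples each form one cluster and the isolated point at (20, 20) falls below min_pts. The early break in the inner loop is what makes sorting pay off: each starting point is compared only with points whose sort key is within radius of its own.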

Installing and example

CLASSIX has the following dependencies for its clustering functionality:

cython
numpy
scipy
requests

and requires the following packages for data visualization:

matplotlib
pandas

To install the current CLASSIX release with pip, use:

pip install classixclustering

To check the CLASSIX installation you can use:

python -m pip show classixclustering

Download the repository via:

git clone https://github.com/nla-group/classix.git

Example usage:

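# scikit-learn is used here only to generate the example data
from sklearn import datasets
from classix import CLASSIX

# Generate synthetic data: 2,000,000 points in 10 dimensions around 4 centers
X, y = datasets.make_blobs(n_samples=2000000, centers=4, n_features=10, random_state=1)

# Employ CLASSIX clustering with PCA-based sorting and verbose output
clx = CLASSIX(sorting='pca', verbose=1)
clx.fit(X)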

Citation

@techreport{CG22b,
  title   = {Fast and explainable clustering based on sorting},
  author  = {Chen, Xinye and G\"{u}ttel, Stefan},
  year    = {2022},
  number  = {arXiv:2202.01456},
  pages   = {25},
  institution = {The University of Manchester},
  address = {UK},
  type    = {arXiv EPrint},
  url     = {https://arxiv.org/abs/2202.01456}
}
