MATLAB implementation of Selective Sampling-based Scalable Sparse Subspace Clustering (S5C), NeurIPS '19. The S5C algorithm selects subsamples based on approximated subgradients and scales linearly with the number of data points in both time and memory requirements. It provides theoretical guarantees of the correctness of the solution.
The MEX file representation_learning/cdescentCycleC.mexa64 is built for 64-bit Linux. If you are running on another platform, first compile representation_learning/cdescentCycleC.c to create a MEX file for your platform (see the MATLAB documentation).
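The compilation step might look like the following minimal sketch. It assumes that the MATLAB current folder is the repository root, that a supported C compiler is already configured (run "mex -setup C" once if it is not), and that the source needs no extra compiler flags. The variable names srcFile, outDir, and mexFile are only illustrative.

```matlab
% Sketch only: assumes the current folder is the repository root and
% that cdescentCycleC.c builds without additional flags or libraries.
srcFile = fullfile('representation_learning', 'cdescentCycleC.c');
outDir  = 'representation_learning';

% mexext returns the MEX extension for this platform (e.g. mexa64, mexw64).
mexFile = fullfile(outDir, ['cdescentCycleC.' mexext]);

if ~exist(mexFile, 'file')
    % Place the compiled binary next to the source file.
    mex(srcFile, '-outdir', outDir);
end
```

Writing the binary into representation_learning/ keeps it next to the prebuilt Linux file, so it is found wherever that folder is already on the MATLAB path.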
Examples of how to run the code are given in the run_examples/ directory. Example scripts are provided for all datasets used in the paper.
Five datasets used in the paper (MNIST, Extended Yale B, Hopkins155, Letter-rec, and COIL100) can be found in the datasets directory.
CIFAR-10 and Devanagari are not included due to their size. CIFAR-10 can be downloaded from https://www.cs.toronto.edu/~kriz/cifar.html. Devanagari can be downloaded from https://archive.ics.uci.edu/ml/datasets/Devanagari+Handwritten+Character+Dataset.
When using the code in your research work, please cite "Selective Sampling-based Scalable Sparse Subspace Clustering" by Shin Matsushima and Maria Brbić:
@incollection{matsushima19_s5c,
title = {Selective Sampling-based Scalable Sparse Subspace Clustering},
author = {Matsushima, Shin and Brbi{\'c}, Maria},
booktitle = {Advances in Neural Information Processing Systems 32},
pages = {12416--12425},
year = {2019},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/9408-selective-sampling-based-scalable-sparse-subspace-clustering.pdf}
}