Classification using UMAP and k-means
This small MATLAB utility sorts cells into clusters by their waveforms. It performs dimensionality reduction on averaged waveform snippets, then clusters the reduced data using a k-means algorithm.
You can clone or download this repository and add it to your MATLAB path. It requires the mtools repository. RatCatcher and CMBHOME are optional and are needed only for use with the RatCatcher data pipeline.
A CellSorter object can be instantiated the normal way:
cs = CellSorter;
You can reduce the dimensionality of your data using the dimred function.
Y = cs.dimred(X);
The data, X, should be an M x N matrix, where M is the number of observations and N is the number of features.
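To make the expected layout concrete, here is a minimal sketch that builds a synthetic M x N matrix with one waveform per row and reduces it to two dimensions. It does not call CellSorter: the waveform shapes are made up, and the reduction is a plain PCA computed with the base-MATLAB svd function, standing in for whatever dimred does internally.

```matlab
% Synthetic stand-in for averaged waveform snippets (not real data).
rng(0);                                  % reproducible random numbers
M = 40;                                  % number of observations (cells)
N = 50;                                  % number of features (samples per waveform)
t = linspace(0, 1, N);                   % normalized time axis, 1 x N

% Two made-up waveform shapes: a narrow and a broad negative deflection.
narrow = -exp(-((t - 0.3) / 0.05).^2);
broad  = -exp(-((t - 0.4) / 0.15).^2);

% Stack one waveform per row, so X is M x N.
X = [repmat(narrow, M/2, 1); repmat(broad, M/2, 1)] + 0.05 * randn(M, N);

% PCA via SVD: center each feature, then project onto the first two
% right singular vectors.
nDims = 2;
Xc = X - mean(X, 1);                     % implicit expansion needs R2016b+
[~, ~, V] = svd(Xc, 'econ');
Y = Xc * V(:, 1:nDims);                  % M x nDims reduced data

disp(size(Y));
```

Each row of Y corresponds to the same row of X, so row order is what ties a reduced point back to its cell.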
You can cluster the dimensionally reduced data using the kcluster function.
labels = cs.kcluster(Y);
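As an illustration of what the clustering step produces, the sketch below runs k-means directly on made-up two-dimensional points. It uses the kmeans function from the Statistics and Machine Learning Toolbox rather than CellSorter, and it is an assumption that kcluster returns labels in the same form (one cluster index per row of the input).

```matlab
% Made-up 2-D points standing in for dimensionally reduced data Y.
rng(1);                                  % reproducible random numbers
Y = [randn(30, 2); randn(30, 2) + 8];    % two well-separated blobs, 60 x 2

k = 2;                                   % number of clusters to find

% kmeans is from the Statistics and Machine Learning Toolbox.
% 'Replicates' reruns the algorithm from several random starts and
% keeps the solution with the lowest total within-cluster distance.
labels = kmeans(Y, k, 'Replicates', 5);  % 60 x 1 vector of cluster indices

% Count how many points landed in each cluster.
counts = accumarray(labels, 1);
disp(counts');
```

The cluster indices themselves are arbitrary: which group is called 1 and which is called 2 can change between runs.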
If you are using the RatCatcher data pipeline, you can use the supplied batch function for the CellSorter protocol. This will automatically gather waveform snippets from the specified CMBHOME Session objects.
CellSorter contains several properties:
This property contains a struct generated by the statset built-in function. It contains general options for clustering algorithms (in this case, k-means).
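For reference, this is roughly how a statset struct is built and handed to k-means in plain MATLAB. The sketch uses the toolbox kmeans function directly. How CellSorter forwards its own options struct is not shown here, and the particular values (500 iterations, final-iteration display) are only examples.

```matlab
% Build an options struct. Fields that are not named keep empty
% defaults and are filled in by the function that consumes the struct.
opts = statset('MaxIter', 500, 'Display', 'final');

rng(2);                                  % reproducible random numbers
Y = [randn(25, 2); randn(25, 2) + 6];    % made-up 2-D data, 50 x 2

% Pass the struct through the 'Options' name-value argument.
labels = kmeans(Y, 2, 'Options', opts);
```

With 'Display' set to 'final', kmeans prints a short summary once it finishes; 'off' silences it.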
This property sets the number of clusters to find. This is the 'k' in k-means.
This property sets the number of dimensions that the data is reduced to. It is generally advisable to keep this at the default of 2.
A logical flag. When verbosity = true, CellSorter outputs more informational text.
Which dimensionality-reduction algorithm to use. CellSorter supports pca, t-SNE, FIt-SNE, and UMAP.
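The choice of algorithm can be pictured as a simple dispatch on a name. The sketch below is not CellSorter's code: the variable algorithm is a hypothetical selector, only the two methods that ship with the Statistics and Machine Learning Toolbox (pca and tsne) are wired up, and FIt-SNE and UMAP are left out because they depend on external packages.

```matlab
algorithm = 'pca';                       % hypothetical selector, not a CellSorter property name
nDims = 2;                               % number of output dimensions

rng(3);                                  % reproducible random numbers
X = randn(100, 20);                      % made-up data: 100 observations, 20 features

switch lower(algorithm)
    case 'pca'
        % The second output of pca holds the scores (projected data).
        [~, Y] = pca(X, 'NumComponents', nDims);
    case 't-sne'
        % Built-in t-SNE, available since R2017a.
        Y = tsne(X, 'NumDimensions', nDims);
    otherwise
        error('Sketch:unsupported', ...
            'Algorithm "%s" needs an external package not shown here.', algorithm);
end

disp(size(Y));
```

Either branch returns one row per observation, so the downstream clustering step does not need to know which method was used.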
For fast Fourier transform-accelerated interpolation-based t-SNE, you will need the FIt-SNE package. For uniform manifold approximation and projection, you will need the UMAP MATLAB wrapper.