
Basic Usage walkthrough


First steps

Prepare your data

For each experiment you want to analyze as a whole, put all .abf/.smr files in one folder. Crabsort will produce an error if you have files with different numbers of channels or different channel names. So, if you changed the protocol halfway through, put those files in another folder to analyze separately. If you're not sure whether all your ABF files are in the same format, run crabsort.convert2crabFormat in the directory. This will convert all ABF files into MATLAB-compatible .crab files (these are simply .mat files) and will also "harmonize" all files by making sure you have the same number of channels, and so on.
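
For example, a minimal sketch of this step from the MATLAB command window (the folder path below is a placeholder for your own experiment folder):

% go to the folder holding all .abf files for one experiment (placeholder path)
cd('path/to/experiment_folder')
% convert every ABF file here to a .crab (.mat) file and harmonize the channels
crabsort.convert2crabFormat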

Launch crabsort

In the MATLAB command window, type crabsort – this will open the program.

Load data

Click on "select data file" and load the first .abf file you want to work on. I usually start with some file in the middle of a baseline recording. Crabsort will now check that all the .abf/.smr files in your folder are compatible.

Channel information

Once you load the file, you should get a window that looks similar to this. Hide any channels you don't want to look at or analyze spikes for. You can show hidden channels again by checking the channel under the "channels" dropdown menu.

If you hover over the top left of each channel, a set of options appears. The zoom options are the most commonly used. Click either + or - to change the x-scale. (Whenever you change the scale, you need to click the magnifying glass again, so it is no longer blue, before you can do anything else.) Adjust the x-scale so you can see a few cycles, or enough to tell whether you are capturing all the spikes you want to identify. You can also open the "View" menu at the top and select "full trace" to see the whole file.

Now label the channels. For example, this is an lvn.

Once you label a channel and sort spikes on it, the label is permanent and applies to all files in the folder. If you make a mistake, go to the folder containing the spikes (see below for its location) and delete the crabsort.common file, then start again. Because this can mess everything up, it's usually best to label all the channels right away so you don't need to go back later.

  • If you need to, you can customize channel names and which cells appear on which nerves in the preference file. Type "edit pref" into MATLAB to see what you can change. If you make edits, go to crabsort > Tools > reload Preferences to load the new prefs.

Manually sorting spikes

Ignoring artifacts in a file

Detecting spikes

Choose whether you want to find +ve or -ve spikes on this channel, depending on the recording quality/relative amplitudes. Click the button to toggle between them.

Click "Find spikes…" – a new dialogue box will pop up with spike detection options. Play around with these (particularly the spike prominence slider) until you've selected all the spikes you need. It is best to err on the side of selecting too many spikes rather than too few. "Putative spikes" are marked with magenta circles.

Sorting spikes

Close the Spike detection parameters window. The magenta circles should now turn into green circles. Now click on the dropdown menu in the dimensionality reduction box; there are several choices for how to cluster your spikes:

  • AmplitudeShapePCA – PCA, but it prioritizes the amplitude of spikes as one of the principal components. Use this one if you want PCA and your spikes are of distinctly different heights.
  • PCA – usually the best choice for extracellular data. Quickest clustering method. (for this example, I used PCA)
  • Pass – use only for channels in which you are SURE all spikes are real and there is only one unit on the channel (e.g., a really clean pdn or intracellular data). Use this if you plan to assign all putative spikes to a single neuron.
  • SpikePeakValue – sorts based on spike amplitude alone
  • tSNE – an alternate clustering method that maximizes distance between clusters. It can be more accurate than PCA, but takes more computational power and more time to run.
  • uMAP – the most powerful dimensionality reduction algorithm out there. You will need an external toolbox (https://github.com/sg-s/umap-matlab-wrapper) for this to work; see the note after this list.
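
If you want to use uMAP, one way to make the wrapper visible to MATLAB is to add it to your path after cloning the repository above (a hedged sketch; the path is a placeholder, and your installation steps may differ):

% add the uMAP wrapper toolbox to the MATLAB path (placeholder path)
addpath('path/to/umap-matlab-wrapper')
% optionally keep it on the path across sessions
savepath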

Now, in the Cluster & Sort box, select "Manual Cluster" from the dropdown menu. (If you used the "Pass" clustering method, you should click "AllToNeuron" and you're done with spike sorting.)

A new dialogue box will appear with a scatterplot of all the spikes, clustered in some way. Each dot is one spike. You can click on any dot, and a thin red line in the trace plot will indicate which spike that dot corresponds to. (You can change the size of this "context window" in the pref file if it is too zoomed in or out for you; I prefer 0.5 s.) If you click on the dropdown menu, you can assign this spike to any of the neurons on the nerve, or to noise. (In the spike I have highlighted here, it's noise because it is just a reverberation near the LP spike.)

Click around a bunch in any given cluster. If all the spikes in one cluster seem to be consistent, then you can label them all as a group by clicking on the neuron name (or 'Noise') in the dropdown menu. The whole background of the box will then become colored. This lets you know you can now draw a circle around the spikes you want to attribute to a category.

The dots will now all turn a color, indicating that they are categorized. Continue for all the clusters you want to ID. Any dots you don’t categorize will be automatically set as noise (so you don’t need to do 100%).

Close the clustering window.

Spike verification and cleanup

Now check that all the spikes in this file are correctly sorted. You can use the manual override options on the far right.

Select which cell you want to add, then click on the peak to add a spike as that cell. Alternatively, right-click (it's easiest with a mouse, but if you don't have one you can ctrl+click) to remove a spike classification. For example, if a PD was marked as an LP by mistake, I would right-click that peak to remove the LP marker, then select PD from the dropdown menu in the manual override box. I'd click the peak and it would then be marked as a PD. Continue until you've correctly labeled all spikes in the file. This is often the most time-consuming part. If you find you have a ton of misclassified spikes, go back and re-cluster, and maybe try a different clustering technique.

Shortcuts:

  • Shift + up arrow goes to the statistically weirdest spike. Useful to find outliers
  • Shift + down arrow goes to the next statistically weird spike. Useful to find outliers

The Neural network

Training a new neural network

  • Once you are happy with your spikes, press "g" on the keyboard to generate a neural network from your spike data. (If this doesn't work, check that you don't have any zoom keys or other plotting tools highlighted, then click on the trace and try again.)
    • You'll know it's working because the "No data" label that used to be under the channel name on the far left will now say "Training". The number below it indicates the training accuracy.
    • Wait until the neural net either says “idle”, or it starts to waver around some percentage without going up or down much. Ideally, you want it to be 100% and IDLE. It will stop training when accuracy is above 98%. This threshold can be changed in your pref.m file.

In order to properly train all channels in a file, your work order should be as follows:

  1. Sort spikes manually on channel A
  2. Train on channel A
  3. Sort spikes manually on channel B
  4. Train on channel B

Continue until you have generated a neural net for all the channels you want to analyze.

Switching to a new file

  • Now click through to the next file in the data set (the right arrow next to the "select data file" dropdown menu). You can also go to the next file by pressing the right arrow on your keyboard. Click on the trace you want to analyze and press "p" on the keyboard, which will predict the spikes based on your neural net.

  • Check how well the neural net did. There will be black triangles over spikes the neural net is not sure about. Pressing the spacebar jumps to the next uncertain spike. However, it is good to check all the spikes, at least on the first file you try. Right-click on the triangles to remove them. Repeat the "Verification" step, re-categorizing anything that has been mislabeled. You'll notice that, on the left of the file, the neural net is watching everything you do and re-training based on your manual inputs.

  • Repeat with a few other files in the data set, loading the file and pressing “p” to predict the spikes. Typically, I next try a file from the end of the experiment (usually wash data) and one or two in the middle of any perturbation I performed, to make sure the network does OK with all the files.

    • You’re never going to be 100% perfect or get rid of all possible uncertainty. Do as much as you can, until the large majority of spikes are properly labeled.

Making predictions

Go to the Automate dropdown menu and select how you want the program to automate. In this case, I want to do "This channel, all files". Click that, then click "Start".

Once you start this, it will go through EVERY file in the folder, so make sure you are confident in your spike sorting and ready to wait a while for the step to complete before you click Start. To abort the automation, click on crabsort > Automate > Stop.

You should watch for a few files at least to make sure this is working properly. You’ll be able to see MATLAB finding spikes and sorting in real time.

Analyzing Spike Data

Data Storage

  • Your spikes will be stored in a centralized folder. To see where this folder is, use getpref('crabsort','store_spikes_here'). To change this location, use setpref('crabsort','store_spikes_here',NEW_LOCATION).
  • The crabsort object is a structure that can be loaded in MATLAB using the command
load('filename….crabsort','-mat');

It has a bunch of data within it, and each nerve you sorted spikes on should have its own array of spikes stored. For the spike arrays, each data point is the index of the spike peak in the raw trace. Multiply by the data acquisition time step (dt) to get spike times in seconds. E.g., to find the spike times you would do something like this:

LP_sptimes = crabsort_obj.spikes.lvn.LP * crabsort_obj.dt;
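
Putting these pieces together, here is a minimal sketch (the file name below is a placeholder; the *.crabsort extension and the crabsort_obj variable name follow the examples above, but the variable actually stored in your file may be named differently):

% find the central spike store and list the sorted files in it
spike_dir = getpref('crabsort','store_spikes_here');
dir(fullfile(spike_dir,'*.crabsort'))
% load one sorted file; '-mat' tells load to treat it as a MAT-file despite the extension
load(fullfile(spike_dir,'example_file.crabsort'),'-mat')
% convert LP spike indices on the lvn channel into spike times in seconds
LP_sptimes = crabsort_obj.spikes.lvn.LP * crabsort_obj.dt;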

Data Analysis

To get all data in an experiment, extracting spikes for LP and PD:

data = crabsort.consolidate('neurons',{'LP','PD'});

This syntax will return a vector of structures, each element corresponding to one data file.

You’ll see something like this:

data = 

  1×54 struct array with fields:

    LP
    PD
    time_offset
    T
    experiment_idx
    mask
    filename
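
As a rough sketch of how you might loop over this output (the meaning of these fields is inferred from the names above and should be checked against your own data):

% rough sketch: count LP spikes in each file, ignoring any NaN padding
data = crabsort.consolidate('neurons',{'LP','PD'});
for i = 1:length(data)
    n_LP = nnz(isfinite(data(i).LP));
    fprintf('%s: %d LP spikes\n', data(i).filename, n_LP);
end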

If you want all data combined and stacked together, use:

data = crabsort.consolidate('neurons',{'LP','PD'},'stack',true);

Then you will get this:

data = 

  1×7 struct array with fields:

    LP
    PD
    time_offset
    T
    experiment_idx
    mask
    filename

Why are you still getting a vector of structures instead of one structure? That's because in this example, some data files are missing, and crabsort can't know how much time passed during those missing files, so it breaks the data up at the gaps. If all your data is intact, you will get a single data structure.

Metadata

Pulling out raw data