Made 3 important hidden parameters visible to the user #809

Open · wants to merge 3 commits into main

Conversation

naterenegar

Summary

Hello!

In my experience using kilosort, I've had to dig into the code to understand the output I was getting. I found and documented three parameters that impact the performance of the algorithm: loc_range, long_range, and max_peels. I've tested that these work with both the GUI and the API. I've also added a DEBUG-level logger warning during matching pursuit.

Universal Template Parameters

loc_range and long_range control how far to look across channels and in time around a candidate spike to determine (1) whether it is a local maximum (loc_range) and (2) whether it is isolated (long_range). These were hard-coded to loc_range = [4, 5] and long_range = [6, 30], but the user should be able to adjust them if (1) their sampling rate demands a different range of samples or (2) the channel ordering in their binary file is non-topographic. That is, adjacent channels in the binary may not be adjacent on the probe, but the peak detection and isolation treat them as if they were. This is the case for my data, so I've been running the algorithm with the channel range set to 0 to improve the construction of universal templates.
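For concreteness, here is a minimal sketch of the general pattern (my own illustration, not the actual spikedetect.py code; the function name, threshold, and isolation rule are assumptions): a threshold crossing must be the maximum of its loc_range neighborhood and the only such peak within the larger long_range neighborhood.

```python
# Hedged sketch of the loc_range / long_range roles on a (channels x time)
# array; NOT the actual spikedetect.py code. Names and the isolation rule
# here are assumptions.
import numpy as np
from scipy.ndimage import maximum_filter, uniform_filter

def detect_isolated_peaks(X, threshold=5.0, loc_range=(4, 5), long_range=(6, 30)):
    """Return (channel, time) indices of threshold crossings that are local
    maxima within loc_range and the only peak within long_range."""
    loc_size = (2 * loc_range[0] + 1, 2 * loc_range[1] + 1)
    long_size = (2 * long_range[0] + 1, 2 * long_range[1] + 1)

    # (1) threshold crossing that is (2) the maximum of its small neighborhood
    peaks = (X > threshold) & (X == maximum_filter(X, size=loc_size))

    # (3) isolation: count peaks in the large neighborhood and keep only
    # peaks that see no other peak there.
    n_nearby = uniform_filter(peaks.astype(float), size=long_size) * np.prod(long_size)
    return np.argwhere(peaks & (np.round(n_nearby) == 1))

# Example: one isolated event is kept, two nearby events reject each other.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 2000))
X[5, 500] = 10.0                  # isolated -> detected
X[8, 1000] = 10.0
X[9, 1010] = 9.0                  # within long_range of the event above -> both rejected
print(detect_isolated_peaks(X))
```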

Matching Pursuit: max_peels

The number of matching-pursuit iterations required to detect all spikes varies with the network properties of the recording. In my data, the network oscillates between quiet and high-frequency states. During the high-frequency states, there are many overlapping spikes that require more than the hard-coded 100 iterations of matching pursuit. For my particular data, I increased the iteration count to 500 to detect most spikes during these network bursts.

To make this more transparent to other users, I've exposed the iteration count as a parameter (max_peels) and added a logger.debug warning that fires when spikes are still detected on the last iteration of matching pursuit.
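Here is a self-contained toy of the peeling loop and the warning (illustrative only; the greedy single-spike peel and function names are my own simplifications, not the template_matching.py code):

```python
# Toy of the "peeling" pattern with a max_peels cap and a DEBUG message when
# the cap is reached while spikes are still being found.
import logging
import numpy as np

logger = logging.getLogger(__name__)

def matching_pursuit_1d(trace, template, threshold=3.0, max_peels=100):
    """Greedily subtract `template` from `trace` wherever the correlation
    exceeds `threshold`, for at most `max_peels` iterations."""
    trace = trace.copy()
    spike_times = []
    nt = len(template)
    for _ in range(max_peels):
        cc = np.correlate(trace, template, mode="valid")
        t = int(np.argmax(cc))
        if cc[t] < threshold:
            break                        # residual is below threshold: done
        spike_times.append(t)
        trace[t:t + nt] -= template      # peel the fitted template out
    else:
        # Ran out of iterations while still finding spikes: overlapping
        # spikes may remain unsubtracted in the residual.
        logger.debug("Spikes still detected at max_peels=%d; consider raising it.",
                     max_peels)
    return spike_times, trace

# Two overlapping copies of the template are both recovered...
logging.basicConfig(level=logging.DEBUG)
template = np.array([0.0, 1.0, 4.0, 1.0, 0.0])
trace = np.zeros(50)
trace[10:15] += template
trace[12:17] += template
print(matching_pursuit_1d(trace, template, max_peels=10)[0])   # [10, 12]
# ...but with max_peels=1 the cap is hit and the DEBUG message fires.
print(matching_pursuit_1d(trace, template, max_peels=1)[0])    # [10]
```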

Modified Files

parameters.py: Added entries for the three parameters
gui/settings_box.py: Added handling for the list format of loc_range and long_range (see the parsing sketch after this list)
spikedetect.py: Added code to fetch the loc_range and long_range parameters
template_matching.py: Added code to fetch the max_peels parameter and emit a warning
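
For the settings_box change, here is a hedged sketch of how a list-valued text field could be converted into the [a, b] format used by loc_range and long_range; the helper name and validation rules are assumptions, not the actual gui/settings_box.py code.

```python
# Hypothetical helper for parsing list-valued parameters (e.g. loc_range,
# long_range) from a GUI text field; not the actual settings_box.py code.
from ast import literal_eval

def parse_list_parameter(text, expected_length=2):
    """Accept inputs like "[4, 5]" or "4, 5" and return a list of ints."""
    text = text.strip()
    if not text.startswith("["):
        text = f"[{text}]"
    value = literal_eval(text)          # safely evaluate the literal
    if (not isinstance(value, (list, tuple)) or len(value) != expected_length
            or not all(isinstance(v, int) for v in value)):
        raise ValueError(f"Expected {expected_length} integers, got: {text}")
    return list(value)

# Example
print(parse_list_parameter("4, 5"))     # -> [4, 5]
print(parse_list_parameter("[6, 30]"))  # -> [6, 30]
```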

… when the last iteration of matching pursuit is reached but spikes are still detected, along with a parameter to increase the number of matching pursuit iterations.
…ault parameter settings for loc range and long range
@marius10p
Contributor

Hi, thanks for the pull request, and sorry for the delay. loc_range and long_range are not very important for the construction of the single-channel waveforms that eventually make up the universal templates. They are used as "interdiction areas" around single-channel spike detection, which is used to extract single-channel PCs and centroids. Even if the channels are in non-topographic order, this will at most prevent a few spikes from being detected if they happen at similar times to other large spikes. This is not a problem: we don't care about detecting every spike in this particular step, and lots of different ways of doing this will result in very similar outcomes with respect to the wiggles that become the PCs and the wTEMP waveforms.

We prefer not to expose these parameters: it adds to the burden of maintaining exposed parameters and risks confusing users, and I expect it to have no impact at all. If you have a clear example of this helping on your data, please do show us.

The other change is useful. Maybe we can isolate that to a separate pull request. I would still think that all spikes from a well isolated unit should be detected in the first 100 iterations of "peeling". Do you have a clear example where spikes from a well-isolated single unit were not all found in the first 100 iterations?

@naterenegar
Author

Hi, I agree that the universal template parameters aren't that important, especially since the data is already subsampled to construct the templates. The non-topographic issue isn't too bad -- while there will be instances where spikes from far away violate the interdiction area, this won't be all spikes, and the PCA will do just fine. I think the sampling rate issue is slightly more important, but a hardcoded value will likely capture most people's setups in the 20-30 kHz range.

As for the peeling, I can find an example for you and post it later -- in my data, most units fire off a burst of spikes in a 100-300 ms window. IIRC, the issue was with these "interdiction areas" in the template-matching step, in exactly this case where many physically close units fire at a high rate.

Out of curiosity, have you ever seen a case where subtracting a template creates a detectable spike on the next iteration? This was a worry I had about increasing the number of iterations. It should be unlikely in an OMP setup, but the templates here aren't necessarily orthogonal.
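
For concreteness, a toy numeric sketch of the non-orthogonality point (illustrative vectors of my own, not Kilosort templates): peeling the best fit of one template can raise the best-fit amplitude of an overlapping template, so a unit that looked sub-threshold before a peel can cross threshold on a later iteration.

```python
# Toy numbers (mine, not Kilosort templates): with non-orthogonal templates,
# peeling the best fit of one template can RAISE the best-fit amplitude of
# another overlapping template.
import numpy as np

t1 = np.array([1.0, 1.0, -1.0, 0.0])
t2 = np.array([0.0, -1.0, 1.0, 1.0])       # t1 @ t2 == -2, so not orthogonal

def amp(x, t):
    """Best-fit (least-squares) amplitude of template t in signal x."""
    return (x @ t) / (t @ t)

w = 2 * t1 + 2 * t2                        # both units fired with amplitude 2
print("t2 amplitude before peeling t1:", amp(w, t2))          # ~0.67
residual = w - amp(w, t1) * t1             # subtract the best fit of t1
print("t2 amplitude after peeling t1: ", amp(residual, t2))   # ~1.11
```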
