Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Conversion from Intan ephys - electrode positions not present in NWB file #634

Closed
2 tasks done
magland opened this issue Nov 8, 2023 · 18 comments
Closed
2 tasks done
Labels

Comments

@magland
Copy link
Contributor

magland commented Nov 8, 2023

What happened?

We used the recommended script to convert Intan ephys data to NWB file. As part of the conversion we set the probe information using the following.

probe_path = 'C:\\path\\to\\probe.csv'
probe_df = pd.read_csv(probe_path); probe_df = probe_df.drop(columns = ['Unnamed: 0'])
probe = generate_multi_columns_probe(num_columns = 2)
probe = probe.from_dataframe(probe_df)
...
intan_interface.recording_extractor.set_probe(probe)

The resulting NWB does not have rel_x and rel_y columns in the the general/extracellular_ephys/electrodes table. There is a locations column, but it is filled with ['unknown', 'unknown', ...].

@oghenand Could you please provide the probe.csv file? @CodyCBakerPhD was requesting to see this.

Steps to Reproduce

# See above

Traceback

No response

Operating System

Windows

Python Executable

Conda

Python Version

3.7

Package Versions

This screenshot shows that the probe_df has x and y columns.

image

Code of Conduct

@magland magland added the bug label Nov 8, 2023
@magland
Copy link
Contributor Author

magland commented Nov 8, 2023

@alejoe91 is suggesting to try

intan_interface.recording_extractor.set_probe(probe, in_place=True)

Note: in_place=True

@alejoe91
Copy link
Contributor

alejoe91 commented Nov 9, 2023

@CodyCBakerPhD @bendichter if this turns out to work (I'm 99.8% sure about it), we should add a page on Neuroconv docs to show how to properly add a probeinterface.Probe. Where would that page go?

@magland
Copy link
Contributor Author

magland commented Nov 9, 2023

Yes it seems to have worked... thanks @CodyCBakerPhD and @alejoe91

I'm happy to help with the docs too once you decide where that should go.

@CodyCBakerPhD
Copy link
Member

Where would that page go?

I'd recommend a full-fledged (working) tutorial in the user guide with small note as a reminder on each gallery (not in the code, because then it won't be checked or linked but just outside of the code snippet)

@magland
Copy link
Contributor Author

magland commented Nov 9, 2023

Digging deeper into the NWB file, there's something strange with the 'group' column in the electrodes table. They're using a probeinterface probe that was created from a dataframe with 7 columns... none of them are group... but one is shank_id which takes the values 0, 1, 2, 3. So they have four 32-channel shanks. Do you have any idea where the group [135, 118, 91, 125, 6, 0, 0, 0, 135, 118, 91, 125, 6, 0, 0, 0, ...] (repeats in 8) could be coming from? I wonder if it's part of the Intan conversion. Should I have them rename shank_id to group or something?

And there's also a group_name column with every entry 'A'. Not sure where that's coming from either.

@magland
Copy link
Contributor Author

magland commented Nov 9, 2023

Following up here for future reference... Alessio said we should use group_mode="by_shank" in set_probe()

@CodyCBakerPhD
Copy link
Member

Moving this to #635 since the 'bug' part has been resolved and the bigger issue is lack of documentation

@magland
Copy link
Contributor Author

magland commented Nov 9, 2023

Unfortunately, the group_mode="by_shank" did not do the trick.

@CodyCBakerPhD @alejoe91 should I open a separate issue for this?

Here's what they are doing

interface.recording_extractor.set_probe(probe, in_place=True, group_mode='by_shank')

And the resulting table in the NWB file uploaded to DANDI, as viewed by Neurosift:

image

Notice the weird pattern in the group column. The shank IDs are 0, 1, 2, 3

@CodyCBakerPhD
Copy link
Member

@magland to be clear, there seem to be two things doing on here

  1. neurosift rendering of the 'group' - the 'group' is an internal link to the ElectrodeGroup object, not a string or number - not sure how you're currently grabbing that but you can likely just remove it from the column view here or otherwise follow the link and printout the actual name of the thing it points to

  2. the default intan grouping behavior is by port, hence why all of these are 'A' group - this is the only way to group Intan items from the intan metadata header, but it should ideally be overridden in this case by the probe information, which should alter the group names to match the shank ID or something like that - so I'd consult with @alejoe91 on how to get that to work as intended

@CodyCBakerPhD
Copy link
Member

Also if you want to override the 'location' field on the NWB table, I believe SI annotates that as the 'brain_area' property, so just be sure to set that appropriately (or can it be included on the probe spreadsheet too?)

@magland
Copy link
Contributor Author

magland commented Nov 10, 2023

Thanks for that... although I'm pretty confused since I'm not used to thinking in terms of group, shank, probe, port, etc. In the end we somehow need the NwbRecordingExtractor to have a group property that corresponds to the shank_id provided by the lab. Because then we can do spike sorting by shank. I'm not sure how to get there.

@CodyCBakerPhD
Copy link
Member

In NWB

Probe = Device
Shank = ElectrodeGroup

port nomenclature only comes in w.r.t. reading from certain formats that denote such in their metadata

The 'shank_id' here is SI nomenclature for how probes/recording extractors store things in their API, but what matters is the SI group (group_name?) property, which in this example you are trying to equivocate with the 'shank_id', so just need to figure out how to get that group to set properly on the SI object and the NWB conversion should reflect it

@magland
Copy link
Contributor Author

magland commented Nov 10, 2023

Thanks @CodyCBakerPhD for the explanations. Also, you are right about the group column in Neurosift - the periodicity of 8 is just from 8 bytes per pointer - and neurosift somehow was displaying as a byte array.

I saw this code in the IntanRecordingInterface

# Add electrode group
unique_group_name = set(self.recording_extractor.get_property("group_name"))
electrode_group_list = [
dict(
name=group_name,
description=f"Group {group_name} electrodes.",
device="Intan",
location="",
)
for group_name in unique_group_name
]
ecephys_metadata.update(ElectrodeGroup=electrode_group_list)

which means that it's looking for the group_name property on the recording extractor. @alejoe91 I need help figuring out why this is apparently not set properly on the recording extractor after the set_probe() call. I don't have the data myself, @oghenand has the data and has been working on this.

@alejoe91
Copy link
Contributor

@CodyCBakerPhD we found the issue: the IntanRecordingInterface.extract_electrode_metadata() sets the group_name property of the self.recording_exctractor. When loading and setting a probe on the SI side, the group property is set. However, the add_electrode_groups has a precedence on the group_name over the group property to define electrode groups (see here). Therefore, the probe group is ignored.

The immediate solution would be to overwrite the group_name property as well:

intan_interface.recording_extractor.set_property("group_name", intan_interface.recording_extractor.get_property("group").astype(str))

But this is probably not what we want an end user to do :)

Other better solutions:

  1. Have the extract_electrode_metadata set the group property, so that could easily be overwritten by attaching a probe to the extractor

  2. (better) after discusing with @magland we thought that the BaseRecordingExtractorInterface could have a set_probe method that:

  • sets a probe in_place
  • sets the group_name property automatically

Option 2 will also make the API for adding a probe more direct, from intan_interface.recording_extractor.set_probe(...) to:

intan_interface.set_probe(...)

What do you guys think?

@alejoe91 alejoe91 reopened this Nov 10, 2023
@CodyCBakerPhD
Copy link
Member

Yes, set_probe and set_probe_from_file convenience would both be fine to add to the base recording interface

I believe the group_name precedence reasoning was because many recording extractors (across all formats) had minimal identifiers from the old days when groups in SI were just integers, not full strings - and various formats had rich metadata from the headers that could allow for richer string annotation of the group names (post extractor initialization) at which point it was easier to add the group_name field to metadata

I've not checked to see if modern group names have improved, in which case we can get rid of the workaround

To get the immediate conversion working, I'd recommend either

  • @alejoe91 suggestion by overriding the group_name to the expected value
  • or much simpler, just delete the group_name property in which case it will then default to the existing group values

Long term, having one or two lines in the set_probe convenience methods to wipe the previous group name information from the __init__ sounds like the easiest approach

pinging @h-mayorquin on this since they might remember more (or be fresher in their minds) - they were the most recent to touch that code in 92af17b

@CodyCBakerPhD
Copy link
Member

I believe this was fixed by your slew of PRs at the time, correct @magland?

@h-mayorquin
Copy link
Collaborator

Yes, I duble checked and the code discussed is already there:

def set_probe(self, probe, group_mode: Literal["by_shank", "by_probe"]):
    """
    Set the probe information via a ProbeInterface object.

    Parameters
    ----------
    probe : probeinterface.Probe
        The probe object.
    group_mode : {'by_shank', 'by_probe'}
        How to group the channels. If 'by_shank', channels are grouped by the shank_id column.
        If 'by_probe', channels are grouped by the probe_id column.
        This is a required parameter to avoid the pitfall of using the wrong mode.
    """
    # Set the probe to the recording extractor
    self.recording_extractor.set_probe(
        probe,
        in_place=True,
        group_mode=group_mode,
    )
    # Spike interface sets the "group" property
    # But neuroconv allows "group_name" property to override spike interface "group" value
    self.recording_extractor.set_property("group_name", self.recording_extractor.get_property("group").astype(str))

There is still the fact that the problem occurs when someone uses our tools.spikeinterface function but let me open another issue for that.

@h-mayorquin
Copy link
Collaborator

Closing this as the remaining issue can be addressed in #932

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants