Merge pull request #95 from catalystneuro/docs
Enhance docs
CodyCBakerPhD authored Sep 29, 2024
2 parents 6978fe4 + 7a93e12 commit b892c50
Showing 2 changed files with 72 additions and 10 deletions.
README.md (71 additions, 9 deletions)
@@ -8,26 +8,88 @@ This repository houses conversion pipelines for the IBL data releases, including

# Installation

For Brain Wide Map,

```
git clone https://github.com/catalystneuro/IBL-to-nwb
cd IBL-to-nwb
pip install -e .[brainwide_map]
```

For the exact environment used for the initial conversion, see `src/ibl_to_nwb/_environments`.

It is recommended to follow a similar approach for future conversions to leave a record of provenance.



# Running data conversions

## NeuroConv structure

NeuroConv has two primary classes for handling conversions.

An `Interface` reads a single data stream (such as DLC pose estimation) and creates one or more neurodata objects, adding them to an in-memory `pynwb.NWBFile` object via its `.add_to_nwbfile` method. Before that step, it can also fetch and set local `metadata: dict` values for use or modification.
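A minimal sketch of that pattern, using a hypothetical `ExamplePoseInterface` (real interfaces subclass NeuroConv's `BaseDataInterface`):

```
from pynwb import NWBFile, TimeSeries

class ExamplePoseInterface:
    """Hypothetical interface that reads one data stream (pose estimation)."""

    def get_metadata(self) -> dict:
        # Fetch local metadata values for later use or modification.
        return {"Pose": {"name": "pose_estimate", "description": "DLC pose estimation."}}

    def add_to_nwbfile(self, nwbfile: NWBFile, metadata: dict) -> None:
        # Create a neurodata object and attach it to the in-memory file.
        pose_series = TimeSeries(
            name=metadata["Pose"]["name"],
            description=metadata["Pose"]["description"],
            data=[[0.0, 0.0], [1.0, 1.5]],  # x/y position per frame
            unit="px",
            rate=60.0,
        )
        nwbfile.add_acquisition(pose_series)
```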

The `Converter` orchestrates the conversion by combining multiple interfaces and can also attach additional metadata to the NWB file. It is responsible for creating the NWB file that is saved to disk.
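To make the division of labor concrete, here is a library-free sketch of that orchestration (an illustration of the pattern only, not the actual `neuroconv.NWBConverter` implementation):

```
from datetime import datetime, timezone
from pynwb import NWBFile, NWBHDF5IO

def run_conversion(interfaces: list, metadata: dict, nwbfile_path: str) -> None:
    # The converter owns the in-memory NWBFile...
    nwbfile = NWBFile(
        session_description=metadata["NWBFile"]["session_description"],
        identifier=metadata["NWBFile"]["identifier"],
        session_start_time=datetime.now(timezone.utc),
    )
    # ...delegates populating it to each interface...
    for interface in interfaces:
        interface.add_to_nwbfile(nwbfile=nwbfile, metadata=metadata)
    # ...and is responsible for writing the result to disk.
    with NWBHDF5IO(nwbfile_path, mode="w") as io:
        io.write(nwbfile)
```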

## Metadata

Anywhere you see human-readable, handwritten text in the NWB files, it was most likely copied from the public IBL Google documents and stored in the `.yaml` files found in `src/ibl_to_nwb/_metadata`.

Occasionally, especially when a portion of the text is pulled from source data, these values may be overwritten in the `.add_to_nwbfile` step of an interface, so always be sure to check there as well.
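As a rough sketch of that flow (the file name below is a hypothetical placeholder; see `src/ibl_to_nwb/_metadata` for the real ones):

```
import yaml  # PyYAML

with open("src/ibl_to_nwb/_metadata/general_metadata.yaml") as stream:  # hypothetical name
    metadata = yaml.safe_load(stream)

# An interface's .add_to_nwbfile step may later replace a value with text
# pulled from the source data, so the .yaml file is not always the final word.
metadata["NWBFile"]["experiment_description"] = "Text pulled from source data."
```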

## Raw only

Open the script `src/ibl_to_nwb/_scripts/convert_brainwide_map_raw_only.py`.

Change any values at the top as needed, such as the `session_id` (equivalent to the 'eid' of ONE).
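If you need to look up an eid first, one way is to search via the ONE API (a sketch assuming the public server; the subject name is a placeholder):

```
from one.api import ONE

one = ONE(base_url="https://openalyx.internationalbrainlab.org")
eids = one.search(subject="example_subject")  # placeholder subject
session_id = eids[0]  # the 'eid' to paste into the script
```

The same lookup applies to the processed-only script below.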

Then run the script.

## Processed only

Open the script `src/ibl_to_nwb/_scripts/convert_brainwide_map_processed_only.py`.

Change any values at the top as needed, such as the `session_id` (equivalent to the 'eid' of ONE).

Then run the script.



# Upload to DANDI

Set the environment variable `DANDI_API_KEY`, which you can obtain by clicking on your initials in the top right of https://dandiarchive.org/dandiset.
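For example, in a POSIX shell (the value is a placeholder):

```
export DANDI_API_KEY="<your-api-key>"
```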

In a fresh environment, install the DANDI CLI:

```
pip install dandi
```

(This has not been tested on all platforms or Python environments.)

Download a shell of the dandiset:

```
dandi download DANDI:000409 --download dandiset.yaml
```

All outputs from the conversion scripts should be pre-organized, so you can directly move all the `sub-` folders from the conversion output directory into the Dandiset folder. The layout should look like:

```
|- 000409
|  |- sub-CSH-ZAR-001
|  |  |- sub-CSH-ZAR-001_ses-3e7..._desc-processed_behavior+ecephys.nwb
|  |  |- sub-CSH-ZAR-001_ses-3e7..._desc-raw_ecephys+image.nwb
|  |  |- ...
|  |- ...
```
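For example, assuming the conversion wrote its outputs to a hypothetical `conversion_output/` directory alongside the Dandiset folder:

```
mv conversion_output/sub-* 000409/
```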

From a working directory of `000409`, you can either scan for validations directly with:

```
dandi validate .
```

Of course, all assets ought to be valid, so you could also just directly upload the data to DANDI (this will also run validation as it iterates through the files):

```
dandi upload
```
src/ibl_to_nwb/datainterfaces/_ibl_sorting_extractor.py (1 addition, 1 deletion)

```
@@ -1,4 +1,4 @@
-"""The interface for loadding spike sorted data via ONE access."""
+"""The interface for loading spike sorted data via ONE access."""
 
 from collections import defaultdict
 from typing import Dict, Optional, Union
```
