diff --git a/.gitignore b/.gitignore index fac0f30..b6ec95d 100644 --- a/.gitignore +++ b/.gitignore @@ -4,6 +4,9 @@ # generated docs docs/source/_format_auto_docs +# developer-specific files +.vscode/ + # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] diff --git a/CHANGELOG.md b/CHANGELOG.md index 448372a..f7ae2ee 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,3 @@ # Changelog for ndx-events -## 0.3.0 (Upcoming) - - +## 0.4.0 (Upcoming) diff --git a/README.md b/README.md index edf5145..0f571a0 100644 --- a/README.md +++ b/README.md @@ -1,58 +1,63 @@ # ndx-events Extension for NWB -This is an NWB extension for storing timestamped event data and TTL pulses. - -The latest version is 0.3.0. This is a major change from previous versions. - -**`EventTypesTable`**: Event types (e.g., lick, reward left, reward right, airpuff, reach) and metadata about them should be stored in an `EventTypesTable` object. -- `EventTypesTable` inherits from `DynamicTable` and stores metadata related to each event type, one per row. -- An "event_name" text column is required. -- A "event_type_description" text column is required. -- The table allows for an arbitrary number of custom columns to be added for additional metadata for each event type. -- This table is intended to live in a `Task` object at the path "general/task" in the `NWBFile`. - -**`EventsTable`**: Event times and metadata about them should be stored in an `EventsTable` object. -- `EventsTable` inherits from `DynamicTable` and stores metadata related to each event time / instance, one per row. -- A "timestamp" column of type `TimestampVectorData` is required. -- A “duration” column of type `DurationVectorData` is optional. -- An “event_type” column that is a foreign key reference to a row index of the `EventTypesTable` is required. -- A "value" text column is optional. This enables storage of another layer of events within an event type. This could store different reward sizes or different tone frequencies or other parameterizations of an event. For example, if you have three levels of reward (e.g., 1 drop, 2 drops, 3 drops), instead of encoding each level of reward as its own event type (e.g., "reward_value_1", "reward_value_2", "reward_value_3", you could encode "reward" as the event type, and the value for each event time could be "1", "2", or "3". -- Because this inherits from `DynamicTable`, users can add additional custom columns to store other metadata. -- This table is intended to live either under the "acquisition" group or in a "behavior" `ProcessingModule`, i.e., under the "processing/behavior" group. - -**`TtlTypesTable`**: TTL pulse types and metadata about them should be stored in a `TtlTypesTable` object. -- `TtlTypesTable` inherits from `EventTypesTable` and stores metadata related to each TTL pulse type, one per row. -- A "pulse_value" unsigned integer column is required. -- This table is intended to live in a `Task` object at the path "general/task" in the `NWBFile`. - -**`TtlsTable`**: TTL pulses and metadata about them should be stored in a `TtlsTable` object. -- `TtlsTable` inherits from `EventsTable`. -- The "event_type" column inherited from `EventsTable` should refer to the `TtlTypesTable`. -- This table is intended to live either under the "acquisition" group or in a "behavior" `ProcessingModule`, i.e., under the "processing/behavior" group. 
- -This extension defines a few additional neurodata types related to storing events: - -**`Task`**: `Task` type is a subtype of the `LabMetaData` type and holds the `EventTypesTable` and `TtlTypesTable`. This allows the `Task` type to be added as a group in the root "general" group. +This is an NWB extension for storing timestamped event data. + +The latest version is 0.4.0. This is a major change from previous versions. + +1. A `TimestampVectorData` type that extends `VectorData` and stores a 1D array of timestamps (float32) in seconds + - Values are in seconds from session start time (like all other timestamps in NWB) + - It has a scalar string attribute named "unit". The value of the attribute is fixed to "seconds". + - It has an optional scalar float attribute named "resolution" that represents the smallest possible difference between two timestamps. This is usually 1 divided by the sampling rate for timestamps of the data acquisition system. (Alternatively, the event sampling rate could be stored.) + - This type can be used to represent a column of timestamps in any `DynamicTable`, such as the NWB `Units` table and the new `EventsTable` described below. +2. A `DurationVectorData` type that extends `VectorData` and stores a 1D array of durations (float32) in seconds. It is otherwise identical to the `TimestampVectorData` type. + - If this is used in a table where some events have a duration and some do not (or it is not known yet), then a value of NaN can be used for events without a duration or with a duration that is not yet specified. If the latter, the mapping should be documented in the description of the `DurationVectorData`. +3. A `CategoricalVectorData` type that extends `VectorData` and stores the mappings of data values (of any type) to meanings. This is an experimental type to evaluate one possible way of storing the meanings (longer descriptions) associated with different categorical values stored in a table column. This can be used to add categorical metadata values to an `EventsTable`. This type will be marked as experimental while the NWB team evaluates possible alternate solutions to annotating the values of a dataset, such as LinkML-based term sets, non-table based approaches, and external mapping files. + - The type contains an object reference to a `MeaningsTable` named "meanings". See below. Unfortunately, because `CategoricalVectorData` is a dataset, it cannot contain a `MeaningsTable` within it, so the `MeaningsTable` is placed in the parent `EventsTable` and referenced by the `CategoricalVectorData`. + - It may also contain an optional 1D attribute named "filter_values" to define missing and invalid values within a data field to be filtered out during analysis, e.g., the dataset may contain one or more of: "undefined" or "None" to signal that those values in the `CategoricalVectorData` dataset are missing or invalid. Due to constraints of NWB/HDMF attributes, attributes must have a dtype, so currently, only string values (not -1 or NaN) are allowed. + - This type is similar to an `EnumData`, which is a `VectorData` of an enumerated type, except that the values stored in the column are strings that are short-hand representations of the concept, as opposed to integers. Storing strings is slightly less efficient than storing integers, but for these use cases, these tables will rarely be large and storing strings directly is more intuitive and accessible to users. +4. 
A `MeaningsTable` type that extends `DynamicTable` with two required columns:
+ - A "value" column that contains all the possible values that could be stored in the parent `CategoricalVectorData` object. For example, if the `CategoricalVectorData` stores the port in which the subject performed a nose poke, the possible values might be "left", "center", and "right". All possible values must be listed, even if not all values are observed, e.g., if the subject does not poke in the "center" port, "center" should still be listed to signal that it was a possible option.
+ - A "meaning" column with string dtype that contains a longer description of the concept. For example, for the value "left", the meaning might be "The subject performed a nosepoke in the left-most port, from the viewpoint of the subject. This is signaled by detection of the port’s infrared beam being broken."
+ - Users can add custom, user-defined columns to provide additional information about the possible values, such as [HED (Hierarchical Event Descriptor)](https://www.hed-resources.org/en/latest/) tags. For HED tags, users may consider using the `HedTags` type, a subtype of `VectorData`, in the [ndx-hed extension](https://github.com/hed-standard/ndx-hed).
+ - As described in `CategoricalVectorData`, this arrangement will be marked as experimental.
+5. An `EventsTable` type for storing a collection of event times that have the same parameterizations/properties/metadata (i.e., they are the same type of event, such as licks, image presentations, or reward deliveries)
+ - It inherits from `DynamicTable` and stores metadata related to each event time / instance, one per row.
+ - It has a required "timestamp" column of type `TimestampVectorData`.
+ - It has an optional "duration" column of type `DurationVectorData`.
+ - Because this inherits from `DynamicTable`, users can add additional custom columns to store other metadata, such as parameterizations of an event, e.g., reward value in uL, image category, or tone frequency.
+ - The "description" of this table should include information about how the event times were computed, especially if the times are the result of processing or filtering raw data. For example, if the experimenter is encoding different types of events using a "strobed" or "N-bit" encoding, then the "description" value should describe which channels were used and how the event time is computed, e.g., as the rise time of the first bit.
+ - It contains a collection of `MeaningsTable` objects referenced by any `CategoricalVectorData` columns. These tables are placed in a subgroup of the EventsTable named "meanings". Alternatively, these `MeaningsTable` objects could be placed under the root `NWBFile`, but it is probably more useful to keep them close to the objects that they describe. As described in `CategoricalVectorData`, this arrangement will be marked as experimental.
+
+The PyNWB and MatNWB APIs would provide functions to create these tables. For example, in PyNWB:
+
+```python
+stimulus_presentation_events = EventsTable(name="stimulus_presentation_events")
+stimulus_presentation_events.add_column("stimulus_type", col_cls=CategoricalVectorData)
+stimulus_presentation_events.add_row(timestamp=1.0, stimulus_type="circle")
+stimulus_presentation_events.add_row(timestamp=4.5, stimulus_type="square")
+nwbfile.add_events_table(stimulus_presentation_events)
+```
-**`TimestampVectorData`**: The `TimestampVectorData` type stores a 1D array of timestamps in seconds. 
-- Values are in seconds from session start time. -- It has a "unit" attribute. The value of the attribute is fixed to "seconds". -- It has a "resolution" attribute that represents the smallest possible difference between two timestamps. Usually 1 divided by the sampling rate for timestamps of the data acquisition system. +The APIs would also provide the following interfaces: +- `nwbfile.events_tables` returns a dictionary of `EventsTable` objects, similar to `nwbfile.acquisition` +- Use `nwbfile.events_tables["stimulus_presentation_events"]` to access an `EventsTable` by name +- `nwbfile.merge_events_tables(tables: list[EventsTable])`, which merges a selection of `EventsTable` objects into a read-only table, sorted by timestamp +- `nwbfile.get_all_events()`, which merges all the `EventsTable` objects into one read-only table, sorted by timestamp -**`DurationVectorData`**: The `DurationVectorData` type that stores a 1D array of durations in seconds. -- It is otherwise identical to the `TimestampVectorData` type. +This extension was developed by Ryan Ly, Oliver Rübel, the NWB Technical Advisory Board, and the NWBEP001 Review Working Group. -This extension was developed by Ryan Ly, Oliver Rübel, and the NWB Technical Advisory Board. Information about the rationale, background, and alternative approaches to this extension can be found here: https://docs.google.com/document/d/1qcsjyFVX9oI_746RdMoDdmQPu940s0YtDjb1en1Xtdw ## Installation -The latest **ndx-events 0.3.0** has not yet been released on PyPI. To install it on Python, use: +The latest **ndx-events 0.4.0** has not yet been released on PyPI. To install it on Python, use: ```bash pip install git+https://github.com/rly/ndx-events.git ``` +ndx-events 0.3.0 was not released on PyPI. + To install the 0.2.0 version, use: Python: ```bash @@ -64,7 +69,13 @@ Matlab: generateExtension('/ndx-events/spec/ndx-events.namespace.yaml'); ``` +## Usage examples + +1. [Example writing TTL pulses and stimulus presentations to an NWB file](examples/write_ttls_events.py). + + ## Developer installation + In a Python 3.8-3.12 environment: ```bash pip install -r requirements-dev.txt diff --git a/docs/source/conf.py b/docs/source/conf.py index ac8c925..a49c8c0 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -10,7 +10,7 @@ copyright = "2024, Ryan Ly" author = "Ryan Ly" -version = "0.3.0" +version = "0.4.0" release = "alpha" # -- General configuration --------------------------------------------------- diff --git a/examples/write_ttls_events.py b/examples/write_ttls_events.py new file mode 100644 index 0000000..a67ecb0 --- /dev/null +++ b/examples/write_ttls_events.py @@ -0,0 +1,178 @@ +""" +Example script that demonstrates how to write an EventsTable with a CategoricalVectorData and associated MeaningsTable +to store raw TTL pulses received by the acquisition system and processed stimulus presentation events. +""" + +from datetime import datetime +from pynwb import NWBHDF5IO + +from ndx_events import ( + EventsTable, + CategoricalVectorData, + MeaningsTable, + NdxEventsNWBFile, +) + +nwbfile = NdxEventsNWBFile( + session_description="session description", + identifier="cool_experiment_001", + session_start_time=datetime.now().astimezone(), +) + +# In this experiment, TTL pulses were sent by the stimulus computer +# to signal important time markers during the experiment/trial, +# when the stimulus was placed on the screen and removed from the screen, +# when the question appeared, and the responses of the subject. 
+
+# ref: https://www.nature.com/articles/s41597-020-0415-9, DANDI:000004
+
+# We will first create an EventsTable to store the raw TTL pulses received by the acquisition system.
+# Storing the raw TTL pulses is not necessary, but it can be useful for debugging and understanding the experiment.
+# Before doing so, we will create a CategoricalVectorData column for the possible integer values for the TTL pulse
+# and associate it with a MeaningsTable that describes the meaning of each value.
+
+pulse_value_meanings_table = MeaningsTable(
+    name="pulse_value_meanings", description="The meanings of each integer value for a TTL pulse."
+)
+pulse_value_meanings_table.add_row(value=55, meaning="Start of experiment")
+pulse_value_meanings_table.add_row(value=1, meaning="Stimulus onset")
+pulse_value_meanings_table.add_row(value=2, meaning="Stimulus offset")
+pulse_value_meanings_table.add_row(value=3, meaning="Question screen onset")
+
+yes_animal_response_description = (
+    "During the learning phase, subjects are instructed to respond to the following "
+    "question: 'Is this an animal?' in each trial. The response is 'Yes, this is an animal'."
+)
+no_animal_response_description = (
+    "During the learning phase, subjects are instructed to respond to the following "
+    "question: 'Is this an animal?' in each trial. The response is 'No, this is not an animal'."
+)
+pulse_value_meanings_table.add_row(value=20, meaning=yes_animal_response_description)
+pulse_value_meanings_table.add_row(value=21, meaning=no_animal_response_description)
+
+new_confident_response_description = (
+    "During the recognition phase, subjects are instructed to respond to the following "
+    "question: 'Have you seen this image before?' in each trial. The response is 'New, confident'."
+)
+new_probably_response_description = (
+    "During the recognition phase, subjects are instructed to respond to the following "
+    "question: 'Have you seen this image before?' in each trial. The response is 'New, probably'."
+)
+new_guess_response_description = (
+    "During the recognition phase, subjects are instructed to respond to the following "
+    "question: 'Have you seen this image before?' in each trial. The response is 'New, guess'."
+)
+old_guess_response_description = (
+    "During the recognition phase, subjects are instructed to respond to the following "
+    "question: 'Have you seen this image before?' in each trial. The response is 'Old, guess'."
+)
+old_probably_response_description = (
+    "During the recognition phase, subjects are instructed to respond to the following "
+    "question: 'Have you seen this image before?' in each trial. The response is 'Old, probably'."
+)
+old_confident_response_description = (
+    "During the recognition phase, subjects are instructed to respond to the following "
+    "question: 'Have you seen this image before?' in each trial. The response is 'Old, confident'." 
+)
+
+pulse_value_meanings_table.add_row(value=31, meaning=new_confident_response_description)
+pulse_value_meanings_table.add_row(value=32, meaning=new_probably_response_description)
+pulse_value_meanings_table.add_row(value=33, meaning=new_guess_response_description)
+pulse_value_meanings_table.add_row(value=34, meaning=old_guess_response_description)
+pulse_value_meanings_table.add_row(value=35, meaning=old_probably_response_description)
+pulse_value_meanings_table.add_row(value=36, meaning=old_confident_response_description)
+
+pulse_value_meanings_table.add_row(value=6, meaning="End of trial")
+pulse_value_meanings_table.add_row(value=66, meaning="End of experiment")
+
+pulse_value_column = CategoricalVectorData(
+    name="pulse_value", description="Integer value of the TTL pulse", meanings=pulse_value_meanings_table
+)
+
+ttl_events_table = EventsTable(
+    name="ttl_events",
+    description="TTL events",
+    columns=[pulse_value_column],
+    meanings_tables=[pulse_value_meanings_table],
+)
+ttl_events_table.add_row(
+    timestamp=6820.092244,
+    pulse_value=55,
+)
+ttl_events_table.add_row(
+    timestamp=6821.208244,
+    pulse_value=1,
+)
+ttl_events_table.add_row(
+    timestamp=6822.210644,
+    pulse_value=2,
+)
+ttl_events_table.add_row(
+    timestamp=6822.711364,
+    pulse_value=3,
+)
+ttl_events_table.add_row(
+    timestamp=6825.934244,
+    pulse_value=31,
+)
+ttl_events_table.timestamp.resolution = 1 / 50000.0  # specify the resolution of the timestamps (optional)
+
+# The data curator may want to create an EventsTable to store more processed information than the TTLs table
+# e.g., converting stimulus onset and offset into a single stimulus event with metadata.
+# This may be redundant with information in the trials table if the task is
+# structured into trials.
+
+stimulus_category_meanings_table = MeaningsTable(
+    name="stimulus_category_meanings", description="The meanings of each stimulus category"
+)
+stimulus_category_meanings_table.add_row(value="smallAnimal", meaning="An image of a small animal was presented.")
+stimulus_category_meanings_table.add_row(value="largeAnimal", meaning="An image of a large animal was presented.")
+stimulus_category_meanings_table.add_row(value="phones", meaning="An image of a phone was presented.")
+
+stimulus_category_column = CategoricalVectorData(
+    name="stimulus_category", description="The category of the stimulus", meanings=stimulus_category_meanings_table
+)
+
+stimulus_presentation_table = EventsTable(
+    name="stimulus_presentations",
+    description="Metadata about stimulus presentations",
+    columns=[stimulus_category_column],
+    meanings_tables=[stimulus_category_meanings_table],
+)
+stimulus_presentation_table.add_column(
+    name="stimulus_image_index", description="Frame index of the stimulus image in the StimulusPresentation object"
+)  # this is an integer. 
+# One could make this a CategoricalVectorData column if there are a limited number of stimulus images and one +# wants to describe each one + +stimulus_presentation_table.add_row( + timestamp=6821.208244, + duration=1.0024, # this comes from the stimulus onset and offset TTLs + stimulus_category="smallAnimal", + stimulus_image_index=0, +) +stimulus_presentation_table.add_row( + timestamp=6825.208244, + duration=0.99484, + stimulus_category="phones", + stimulus_image_index=1, +) +stimulus_presentation_table.timestamp.resolution = 1 / 50000.0 # specify the resolution of the timestamps (optional) +stimulus_presentation_table.duration.resolution = 1 / 50000.0 # specify the resolution of the durations (optional) + +nwbfile.add_events_table(ttl_events_table) +nwbfile.add_events_table(stimulus_presentation_table) + +print(nwbfile.get_all_events()) + +# Write NWB file. +filename = "test_events.nwb" +with NWBHDF5IO(filename, "w") as io: + io.write(nwbfile) + +# Read NWB file and check its contents. +with NWBHDF5IO(filename, "r", load_namespaces=True) as io: + read_nwbfile = io.read() + print(read_nwbfile) + print(read_nwbfile.events["ttl_events"].to_dataframe()) + print(read_nwbfile.events["stimulus_presentations"].to_dataframe()) diff --git a/pyproject.toml b/pyproject.toml index 44b6c3c..908cfb0 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "hatchling.build" [project] name = "ndx-events" -version = "0.3.0" +version = "0.4.0" authors = [ { name="Ryan Ly", email="rly@lbl.gov" } ] @@ -108,6 +108,7 @@ line-length = 120 "src/pynwb/ndx_events/__init__.py" = ["E402", "F401"] "src/spec/create_extension_spec.py" = ["T201"] "src/pynwb/tests/test_example_usage.py" = ["T201"] +"examples/*" = ["T201"] [tool.ruff.lint.mccabe] max-complexity = 17 diff --git a/requirements-dev.txt b/requirements-dev.txt index 07e3a8b..6c5ed85 100644 --- a/requirements-dev.txt +++ b/requirements-dev.txt @@ -7,7 +7,7 @@ hdmf==3.14.4 hdmf-docutils==0.4.7 pre-commit==3.5.0 # latest pre-commit does not support py3.8 pynwb==2.8.2 -pytest==8.2.2 +pytest==8.3.3 pytest-cov==5.0.0 pytest-subtests==0.12.1 python-dateutil==2.8.2 diff --git a/spec/ndx-events.extensions.yaml b/spec/ndx-events.extensions.yaml index 56a3505..0faedac 100644 --- a/spec/ndx-events.extensions.yaml +++ b/spec/ndx-events.extensions.yaml @@ -35,88 +35,89 @@ datasets: doc: The smallest possible difference between two timestamps. Usually 1 divided by the sampling rate for timestamps of the data acquisition system. required: false +- neurodata_type_def: CategoricalVectorData + neurodata_type_inc: VectorData + dims: + - num_events + shape: + - null + doc: A 1-dimensional VectorData that stores categorical data of any type. This is + an experimental type. + attributes: + - name: meanings + dtype: + target_type: MeaningsTable + reftype: object + doc: The MeaningsTable object that provides the meanings of the values in this + CategoricalVectorData object. + - name: filter_values + dtype: text + dims: + - num_events + shape: + - null + doc: Optional dataset containing possible values in the parent data that represent + missing or invalid values that should be filtered out during analysis. Currently, + only string values are allowed. For example, the filter values may contain the + values "undefined" or "None" to signal that those values in the data are missing + or invalid. 
+ required: false groups: -- neurodata_type_def: EventTypesTable +- neurodata_type_def: MeaningsTable neurodata_type_inc: DynamicTable - default_name: EventTypesTable - doc: A column-based table to store information about each event type, such as name, - one event type per row. + doc: A table to store information about the meanings of categorical data. Intended + to be used as a lookup table for the meanings of values in a CategoricalVectorData + object. All possible values of the parent CategoricalVectorData object should + be present in the 'value' column of this table, even if the value is not observed + in the data. Additional columns may be added to store additional metadata about + each value. datasets: - - name: event_name + - name: value neurodata_type_inc: VectorData - dtype: text - doc: Name of each event type. - - name: event_type_description + doc: The value of the parent CategoricalVectorData object. + - name: meaning neurodata_type_inc: VectorData dtype: text - doc: Description of each event type. + doc: The meaning of the value in the parent CategoricalVectorData object. - neurodata_type_def: EventsTable neurodata_type_inc: DynamicTable - default_name: EventsTable doc: A column-based table to store information about events (event instances), one - event per row. Each event must have an event_type, which is a reference to a row - in the EventTypesTable. Additional columns may be added to store metadata about - each event, such as the duration of the event, or a text value of the event. + event per row. Additional columns may be added to store metadata about each event, + such as the duration of the event. + attributes: + - name: description + dtype: text + doc: A description of the events stored in the table, including information about + how the event times were computed, especially if the times are the result of + processing or filtering raw data. For example, if the experimenter is encoding + different types of events using a strobed or N-bit encoding, then the description + should describe which channels were used and how the event time is computed, + e.g., as the rise time of the first bit. datasets: - name: timestamp neurodata_type_inc: TimestampVectorData - doc: The time that each event occurred, in seconds, from the session start time. - - name: event_type - neurodata_type_inc: DynamicTableRegion - dims: - - num_events - shape: - - null - doc: The type of event that occurred. This is represented as a reference to a - row of the EventTypesTable. - quantity: '?' + doc: Column containing the time that each event occurred, in seconds, from the + session start time. - name: duration neurodata_type_inc: DurationVectorData - doc: Optional column containing the duration of each event, in seconds. + doc: Optional column containing the duration of each event, in seconds. A value + of NaN can be used for events without a duration or with a duration that is + not yet specified. quantity: '?' - - name: value - neurodata_type_inc: VectorData - doc: Optional column containing a value/parameter associated with each event. - For example, if you have three levels of reward (e.g., 1 drop, 2 drops, 3 drops), - instead of encoding each level of reward as its own event type (e.g., 'reward_value_1', - 'reward_value_2', 'reward_value_3', you could encode 'reward' as the event type, - and the value for each event time could be 1, 2, or 3. - quantity: '?' 
-- neurodata_type_def: TtlTypesTable - neurodata_type_inc: EventTypesTable - default_name: TtlTypesTable - doc: A column-based table to store information about each TTL type, such as name - and pulse value, one TTL type per row. - datasets: - - name: pulse_value - neurodata_type_inc: VectorData - dtype: uint8 - doc: TTL pulse value for each event type. -- neurodata_type_def: TtlsTable - neurodata_type_inc: EventsTable - default_name: TtlsTable - doc: Data type to hold timestamps of TTL pulses. - datasets: - - name: ttl_type - neurodata_type_inc: DynamicTableRegion - dims: - - num_events - shape: - - null - doc: The type of TTL that occurred. This is represented as a reference to a row - of the TtlTypesTable. -- neurodata_type_def: Task - neurodata_type_inc: LabMetaData - name: task - doc: A group to store task-related general metadata. TODO When merged with core, - this will no longer inherit from LabMetaData but from NWBContainer and be placed - optionally in /general. groups: - - name: event_types - neurodata_type_inc: EventTypesTable - doc: Table to store information about each task event type. - quantity: '?' - - name: ttl_types - neurodata_type_inc: TtlTypesTable - doc: Table to store information about each task TTL type. - quantity: '?' + - neurodata_type_inc: MeaningsTable + doc: Lookup tables for the meanings of the values in any CategoricalVectorData + columns. The name of the table should be the name of the corresponding CategoricalVectorData + column followed by "_meanings". + quantity: '*' +- neurodata_type_def: NdxEventsNWBFile + neurodata_type_inc: NWBFile + doc: An extension to the NWBFile to store event data. After integration of ndx-events + with the core schema, the NWBFile schema should be updated to this type. + groups: + - name: events + doc: Events that occurred during the session. + groups: + - neurodata_type_inc: EventsTable + doc: Events that occurred during the session. 
+ quantity: '*' diff --git a/spec/ndx-events.namespace.yaml b/spec/ndx-events.namespace.yaml index ee73a13..43532c4 100644 --- a/spec/ndx-events.namespace.yaml +++ b/spec/ndx-events.namespace.yaml @@ -8,4 +8,4 @@ namespaces: schema: - namespace: core - source: ndx-events.extensions.yaml - version: 0.3.0 + version: 0.4.0 diff --git a/src/pynwb/ndx_events/__init__.py b/src/pynwb/ndx_events/__init__.py index 7f462bc..873d27a 100644 --- a/src/pynwb/ndx_events/__init__.py +++ b/src/pynwb/ndx_events/__init__.py @@ -1,5 +1,5 @@ import os -from pynwb import load_namespaces, get_class +from pynwb import load_namespaces try: from importlib.resources import files @@ -19,13 +19,18 @@ load_namespaces(str(__spec_path)) # Define the new classes -Task = get_class("Task", "ndx-events") -TimestampVectorData = get_class("TimestampVectorData", "ndx-events") -DurationVectorData = get_class("DurationVectorData", "ndx-events") -EventTypesTable = get_class("EventTypesTable", "ndx-events") -EventsTable = get_class("EventsTable", "ndx-events") -TtlTypesTable = get_class("TtlTypesTable", "ndx-events") -TtlsTable = get_class("TtlsTable", "ndx-events") +from .events import ( + TimestampVectorData, + DurationVectorData, + CategoricalVectorData, + MeaningsTable, + EventsTable, + NdxEventsNWBFile, +) + + +from .ndx_events_nwb_file_io import NdxEventsNWBFileMap + # Remove these functions from the package -del load_namespaces, get_class +del load_namespaces diff --git a/src/pynwb/ndx_events/events.py b/src/pynwb/ndx_events/events.py new file mode 100644 index 0000000..6d9e081 --- /dev/null +++ b/src/pynwb/ndx_events/events.py @@ -0,0 +1,55 @@ +from pynwb import get_class, register_class, NWBFile +from hdmf.utils import docval, get_docval +import pandas as pd + + +TimestampVectorData = get_class("TimestampVectorData", "ndx-events") +DurationVectorData = get_class("DurationVectorData", "ndx-events") +CategoricalVectorData = get_class("CategoricalVectorData", "ndx-events") +MeaningsTable = get_class("MeaningsTable", "ndx-events") +EventsTable = get_class("EventsTable", "ndx-events") + + +# Replace the __getitem__ method with a custom one from DynamicTable instead of the one from MultiContainerInterface +# NOTE: When the NWBEP001 is merged into the core NWB schema and software, this class will be explicitly defined +# in PyNWB and will use the following __getitem__ method. +def __new_getitem__(self, key): + """Get the table row, column, or selection of cells with the given name.""" + ret = self.get(key) + if ret is None: + raise KeyError(key) + return ret + + +EventsTable.__getitem__ = __new_getitem__ +del __new_getitem__ + + +# NOTE: When the NWBEP001 is merged into the core NWB schema and software, this class will be merged +# with the core NWBFile class. 
+@register_class("NdxEventsNWBFile", "ndx-events") +class NdxEventsNWBFile(NWBFile): + __clsconf__ = [ + { + "attr": "events", + "add": "add_events_table", + "type": EventsTable, + "create": "create_events_table", + "get": "get_events_table", + }, + ] + + @docval( + *get_docval(NWBFile.__init__), + {"name": "events", "type": (list, tuple), "doc": "Any EventsTable tables storing events", "default": None}, + ) + def __init__(self, **kwargs): + events = kwargs.pop("events", None) + super().__init__(**kwargs) + self.events = events + + def merge_events_tables(self, tables: list[EventsTable]): + return pd.concat([table.to_dataframe().set_index("timestamp") for table in tables], sort=True) + + def get_all_events(self): + return self.merge_events_tables(list(self.events.values())) diff --git a/src/pynwb/ndx_events/ndx_events_nwb_file_io.py b/src/pynwb/ndx_events/ndx_events_nwb_file_io.py new file mode 100644 index 0000000..b293fa3 --- /dev/null +++ b/src/pynwb/ndx_events/ndx_events_nwb_file_io.py @@ -0,0 +1,17 @@ +from pynwb import register_map +from pynwb.io.file import NWBFileMap +from .events import NdxEventsNWBFile + + +# NOTE: When the NWBEP001 is merged into the core NWB schema and software, this class will be merged +# with the core NWBFileMap class. +@register_map(NdxEventsNWBFile) +class NdxEventsNWBFileMap(NWBFileMap): + + def __init__(self, spec): + super().__init__(spec) + + # Map the "events" attribute on the NdxEventsNWBFile class to the EventsTable class + events_spec = self.spec.get_group("events") + self.unmap(events_spec) + self.map_spec("events", events_spec.get_neurodata_type("EventsTable")) diff --git a/src/pynwb/tests/test_events.py b/src/pynwb/tests/test_events.py index cc400dd..7ec5dc1 100644 --- a/src/pynwb/tests/test_events.py +++ b/src/pynwb/tests/test_events.py @@ -1,17 +1,16 @@ +from datetime import datetime from hdmf.common import DynamicTable -import numpy as np from pynwb import NWBHDF5IO from pynwb.testing import TestCase, remove_test_file from pynwb.testing.mock.file import mock_NWBFile from ndx_events import ( EventsTable, - EventTypesTable, - TtlsTable, - TtlTypesTable, - Task, + CategoricalVectorData, + MeaningsTable, DurationVectorData, TimestampVectorData, + NdxEventsNWBFile, ) @@ -123,21 +122,25 @@ def test_roundtrip(self): assert read_col[0] == 0.1 -class TestTask(TestCase): +class TestMeaningsTable(TestCase): def test_init(self): - task = Task() - assert task.name == "task" + meanings_table = MeaningsTable( + name="x_meanings", description="Meanings for values in a CategoricalVectorData object." + ) + assert meanings_table.name == "x_meanings" + assert meanings_table.description == "Meanings for values in a CategoricalVectorData object." - def test_add_to_nwbfile(self): - nwbfile = mock_NWBFile() - task = Task() - nwbfile.add_lab_meta_data(task) - assert nwbfile.get_lab_meta_data("task") is task - assert nwbfile.lab_meta_data["task"] is task + def test_add_row(self): + meanings_table = MeaningsTable( + name="x_meanings", description="Meanings for values in a CategoricalVectorData object." 
+ ) + meanings_table.add_row(value="cue on", meaning="Times when the cue was on screen.") + assert meanings_table["value"].data == ["cue on"] + assert meanings_table["meaning"].data == ["Times when the cue was on screen."] -class TestTaskSimpleRoundtrip(TestCase): - """Simple roundtrip test for Task.""" +class TestMeaningsTableSimpleRoundtrip(TestCase): + """Simple roundtrip test for MeaningsTable.""" def setUp(self): self.path = "test.nwb" @@ -147,176 +150,130 @@ def tearDown(self): def test_roundtrip(self): """ - Create a Task, write it to file, read the file, and test that the read object matches the original. + Create a MeaningsTable, write it to file, read the file, and test that the read object matches the + original. """ - task = Task() + meanings_table = MeaningsTable(name="x_meanings", description="Test meanings table.") + meanings_table.add_row(value="cue on", meaning="Times when the cue was on screen.") + meanings_table.add_row(value="cue off", meaning="Times when the cue was off screen.") + + # place the meanings table in the acquisition group for testing purposes nwbfile = mock_NWBFile() - nwbfile.add_lab_meta_data(task) + nwbfile.add_acquisition(meanings_table) with NWBHDF5IO(self.path, mode="w") as io: io.write(nwbfile) with NWBHDF5IO(self.path, mode="r", load_namespaces=True) as io: read_nwbfile = io.read() - assert isinstance(read_nwbfile.get_lab_meta_data("task"), Task) - assert read_nwbfile.get_lab_meta_data("task").name == "task" - assert read_nwbfile.lab_meta_data["task"].name == "task" + read_meanings_table = read_nwbfile.acquisition["x_meanings"] + assert isinstance(read_meanings_table, MeaningsTable) + assert read_meanings_table.name == "x_meanings" + assert read_meanings_table.description == "Test meanings table." + assert all(read_meanings_table["value"].data[:] == ["cue on", "cue off"]) + assert all( + read_meanings_table["meaning"].data[:] + == ["Times when the cue was on screen.", "Times when the cue was off screen."] + ) -class TestEventTypesTable(TestCase): +class TestCategoricalVectorData(TestCase): def test_init(self): - event_types_table = EventTypesTable(description="Metadata about event types") - assert event_types_table.name == "EventTypesTable" - assert event_types_table.description == "Metadata about event types" - - def test_init_name(self): - event_types_table = EventTypesTable(name="event_types", description="Metadata about event types") - assert event_types_table.name == "event_types" - assert event_types_table.description == "Metadata about event types" - - def test_add_row(self): - event_types_table = EventTypesTable(description="Metadata about event types") - event_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", + meanings_table = MeaningsTable( + name="categorical_vector_data_meanings", + description="Meanings for values in a CategoricalVectorData object.", ) - event_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", + categorical_vector_data = CategoricalVectorData( + name="categorical_vector_data", description="description", data=["a", "b"], meanings=meanings_table ) - assert event_types_table["event_name"].data == ["cue on", "stimulus on"] - assert event_types_table["event_type_description"].data == [ - "Times when the cue was on screen.", - "Times when the stimulus was on screen.", - ] - - -class TestEventTypesTableSimpleRoundtrip(TestCase): - """Simple roundtrip test for EventTypesTable.""" - - def setUp(self): - 
self.path = "test.nwb" + assert categorical_vector_data.name == "categorical_vector_data" + assert categorical_vector_data.description == "description" + assert categorical_vector_data.data == ["a", "b"] + assert categorical_vector_data.meanings is meanings_table - def tearDown(self): - remove_test_file(self.path) - - def test_roundtrip(self): - """ - Create an EventTypesTable, write it to file, read the file, and test that the read table matches the original. - """ - # NOTE that when adding an EventTypesTable to a Task, the EventTypesTable - # must be named "event_types" according to the spec - event_types_table = EventTypesTable(name="event_types", description="Metadata about event types") - event_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", + def test_init_filter_values(self): + meanings_table = MeaningsTable( + name="categorical_vector_data_meanings", + description="Meanings for values in a CategoricalVectorData object.", ) - event_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", + categorical_vector_data = CategoricalVectorData( + name="categorical_vector_data", + description="description", + data=["a", "b", "undefined"], + meanings=meanings_table, + filter_values=["undefined"], ) - task = Task() - task.event_types = event_types_table - nwbfile = mock_NWBFile() - nwbfile.add_lab_meta_data(task) + assert categorical_vector_data.filter_values == ["undefined"] - with NWBHDF5IO(self.path, mode="w") as io: - io.write(nwbfile) - with NWBHDF5IO(self.path, mode="r", load_namespaces=True) as io: - read_nwbfile = io.read() - read_event_types_table = read_nwbfile.get_lab_meta_data("task").event_types - assert isinstance(read_event_types_table, EventTypesTable) - assert read_event_types_table.name == "event_types" - assert read_event_types_table.description == "Metadata about event types" - assert all(read_event_types_table["event_name"].data[:] == ["cue on", "stimulus on"]) - assert all( - read_event_types_table["event_type_description"].data[:] - == [ - "Times when the cue was on screen.", - "Times when the stimulus was on screen.", - ] - ) +# NOTE: A roundtrip test for CategoricalVectorData is bundled with the test for EventsTable +# because the CategoricalVectorData object is used in the EventsTable class. +# The MeaningsTable object should be placed in the EventsTable object. 
class TestEventsTable(TestCase): def test_init(self): - events_table = EventsTable(description="Metadata about events") - assert events_table.name == "EventsTable" + events_table = EventsTable(name="stimulus_events", description="Metadata about events") + assert events_table.name == "stimulus_events" assert events_table.description == "Metadata about events" - def test_init_dtr(self): - event_types_table = EventTypesTable(description="Metadata about event types") - event_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - ) - event_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - ) - - events_table = EventsTable(description="Metadata about events", target_tables={"event_type": event_types_table}) - assert events_table["event_type"].table is event_types_table - def test_add_row(self): - event_types_table = EventTypesTable(description="Metadata about event types") - event_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - # hed_tags=["Sensory-event", "(Intended-effect, Cue)"], - ) - event_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - # hed_tags=["Sensory-event", "Experimental-stimulus", "Visual-presentation", "Image", "Face"], + cue_meanings_table = MeaningsTable( + name="cue_type_meanings", description="Meanings for values in a CategoricalVectorData object." + ) + stimulus_meanings_table = MeaningsTable( + name="stimulus_type_meanings", description="Meanings for values in a CategoricalVectorData object." + ) + columns = [ + CategoricalVectorData( + name="cue_type", description="The cue type.", meanings=cue_meanings_table, filter_values=["n/a"] + ), + CategoricalVectorData( + name="stimulus_type", + description="The stimulus type.", + meanings=stimulus_meanings_table, + filter_values=["n/a"], + ), + ] + events_table = EventsTable( + name="stimulus_events", description="Metadata about stimulus events", columns=columns ) - - events_table = EventsTable(description="Metadata about events", target_tables={"event_type": event_types_table}) - events_table.add_column(name="cue_type", description="The cue type.") - events_table.add_column(name="stimulus_type", description="The stimulus type.") events_table.add_row( timestamp=0.1, - cue_type="white circle", - stimulus_type="", - event_type=0, duration=0.1, - value="", - # hed_tags=["(White, Circle)"], + cue_type="white circle", + stimulus_type="n/a", ) events_table.add_row( timestamp=0.3, - cue_type="", - stimulus_type="animal", - event_type=1, duration=0.15, - value="giraffe", + cue_type="n/a", + stimulus_type="animal", ) events_table.add_row( timestamp=1.1, - cue_type="green square", - stimulus_type="", - event_type=0, duration=0.1, - value="", - # hed_tags=["(Green, Square)"], + cue_type="green square", + stimulus_type="n/a", ) events_table.add_row( timestamp=1.3, - cue_type="", - stimulus_type="landscape", - event_type=1, duration=0.15, - value="farm", + cue_type="n/a", + stimulus_type="landscape", ) + cue_meanings_table.add_row(value="white circle", meaning="Times when the cue was a white circle.") + cue_meanings_table.add_row(value="green square", meaning="Times when the cue was a green square.") + stimulus_meanings_table.add_row(value="animal", meaning="Times when the stimulus was an animal.") + stimulus_meanings_table.add_row(value="landscape", meaning="Times when the stimulus was a 
landscape.") + + # events_table.add_meanings(cue_meanings_table) + # events_table.add_meanings(stimulus_meanings_table) + assert events_table["timestamp"].data == [0.1, 0.3, 1.1, 1.3] - assert events_table["cue_type"].data == ["white circle", "", "green square", ""] - assert events_table["stimulus_type"].data == ["", "animal", "", "landscape"] assert events_table["duration"].data == [0.1, 0.15, 0.1, 0.15] - assert events_table["event_type"].data == [0, 1, 0, 1] - assert events_table["value"].data == ["", "giraffe", "", "farm"] - # assert events_table["hed_tags"][0] == ["(White, Circle)"] - # assert events_table["hed_tags"][2] == ["(Green, Square)"] + assert events_table["cue_type"].data == ["white circle", "n/a", "green square", "n/a"] + assert events_table["stimulus_type"].data == ["n/a", "animal", "n/a", "landscape"] class TestEventsTableSimpleRoundtrip(TestCase): @@ -334,264 +291,86 @@ def test_roundtrip(self): """ # NOTE that when adding an EventTypesTable to a Task, the EventTypesTable # must be named "event_types" according to the spec - event_types_table = EventTypesTable(name="event_types", description="Metadata about event types") - event_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - # hed_tags=["Sensory-event", "(Intended-effect, Cue)"], - ) - event_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - # hed_tags=["Sensory-event", "Experimental-stimulus", "Visual-presentation", "Image", "Face"], + cue_meanings_table = MeaningsTable( + name="cue_type_meanings", description="Meanings for values in a CategoricalVectorData object." + ) + stimulus_meanings_table = MeaningsTable( + name="stimulus_type_meanings", description="Meanings for values in a CategoricalVectorData object." 
+ ) + columns = [ + CategoricalVectorData( + name="cue_type", description="The cue type.", meanings=cue_meanings_table, filter_values=["n/a"] + ), + CategoricalVectorData( + name="stimulus_type", + description="The stimulus type.", + meanings=stimulus_meanings_table, + filter_values=["n/a"], + ), + ] + meanings_tables = [cue_meanings_table, stimulus_meanings_table] + events_table = EventsTable( + name="stimulus_events", + description="Metadata about stimulus events", + columns=columns, + meanings_tables=meanings_tables, ) - - events_table = EventsTable(description="Metadata about events", target_tables={"event_type": event_types_table}) - events_table.add_column(name="cue_type", description="The cue type.") - events_table.add_column(name="stimulus_type", description="The stimulus type.") events_table.add_row( timestamp=0.1, - cue_type="white circle", - stimulus_type="", - event_type=0, duration=0.1, - value="", - # hed_tags=["(White, Circle)"], + cue_type="white circle", + stimulus_type="n/a", ) events_table.add_row( timestamp=0.3, - cue_type="", - stimulus_type="animal", - event_type=1, duration=0.15, - value="giraffe", + cue_type="n/a", + stimulus_type="animal", ) events_table.add_row( timestamp=1.1, - cue_type="green square", - stimulus_type="", - event_type=0, duration=0.1, - value="", - # hed_tags=["(Green, Square)"], + cue_type="green square", + stimulus_type="n/a", ) events_table.add_row( timestamp=1.3, - cue_type="", - stimulus_type="landscape", - event_type=1, duration=0.15, - value="farm", + cue_type="n/a", + stimulus_type="landscape", ) + cue_meanings_table.add_row(value="white circle", meaning="Times when the cue was a white circle.") + cue_meanings_table.add_row(value="green square", meaning="Times when the cue was a green square.") + stimulus_meanings_table.add_row(value="animal", meaning="Times when the stimulus was an animal.") + stimulus_meanings_table.add_row(value="landscape", meaning="Times when the stimulus was a landscape.") - task = Task() - task.event_types = event_types_table - nwbfile = mock_NWBFile() - nwbfile.add_lab_meta_data(task) - nwbfile.add_acquisition(events_table) + nwbfile = NdxEventsNWBFile( + identifier="test", session_description="test", session_start_time=datetime.now().astimezone() + ) + nwbfile.add_events_table(events_table) with NWBHDF5IO(self.path, mode="w") as io: io.write(nwbfile) with NWBHDF5IO(self.path, mode="r", load_namespaces=True) as io: read_nwbfile = io.read() - read_event_types_table = read_nwbfile.get_lab_meta_data("task").event_types - read_events_table = read_nwbfile.acquisition["EventsTable"] + read_events_table = read_nwbfile.events["stimulus_events"] assert isinstance(read_events_table, EventsTable) - assert read_events_table.name == "EventsTable" - assert read_events_table.description == "Metadata about events" + assert read_events_table.description == "Metadata about stimulus events" assert all(read_events_table["timestamp"].data[:] == [0.1, 0.3, 1.1, 1.3]) - assert all(read_events_table["cue_type"].data[:] == ["white circle", "", "green square", ""]) - assert all(read_events_table["stimulus_type"].data[:] == ["", "animal", "", "landscape"]) assert all(read_events_table["duration"].data[:] == [0.1, 0.15, 0.1, 0.15]) - assert all(read_events_table["event_type"].data[:] == [0, 1, 0, 1]) - assert all(read_events_table["value"].data[:] == ["", "giraffe", "", "farm"]) - assert read_events_table["event_type"].table is read_event_types_table - - -class TestTtlTypesTable(TestCase): - def test_init(self): - ttl_types_table = 
TtlTypesTable(description="Metadata about TTL types") - assert ttl_types_table.name == "TtlTypesTable" - assert ttl_types_table.description == "Metadata about TTL types" - - def test_init_name(self): - ttl_types_table = TtlTypesTable(name="ttl_types", description="Metadata about TTL types") - assert ttl_types_table.name == "ttl_types" - assert ttl_types_table.description == "Metadata about TTL types" - def test_add_row(self): - ttl_types_table = TtlTypesTable(description="Metadata about TTL types") - ttl_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - pulse_value=np.uint(1), - ) - ttl_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - pulse_value=np.uint(2), - ) - assert ttl_types_table["event_name"].data == ["cue on", "stimulus on"] - assert ttl_types_table["event_type_description"].data == [ - "Times when the cue was on screen.", - "Times when the stimulus was on screen.", - ] - assert all(ttl_types_table["pulse_value"].data == np.uint([1, 2])) - - -class TestTtlTypesTableSimpleRoundtrip(TestCase): - """Simple roundtrip test for TtlTypesTable.""" - - def setUp(self): - self.path = "test.nwb" - - def tearDown(self): - remove_test_file(self.path) - - def test_roundtrip(self): - """ - Create an TtlTypesTable, write it to file, read the file, and test that the read table matches the original. - """ - # NOTE that when adding an TtlTypesTable to a Task, the TtlTypesTable - # must be named "ttl_types" according to the spec - ttl_types_table = TtlTypesTable(name="ttl_types", description="Metadata about TTL types") - ttl_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - pulse_value=np.uint(1), - ) - ttl_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - pulse_value=np.uint(2), - ) - task = Task() - task.ttl_types = ttl_types_table - nwbfile = mock_NWBFile() - nwbfile.add_lab_meta_data(task) - - with NWBHDF5IO(self.path, mode="w") as io: - io.write(nwbfile) - - with NWBHDF5IO(self.path, mode="r", load_namespaces=True) as io: - read_nwbfile = io.read() - read_ttl_types_table = read_nwbfile.get_lab_meta_data("task").ttl_types - assert isinstance(read_ttl_types_table, EventTypesTable) - assert read_ttl_types_table.name == "ttl_types" - assert read_ttl_types_table.description == "Metadata about TTL types" - assert all(read_ttl_types_table["event_name"].data[:] == ["cue on", "stimulus on"]) + read_cue_type = read_events_table["cue_type"] + assert isinstance(read_cue_type, CategoricalVectorData) + assert read_cue_type.description == "The cue type." + assert all(read_cue_type.data[:] == ["white circle", "n/a", "green square", "n/a"]) + assert isinstance(read_cue_type.meanings, MeaningsTable) + assert read_cue_type.meanings.name == "cue_type_meanings" + assert read_cue_type.meanings.description == "Meanings for values in a CategoricalVectorData object." 
+ assert all(read_cue_type.meanings["value"].data[:] == ["white circle", "green square"]) assert all( - read_ttl_types_table["event_type_description"].data[:] - == [ - "Times when the cue was on screen.", - "Times when the stimulus was on screen.", - ] + read_cue_type.meanings["meaning"].data[:] + == ["Times when the cue was a white circle.", "Times when the cue was a green square."] ) - assert all(read_ttl_types_table["pulse_value"].data[:] == np.uint([1, 2])) - - -class TestTtlsTable(TestCase): - def test_init(self): - ttls_table = TtlsTable(description="Metadata about TTLs") - assert ttls_table.name == "TtlsTable" - assert ttls_table.description == "Metadata about TTLs" - - def test_init_dtr(self): - ttl_types_table = TtlTypesTable(description="Metadata about TTL types") - ttl_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - pulse_value=np.uint(1), - ) - ttl_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - pulse_value=np.uint(2), - ) - - ttls_table = TtlsTable(description="Metadata about TTLs", target_tables={"ttl_type": ttl_types_table}) - assert ttls_table["ttl_type"].table is ttl_types_table - - def test_add_row(self): - ttl_types_table = TtlTypesTable(description="Metadata about TTL types") - ttl_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - pulse_value=np.uint(1), - ) - ttl_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - pulse_value=np.uint(2), - ) - - ttls_table = TtlsTable(description="Metadata about TTLs", target_tables={"ttl_type": ttl_types_table}) - ttls_table.add_row( - timestamp=0.1, - ttl_type=0, - ) - ttls_table.add_row( - timestamp=1.1, - ttl_type=0, - ) - assert ttls_table["timestamp"].data == [0.1, 1.1] - assert ttls_table["ttl_type"].data == [0, 0] - - -class TestTtlsTableSimpleRoundtrip(TestCase): - """Simple roundtrip test for TtlsTable.""" - def setUp(self): - self.path = "test.nwb" - - def tearDown(self): - remove_test_file(self.path) - - def test_roundtrip(self): - """ - Create a TtlsTable, write it to file, read the file, and test that the read table matches the original. 
- """ - # NOTE that when adding an TtlTypesTable to a Task, the TtlTypesTable - # must be named "ttl_types" according to the spec - ttl_types_table = TtlTypesTable(name="ttl_types", description="Metadata about TTL types") - ttl_types_table.add_row( - event_name="cue on", - event_type_description="Times when the cue was on screen.", - pulse_value=np.uint(1), - ) - ttl_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen.", - pulse_value=np.uint(2), - ) - - ttls_table = TtlsTable(description="Metadata about TTLs", target_tables={"ttl_type": ttl_types_table}) - ttls_table.add_row( - timestamp=0.1, - ttl_type=0, - ) - ttls_table.add_row( - timestamp=1.1, - ttl_type=0, - ) - - task = Task() - task.ttl_types = ttl_types_table - nwbfile = mock_NWBFile() - nwbfile.add_lab_meta_data(task) - nwbfile.add_acquisition(ttls_table) - - with NWBHDF5IO(self.path, mode="w") as io: - io.write(nwbfile) - - with NWBHDF5IO(self.path, mode="r", load_namespaces=True) as io: - read_nwbfile = io.read() - read_ttl_types_table = read_nwbfile.get_lab_meta_data("task").ttl_types - read_ttls_table = read_nwbfile.acquisition["TtlsTable"] - assert isinstance(read_ttls_table, TtlsTable) - assert read_ttls_table.name == "TtlsTable" - assert read_ttls_table.description == "Metadata about TTLs" - assert all(read_ttls_table["timestamp"].data[:] == [0.1, 1.1]) - assert all(read_ttls_table["ttl_type"].data[:] == [0, 0]) - assert read_ttls_table["ttl_type"].table is read_ttl_types_table + assert all(read_events_table["stimulus_type"].data[:] == ["n/a", "animal", "n/a", "landscape"]) diff --git a/src/pynwb/tests/test_example_usage.py b/src/pynwb/tests/test_example_usage.py index 1518a19..ed4252a 100644 --- a/src/pynwb/tests/test_example_usage.py +++ b/src/pynwb/tests/test_example_usage.py @@ -1,239 +1,13 @@ -def test_example_usage1(): - from datetime import datetime - from ndx_events import EventsTable, EventTypesTable, TtlsTable, TtlTypesTable, Task - import numpy as np - from pynwb import NWBFile, NWBHDF5IO +"""Evaluate examples of how to use the ndx-events extension.""" - nwbfile = NWBFile( - session_description="session description", - identifier="cool_experiment_001", - session_start_time=datetime.now().astimezone(), - ) +import subprocess +from pathlib import Path - # in this experiment, TTL pulses were sent by the stimulus computer - # to signal important time markers during the experiment/trial, - # when the stimulus was placed on the screen and removed from the screen, - # when the question appeared, and the responses of the subject. 
- # ref: https://www.nature.com/articles/s41597-020-0415-9, DANDI:000004 +def test_example_usage_write_ttls_events(): + """Call examples/write_ttls_events.py and check that it runs without errors.""" + subprocess.run(["python", "examples/write_ttls_events.py"], check=True) - # NOTE that when adding an TtlTypesTable to a Task, the TtlTypesTable - # must be named "ttl_types" according to the spec - ttl_types_table = TtlTypesTable(name="ttl_types", description="Metadata about TTL types") - ttl_types_table.add_row( - event_name="start experiment", - event_type_description="Start of experiment", - pulse_value=np.uint(55), - ) - ttl_types_table.add_row( - event_name="stimulus onset", - event_type_description="Stimulus onset", - pulse_value=np.uint(1), - ) - ttl_types_table.add_row( - event_name="stimulus offset", - event_type_description="Stimulus offset", - pulse_value=np.uint(2), - ) - ttl_types_table.add_row( - event_name="question onset", - event_type_description="Question screen onset", - pulse_value=np.uint(3), - ) - learning_response_description = ( - "During the learning phase, subjects are instructed to respond to the following " - "question: 'Is this an animal?' in each trial. Responses are encoded as 'Yes, this " - "is an animal' (20) and 'No, this is not an animal' (21)." - ) - ttl_types_table.add_row( - event_name="yes response during learning", - event_type_description=learning_response_description, - pulse_value=np.uint(20), - ) - ttl_types_table.add_row( - event_name="no response during learning", - event_type_description=learning_response_description, - pulse_value=np.uint(21), - ) - recognition_response_description = ( - "During the recognition phase, subjects are instructed to respond to the following " - "question: 'Have you seen this image before?' in each trial. Responses are encoded " - "as: 31 (new, confident), 32 (new, probably), 33 (new, guess), 34 (old, guess), 35 " - "(old, probably), 36 (old, confident)." 
- ) - ttl_types_table.add_row( - event_name="(new, confident) response during recognition", - event_type_description=recognition_response_description, - pulse_value=np.uint(31), - ) - ttl_types_table.add_row( - event_name="(new, probably) response during recognition", - event_type_description=recognition_response_description, - pulse_value=np.uint(32), - ) - ttl_types_table.add_row( - event_name="(new, guess) response during recognition", - event_type_description=recognition_response_description, - pulse_value=np.uint(33), - ) - ttl_types_table.add_row( - event_name="(old, guess) response during recognition", - event_type_description=recognition_response_description, - pulse_value=np.uint(34), - ) - ttl_types_table.add_row( - event_name="(old, probably) response during recognition", - event_type_description=recognition_response_description, - pulse_value=np.uint(35), - ) - ttl_types_table.add_row( - event_name="(old, confident) response during recognition", - event_type_description=recognition_response_description, - pulse_value=np.uint(36), - ) - ttl_types_table.add_row( - event_name="end trial", - event_type_description="End of trial", - pulse_value=np.uint(6), - ) - ttl_types_table.add_row( - event_name="end experiment", - event_type_description="End of experiment", - pulse_value=np.uint(66), - ) - - ttls_table = TtlsTable(description="Metadata about TTLs", target_tables={"ttl_type": ttl_types_table}) - ttls_table.add_row( - timestamp=6820.092244, - ttl_type=0, # NOT the pulse value, but a row index into the ttl_types_table - ) - ttls_table.add_row( - timestamp=6821.208244, - ttl_type=1, - ) - ttls_table.add_row( - timestamp=6822.210644, - ttl_type=2, - ) - ttls_table.add_row( - timestamp=6822.711364, - ttl_type=3, - ) - ttls_table.add_row( - timestamp=6825.934244, - ttl_type=6, - ) - ttls_table.timestamp.resolution = 1 / 50000.0 # specify the resolution of the timestamps (optional) - - # if TTLs are recorded, then the events table should hold any non-TTL events - # recorded by the acquisition system - # OR the events table can hold more processed information than the TTLs table - # e.g., converting stimulus onset and offset into a single stimulus event with metadata. 
- # this may be redundant with information in the trials table if the task is - # structured into trials - - # NOTE that when adding an EventTypesTable to a Task, the EventTypesTable - # must be named "event_types" according to the spec - event_types_table = EventTypesTable(name="event_types", description="Metadata about event types") - event_types_table.add_row( - event_name="stimulus on", - event_type_description="Times when the stimulus was on screen", - ) - - events_table = EventsTable(description="Metadata about events", target_tables={"event_type": event_types_table}) - events_table.add_column(name="category_name", description="Name of the category of the stimulus") - events_table.add_column( - name="stimulus_image_index", description="Frame index of the stimulus image in the StimulusPresentation object" - ) - events_table.add_row( - timestamp=6821.208244, - category_name="smallAnimal", - stimulus_image_index=0, - event_type=0, - duration=1.0024, # this comes from the stimulus onset and offset TTLs - ) - events_table.add_row( - timestamp=6821.208244, - category_name="phones", - stimulus_image_index=1, - event_type=0, - duration=0.99484, - ) - events_table.timestamp.resolution = 1 / 50000.0 # specify the resolution of the timestamps (optional) - events_table.duration.resolution = 1 / 50000.0 # specify the resolution of the durations (optional) - - task = Task() - task.event_types = event_types_table - task.ttl_types = ttl_types_table - nwbfile.add_lab_meta_data(task) - nwbfile.add_acquisition(events_table) - nwbfile.add_acquisition(ttls_table) - - # write nwb file - filename = "test.nwb" - with NWBHDF5IO(filename, "w") as io: - io.write(nwbfile) - - # read nwb file and check its contents - with NWBHDF5IO(filename, "r", load_namespaces=True) as io: - read_nwbfile = io.read() - print(read_nwbfile) - # access the events table, ttls table, event types table, and ttl types table and print them - print(read_nwbfile.get_lab_meta_data("task").event_types.to_dataframe()) - print(read_nwbfile.acquisition["EventsTable"].to_dataframe()) - print(read_nwbfile.get_lab_meta_data("task").ttl_types.to_dataframe()) - print(read_nwbfile.acquisition["TtlsTable"].to_dataframe()) - - -def test_example_usage2(): - """Example storing lick times""" - from datetime import datetime - from ndx_events import EventsTable, EventTypesTable, Task - import numpy as np - from pynwb import NWBFile, NWBHDF5IO - - nwbfile = NWBFile( - session_description="session description", - identifier="cool_experiment_001", - session_start_time=datetime.now().astimezone(), - ) - - # NOTE that when adding an EventTypesTable to a Task, the EventTypesTable - # must be named "event_types" according to the spec - event_types_table = EventTypesTable(name="event_types", description="Metadata about event types") - event_types_table.add_row( - event_name="lick", - event_type_description="Times when the subject licked the port", - ) - - # create a random sorted array of 1000 lick timestamps (dtype=float) from 0 to 3600 seconds - lick_times = sorted(np.random.uniform(0, 3600, 1000)) - - events_table = EventsTable(description="Metadata about events", target_tables={"event_type": event_types_table}) - for t in lick_times: - # event_type=0 corresponds to the first row in the event_types_table - events_table.add_row(timestamp=t, event_type=0) - events_table.timestamp.resolution = 1 / 30000.0 # licks were detected at 30 kHz - - task = Task() - task.event_types = event_types_table - nwbfile.add_lab_meta_data(task) - 
nwbfile.add_acquisition(events_table) - - # write nwb file - filename = "test.nwb" - with NWBHDF5IO(filename, "w") as io: - io.write(nwbfile) - - # read nwb file and check its contents - with NWBHDF5IO(filename, "r", load_namespaces=True) as io: - read_nwbfile = io.read() - print(read_nwbfile) - # access the events table and event types table and print them - print(read_nwbfile.get_lab_meta_data("task").event_types.to_dataframe()) - print(read_nwbfile.acquisition["EventsTable"].to_dataframe()) - - -if __name__ == "__main__": - test_example_usage1() - test_example_usage2() + # Remove the generated test_events.nwb if it exists + if Path("test_events.nwb").exists(): + Path("test_events.nwb").unlink() diff --git a/src/spec/create_extension_spec.py b/src/spec/create_extension_spec.py index 483920a..6fc0a79 100644 --- a/src/spec/create_extension_spec.py +++ b/src/spec/create_extension_spec.py @@ -1,13 +1,13 @@ # -*- coding: utf-8 -*- import os.path -from pynwb.spec import NWBNamespaceBuilder, export_spec, NWBGroupSpec, NWBAttributeSpec, NWBDatasetSpec +from pynwb.spec import NWBNamespaceBuilder, export_spec, NWBGroupSpec, NWBAttributeSpec, NWBDatasetSpec, NWBRefSpec def main(): ns_builder = NWBNamespaceBuilder( doc="""NWB extension for storing timestamped event and TTL pulse data""", name="""ndx-events""", - version="""0.3.0""", + version="""0.4.0""", author=["Ryan Ly"], contact=["rly@lbl.gov"], ) @@ -28,9 +28,7 @@ def main(): doc="The unit of measurement for the timestamps, fixed to 'seconds'.", value="seconds", ), - # NOTE: this requires all timestamps to have the same resolution which may not be true - # if they come from different acquisition systems or processing pipelines... - # maybe this should be a column of the event type table instead? + # NOTE: alternatively, this could be an attribute of EventsTable instead NWBAttributeSpec( name="resolution", dtype="float", @@ -57,7 +55,7 @@ def main(): doc="The unit of measurement for the durations, fixed to 'seconds'.", value="seconds", ), - # NOTE: this is usually the same as the timestamp resolution + # NOTE: this is probably always the same as the timestamp resolution NWBAttributeSpec( name="resolution", dtype="float", @@ -70,23 +68,63 @@ def main(): ], ) - event_types_table = NWBGroupSpec( - neurodata_type_def="EventTypesTable", + meanings_table = NWBGroupSpec( + neurodata_type_def="MeaningsTable", neurodata_type_inc="DynamicTable", - doc="A column-based table to store information about each event type, such as name, one event type per row.", - default_name="EventTypesTable", + doc=( + "A table to store information about the meanings of categorical data. Intended to be used as a " + "lookup table for the meanings of values in a CategoricalVectorData object. All possible values of " + "the parent CategoricalVectorData object should be present in the 'value' column of this table, even " + "if the value is not observed in the data. Additional columns may be added to store additional metadata " + "about each value." 
+ ), datasets=[ NWBDatasetSpec( - name="event_name", + name="value", neurodata_type_inc="VectorData", - dtype="text", - doc="Name of each event type.", + doc="The value of the parent CategoricalVectorData object.", ), NWBDatasetSpec( - name="event_type_description", + name="meaning", neurodata_type_inc="VectorData", dtype="text", - doc="Description of each event type.", + doc="The meaning of the value in the parent CategoricalVectorData object.", + ), + ], + ) + + categorical_vector_data = NWBDatasetSpec( + neurodata_type_def="CategoricalVectorData", + neurodata_type_inc="VectorData", + doc="A 1-dimensional VectorData that stores categorical data of any type. This is an experimental type.", + dims=["num_events"], + shape=[None], + attributes=[ + NWBAttributeSpec( + # object reference to the meanings table because datasets cannot contain groups + name="meanings", + dtype=NWBRefSpec( + target_type="MeaningsTable", + reftype="object", + ), + doc=( + "The MeaningsTable object that provides the meanings of the values in this " + "CategoricalVectorData object." + ), + ), + NWBAttributeSpec( + name="filter_values", + doc=( + "Optional dataset containing possible values in the parent data that represent missing or " + "invalid values that should be filtered out during analysis. Currently, only string values are " + "allowed. " + 'For example, the filter values may contain the values "undefined" or "None" ' + "to signal that those values in the data are missing or invalid." + ), + dtype="text", # NOTE: a dtype is required for attributes! + dims=["num_events"], + shape=[None], + required=False, ), ], ) @@ -96,106 +134,72 @@ def main(): neurodata_type_inc="DynamicTable", doc=( "A column-based table to store information about events (event instances), one event per row. " - "Each event must have an event_type, which is a reference to a row in the EventTypesTable. " "Additional columns may be added to store metadata about each event, such as the duration " - "of the event, or a text value of the event." + "of the event." ), - # NOTE: custom columns should apply to every event in the table which may not be the case - default_name="EventsTable", datasets=[ NWBDatasetSpec( name="timestamp", neurodata_type_inc="TimestampVectorData", - doc="The time that each event occurred, in seconds, from the session start time.", - ), - NWBDatasetSpec( - name="event_type", - neurodata_type_inc="DynamicTableRegion", - dims=["num_events"], - shape=[None], - doc=( - "The type of event that occurred. This is represented as a reference " - "to a row of the EventTypesTable." - ), - quantity="?", + doc="Column containing the time that each event occurred, in seconds, from the session start time.", ), NWBDatasetSpec( name="duration", neurodata_type_inc="DurationVectorData", - doc="Optional column containing the duration of each event, in seconds.", - quantity="?", - ), - NWBDatasetSpec( - name="value", - neurodata_type_inc="VectorData", doc=( - "Optional column containing a value/parameter associated with each event. " - "For example, if you have three levels of reward (e.g., 1 drop, 2 drops, " - "3 drops), instead of encoding each level of reward as its own event " - "type (e.g., 'reward_value_1', 'reward_value_2', 'reward_value_3', " - "you could encode 'reward' as the event type, and the value for each " - "event time could be 1, 2, or 3." + "Optional column containing the duration of each event, in seconds. " + "A value of NaN can be used for events without a duration or with a duration that is not yet " + "specified." 
), quantity="?", ), ], - ) - - ttl_types_table = NWBGroupSpec( - neurodata_type_def="TtlTypesTable", - neurodata_type_inc="EventTypesTable", - doc=( - "A column-based table to store information about each TTL type, such as name and pulse value, " - "one TTL type per row." - ), - default_name="TtlTypesTable", - datasets=[ - NWBDatasetSpec( - name="pulse_value", - neurodata_type_inc="VectorData", - dtype="uint8", - doc="TTL pulse value for each event type.", + groups=[ + # NOTE: the EventsTable will automatically become a MultiContainerInterface, so adjust the auto-generated + # class in the extension + NWBGroupSpec( + neurodata_type_inc="MeaningsTable", + doc=( + "Lookup tables for the meanings of the values in any CategoricalVectorData columns. " + "The name of the table should be the name of the corresponding CategoricalVectorData column " + 'followed by "_meanings".' + ), + quantity="*", ), ], - ) - - ttls_table = NWBGroupSpec( - neurodata_type_def="TtlsTable", - neurodata_type_inc="EventsTable", - doc="Data type to hold timestamps of TTL pulses.", - default_name="TtlsTable", - datasets=[ - NWBDatasetSpec( - name="ttl_type", - neurodata_type_inc="DynamicTableRegion", - dims=["num_events"], - shape=[None], - doc="The type of TTL that occurred. This is represented as a reference to a row of the TtlTypesTable.", + attributes=[ + NWBAttributeSpec( + name="description", + dtype="text", + doc=( + "A description of the events stored in the table, including information about " + "how the event times were computed, especially if the times are the result of processing or " + "filtering raw data. For example, if the experimenter is encoding different types of events using " + "a strobed or N-bit encoding, then the description should describe which channels were used and " + "how the event time is computed, e.g., as the rise time of the first bit." + ), ), ], ) - task = NWBGroupSpec( - neurodata_type_def="Task", - neurodata_type_inc="LabMetaData", + ndx_events_nwb_file = NWBGroupSpec( + neurodata_type_def="NdxEventsNWBFile", + neurodata_type_inc="NWBFile", doc=( - "A group to store task-related general metadata. TODO When merged with core, " - "this will no longer inherit from LabMetaData but from NWBContainer and be placed " - "optionally in /general." + "An extension to the NWBFile to store event data. After integration of ndx-events with the core schema, " + "the NWBFile schema should be updated to this type." ), - name="task", groups=[ NWBGroupSpec( - name="event_types", - neurodata_type_inc="EventTypesTable", - doc="Table to store information about each task event type.", - quantity="?", - ), - NWBGroupSpec( - name="ttl_types", - neurodata_type_inc="TtlTypesTable", - doc="Table to store information about each task TTL type.", - quantity="?", + name="events", + doc="Events that occurred during the session.", + groups=[ + NWBGroupSpec( + neurodata_type_inc="EventsTable", + doc="Events that occurred during the session.", + quantity="*", + ), + ], ), ], ) @@ -203,11 +207,10 @@ def main(): new_data_types = [ timestamp_vector_data, duration_vector_data, - event_types_table, + meanings_table, + categorical_vector_data, events_table, - ttl_types_table, - ttls_table, - task, + ndx_events_nwb_file, ] # export the spec to yaml files in the spec folder
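For reference, below is a minimal sketch of how the new types defined in this spec (MeaningsTable, CategoricalVectorData, TimestampVectorData, DurationVectorData, EventsTable) might be used together from PyNWB once the extension classes are generated. The class and column names come from the spec above; the exact constructor keywords and the add_meanings() call are assumptions about the generated API and are not part of this diff.

# Sketch only: constructor keywords and add_meanings() are assumed, not confirmed by this diff.
from ndx_events import (
    CategoricalVectorData,
    DurationVectorData,
    EventsTable,
    MeaningsTable,
    TimestampVectorData,
)

# Lookup table mapping each categorical value to a longer meaning,
# per the MeaningsTable spec (required "value" and "meaning" columns).
stimulus_type_meanings = MeaningsTable(
    name="stimulus_type_meanings",
    description="Meanings of the values in the stimulus_type column",
)
stimulus_type_meanings.add_row(value="animal", meaning="An image of an animal was presented.")
stimulus_type_meanings.add_row(value="landscape", meaning="An image of a landscape was presented.")

# Columns of the EventsTable. CategoricalVectorData holds an object reference
# to its MeaningsTable via the "meanings" attribute defined in the spec.
timestamp = TimestampVectorData(
    name="timestamp",
    description="Stimulus onset time, in seconds from session start",
    data=[6821.208244, 6825.934244],
)
duration = DurationVectorData(
    name="duration",
    description="Duration of each stimulus presentation, in seconds",
    data=[1.0024, 0.99484],
)
stimulus_type = CategoricalVectorData(
    name="stimulus_type",
    description="Category of the presented stimulus",
    data=["animal", "landscape"],
    meanings=stimulus_type_meanings,
)

events_table = EventsTable(
    name="stimulus_events",
    description="Stimulus presentation events detected from the stimulus computer log",
    columns=[timestamp, duration, stimulus_type],
)
# Per the spec, the MeaningsTable is stored inside the parent EventsTable
# (named "<column>_meanings"); the name of the generated add method is an assumption.
events_table.add_meanings(stimulus_type_meanings)

print(events_table.to_dataframe())

The EventsTable would then be placed under the "events" group of the NWB file, as described by the NdxEventsNWBFile type above; the exact method for doing so depends on the generated container class and is not shown here.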