Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dimension links #469

Merged
merged 29 commits into from
Jul 20, 2020
Merged

Dimension links #469

merged 29 commits into from
Jul 20, 2020

Conversation

achilleas-k
Copy link
Member

Implements DimensionLink as discussed in G-Node/nix#824 (closes #466).

Some notes and points for discussion:

  • The name of the link in the backend (inside the HDF5 file) is simply link. Therefore, a dimension with a linked object has the following path:
    /data/<block name>/data_arrays/<array name>/dimensions/<dim number>/link/<linked object id>
  • The dimension link object holds an attribute that specifies the type of the object it's linking. The value of the attribute is either "DataArray" or "DataFrame".
  • The dimension module has no access to the DataFrame and DataArray classes so it can't return or initialise an object of those types. We could do it for the DataFrame, but importing DataArray would result in an import cycle. DataArrays need to be able to create Dimensions, so Dimensions can't create (initialise) DataArrays.
  • Do we want to allow setting values or attributes through the dimension link to the underlying data object? For instance, should the ticks or label of a RangeDimension that links to a DataArray be modifiable through the RangeDimension object?
    • I haven't added support for this, but depending on the answer, I'll have to either add support for it or raise an exception.
  • We don't allow dimensions that link to a DataFrame to have a None index, for linking to the entire DataFrame. We initially supported this in the DataFrameDimension implementation, but now this ability or an analogous one isn't supported. I think we agreed on this in our discussion, but I wanted to make sure I'm remembering correctly.
  • Users are allowed to link a DataArray to a dimension with an index that is out of bounds for the array, as long as the dimensionality is correct. My thinking here was that the underlying (linked) DataArray can be resized to fit or break the indexing, so the error will be caught on data retrieval or validation. The dimensionality of a DA can't change, so we check for this during linking.

- Replace alias range dimension test with self-link test.
- Test RangeDimension -> DataArray link
Deprecation warning

  FutureWarning: Using a non-tuple sequence for multidimensional
  indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`.
  In the future this will be interpreted as an array index,
  `arr[np.array(seq)]`, which will result either in an error or a
  different result.
For both Range and Set dimensions
Using group redirect method is unnecessary for SampledDimensions since
they don't support linking
Return unit directly for DataArray links.
Return unit of indexed column for DataFrame links.
Return label directly for DataArray links.
Return name of indexed column for DataFrame links.
- Removed DataFrameDimension class.
- Removed all mentions of alias range dimension from DataArray.
- Removed associated tests and test sections.
Should not be negative either.
Should not have negative values except one -1.
@lgtm-com
Copy link

lgtm-com bot commented Jul 15, 2020

This pull request introduces 1 alert when merging f1aeb40 into 2e6491f - view on LGTM.com

new alerts:

  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Jul 16, 2020

This pull request introduces 1 alert when merging 7677c6e into 2e6491f - view on LGTM.com

new alerts:

  • 1 for Unused import

jgrewe
jgrewe previously approved these changes Jul 19, 2020
Copy link
Member

@jgrewe jgrewe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks very nice to me!
one point:

  • we do have the same need for such linking on the feature and I wonder, whether some of the DimensionLink class functionality should go into a common superclass for DimensionLinks and Feature Links, the first needs the index, the latter the link_type (tagged, untagged, indexed)

did not think about it but do we need the invalidUnit import in DataArray?

@achilleas-k
Copy link
Member Author

this looks very nice to me!
one point:

  • we do have the same need for such linking on the feature and I wonder, whether some of the DimensionLink class functionality should go into a common superclass for DimensionLinks and Feature Links, the first needs the index, the latter the link_type (tagged, untagged, indexed)

Good idea. Some low level functionality could go there, like all the underlying attribute access methods. It might also fix the import cycle issue and let us instantiate the data object through the dimension link, but I'll have to think about it more and then try it.

I'd like to do this in a followup PR though.

did not think about it but do we need the invalidUnit import in DataArray?

We used to need it for the AliasRangeDimension, since having a non SI unit in a dimension wasn't supported at some point. I think we allow it now, just not scaling.

Anyway, cleaned out all Unused vars and imports now.

@lgtm-com
Copy link

lgtm-com bot commented Jul 19, 2020

This pull request introduces 1 alert when merging 8d84599 into 2e6491f - view on LGTM.com

new alerts:

  • 1 for Suspicious unused loop iteration variable

Can now write through to the linked data object when setting label and
unit on RangeDimension:
- RangeDimension ticks cannot be changed.
- RangeDimension label linked to DataArray changes the DataArray label.
- RangeDimension label linked to DataFrame raises error: The column name
  of a DataFrame can't be changed since it's the name of a field of
  a compound data type, which we can't change.
- RangeDimension unit linked to DataArray changes the DataArray unit.
- RangeDimension unit linked to DataFrame changes the DataFrame column
  unit.
- SetDimension labels cannot be changed.
pylint warning codes
  W0611: Unused import
  W0612: Unused variable
@achilleas-k achilleas-k force-pushed the feat/dimension-links branch 2 times, most recently from d538ed8 to e1558d6 Compare July 20, 2020 16:18
@jgrewe jgrewe merged commit 5e9ab0b into G-Node:master Jul 20, 2020
@achilleas-k achilleas-k deleted the feat/dimension-links branch July 20, 2020 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dimensions with linking for ticks and labels
2 participants