(fix): structured arrays for v2 #2681
Hmm, this will need to handle the case where the array is not given the |
(P.S I used |
# In the case of zarr v2, the simplest i.e., '|VXX' dtype is represented as a string
dtype_descr = self.dtype.descr
if self.dtype.kind == "V" and dtype_descr[0][0] != "" and len(dtype_descr) != 0:
    dtype_json = tuple(self.dtype.descr)
else:
    dtype_json = self.dtype.str
This is my attempt to match the old behavior. I didn't look back at the old code yet, but if someone knows this to be wrong, would be great to know.
Looks right to me
I am keen to see this go in
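For readers less familiar with structured dtypes, here is a minimal sketch of what the branch in the diff above distinguishes. The function name `dtype_to_v2_json` is hypothetical and only mirrors the condition shown in the diff; it is not the method in the PR.

```python
import numpy as np


def dtype_to_v2_json(dtype):
    """Hypothetical helper mirroring the condition from the diff."""
    descr = dtype.descr
    # A plain void dtype such as 'V10' has a single unnamed field in
    # .descr, e.g. [('', '|V10')]; a structured dtype has named fields.
    if dtype.kind == "V" and descr[0][0] != "":
        return tuple(descr)
    return dtype.str


plain_void = np.dtype("V10")
structured = np.dtype([("a", "<i4"), ("b", "<f8")])

plain_json = dtype_to_v2_json(plain_void)
structured_json = dtype_to_v2_json(structured)
int_json = dtype_to_v2_json(np.dtype("<i4"))

print(plain_json)
print(structured_json)
print(int_json)
```

Both `plain_void` and `structured` report `kind == "V"`, so the field-name check is what separates the string form from the field-list form.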
@@ -220,6 +227,8 @@ def update_attributes(self, attributes: dict[str, JSON]) -> Self:

def parse_dtype(data: npt.DTypeLike) -> np.dtype[Any]:
    if isinstance(data, list):  # this is a valid _VoidDTypeLike check
Any iterable?
This is to handle the [(field_name, field_dtype, field_shape), ...] case on https://numpy.org/doc/2.1/reference/arrays.dtypes.html#specifying-and-constructing-data-types but at the same time to obey
This might require more stringent checking or tests... Not sure. The reason this tuple conversion happens is that lists (as data types) incoming from on-disk reads contain lists, not tuples. So maybe we should check list and data[0] is also list? And throw an error if it isn't? I'm not sure what else could be in the lists though.
I guess the dtype constructor would make an exception (or our own comprehension fails) in the case the JSON on disk was edited - so I'm not too worried.
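To illustrate the list-versus-tuple point in this thread, here is a rough sketch of the round trip through JSON. The helper `fields_from_json` is hypothetical and not taken from the PR; it assumes the on-disk value is a list of `[name, dtype]` or `[name, dtype, shape]` entries.

```python
import json

import numpy as np


def fields_from_json(fields):
    """Hypothetical helper: rebuild field tuples from JSON-decoded lists."""
    out = []
    for field in fields:
        name, dtype, *rest = field
        if isinstance(dtype, list):
            # A nested structured dtype is itself a list of fields.
            dtype = fields_from_json(dtype)
        # The optional shape entry also comes back from JSON as a list.
        shape = tuple(tuple(s) if isinstance(s, list) else s for s in rest)
        out.append((name, dtype, *shape))
    return out


original = np.dtype([("a", "<i4"), ("b", "<f8", (2, 3))])

# JSON has no tuple type, so the field tuples come back as lists.
on_disk = json.loads(json.dumps(original.descr))

restored = np.dtype(fields_from_json(on_disk))
print(on_disk)
print(restored == original)
```

If the JSON was hand-edited into something that is not a valid field list, either the unpacking in the helper or the `np.dtype` constructor would be expected to raise, which matches the reasoning in the comment above.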
…ython into ig/structured_arrays_v2
ping @d-v-b - I'd be happy to merge if you have no objections.
I think this looks good, but I'm admittedly not a structured dtype user, so I can't give it a very close examination. I think the only thing remaining is to use the new changelog system added by #2736
I moved the changelog line to where it should be and will merge this when green, and leave a note on the towncrier PR thread saying that this will need dealing with (that PR is not yet merged).
great, thanks for working on this @ilan-gold and @martindurant |
This is a best guess based on https://numpy.org/doc/2.1/reference/generated/numpy.dtype.kind.html and the fact that VLenBytes appears to be explicitly for strings. This addresses the v2 case of #2134
TODO: