Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Funlib persistence update #322

Merged
merged 21 commits into from
Nov 6, 2024
Merged

Conversation

pattonw
Copy link
Contributor

@pattonw pattonw commented Nov 5, 2024

Upgrade to funlib.persistence 0.5.

This update makes a one big improvement:
Custom Array class no longer needed. We used this mostly just to apply preprocessing lazily to large arrays. New funlib Array class uses dask internally which comes with much better support for lazy array operations than we built for ourselves. The ZarrArray and NumpyArray class which were used extensively throughout DaCapo have now been replaced with simple funlib.persistence.Arrays.

A minor incompatibility:
funlib.persistence.Array has a convention (for now) that all axes have names, but non-spatial axes have a "^" in their name. This will be fixed in the near future. For now, DaCapo convention needed to change a little bit to adapt to this. We now have to use "c^" and "b^" for channel and batch dimensions instead of just "c" and "b".

TODOs:
This pull request is not quire ready to merge. I pass the tests run with pytest, and the minimal_tutorial notebook executes. But there is a lot of code that is not tested. Specifically many of the ArrayConfig subclasses are not yet tested and some are missing implementations.

Here are the Preprocessing array configs, whether or not their implementation is complete, and their code coverage:

  • BinarizeArrayConfig 96%
  • ConcatArrayConfig 60%
  • ConstantArrayConfig 57%
  • CropArrayConfig 69%
  • DummyArrayConfig 91%
  • DVIDArrayConfig 90% (misleading, only skeleton implementation so not much to test)
  • IntensitiesArrayConfig 75%
  • LogicalOrArrayConfig 60%
  • MergeInstancesArrayConfig 100% (misleading, no implementation so nothing to test)
  • MissingAnnotationsMaskConfig 100% (misleading)
  • OnesArrayConfig 100% (misleading)
  • ResampledArrayConfig 100% (misleading)
  • SumArrayConfig 100% (misleading)
  • TiffArrayConfig 0%
  • ZarrArrayConfig 70%

Best practice would be to add tests before merging, but I want to put this here so others can test it

@pattonw
Copy link
Contributor Author

pattonw commented Nov 5, 2024

Added support for most of the remaining preprocessing operations.
Exceptions are:

  1. DVID arrays, there is no numpy like interface that I know of for DVID arrays. You have to access data via special getter methods. This means we probably need a custom handler to turn it into something numpy array like.
  2. Resampled arrays. I'm pretty confident this can be done purely with dask, I just don't know how yet.

@mzouink
Copy link
Member

mzouink commented Nov 6, 2024

i tried to run minimal example:
got error on :

cell_array = prepare_ds(
"cells3d.zarr",
"raw",
Roi((0, 0, 0), cell_data.shape[1:]) * voxel_size,
voxel_size=voxel_size,
dtype=np.uint8,
num_channels=None,
)

because it doesn't fit to the new prepare_ds function
https://github.com/funkelab/funlib.persistence/blob/3c0760e48edf1b287c4f75d7d11dc6b775332b2b/funlib/persistence/arrays/datasets.py#L121-L132
is there anyway to have a wrapper function (can have deprecated flag) that can support the old structure of the function. will be easier than looking for all the use of of prepare_ds and change them

@pattonw
Copy link
Contributor Author

pattonw commented Nov 6, 2024

Oh sorry, I only fixed the version in docs/source/notebooks/...
I can fix the version in examples as well

I would recommend against a wrapper. I think it would be confusing to have two different versions of the same function that behave differently

@mzouink mzouink merged commit aeb77a6 into janelia-cellmap:main Nov 6, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants