Update dev/main #276

Open
wants to merge 236 commits into base: dev/main
Conversation

rhoadesScholar (Member)

No description provided.

rhoadesScholar and others added 30 commits February 11, 2024 12:53
Let's allow specifying the resolution directly as a plain tuple like `(8, 8, 8)`, in addition to `Coordinate(8, 8, 8)`:
```python
datasplit_config = DataSplitGenerator.generate_from_csv(
    'test.csv',
    input_resolution=(8, 8, 8),  # This works.
    output_resolution=Coordinate(4, 4, 4),  # And this works.
)
```
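A minimal sketch of how such a config could accept both forms. The `Coordinate` stand-in and the `as_coordinate` helper are illustrative assumptions, not the actual `DataSplitGenerator` internals:

```python
from collections.abc import Sequence


class Coordinate(tuple):
    """Minimal stand-in for funlib.geometry.Coordinate (assumption)."""

    def __new__(cls, *args):
        # Accept either Coordinate(8, 8, 8) or Coordinate((8, 8, 8)).
        if len(args) == 1 and isinstance(args[0], Sequence):
            args = tuple(args[0])
        return super().__new__(cls, args)


def as_coordinate(value) -> Coordinate:
    """Hypothetical coercion helper: pass Coordinates through, wrap tuples/lists."""
    if isinstance(value, Coordinate):
        return value
    return Coordinate(value)
```

With a helper like this at the config boundary, `input_resolution=(8, 8, 8)` and `input_resolution=Coordinate(8, 8, 8)` become interchangeable.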
There appear to be some Python formatting errors in
330365b. This pull request
uses the [psf/black](https://github.com/psf/black) formatter to fix
these issues.
Previously, file stat writing would overwrite existing stats; this fix prevents that by appending the new stats instead.
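A minimal sketch of the overwrite-vs-append fix described above; the file layout (one JSON record per line) and the function names are assumptions, not the actual DaCapo code:

```python
import json
import os
import tempfile


def append_stats(path: str, stats: dict) -> None:
    # Mode "a" appends and never truncates; the previous bug pattern
    # would have been mode "w", which discards existing records.
    with open(path, "a") as f:
        f.write(json.dumps(stats) + "\n")


def read_stats(path: str) -> list:
    # Read back one JSON record per line.
    with open(path) as f:
        return [json.loads(line) for line in f]
```

Writing stats twice now yields two records rather than one overwritten file.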
pattonw and others added 30 commits October 28, 2024 16:00
This PR adds an optional `augmentation_probability: float = 1.` argument
to `ElasticAugment`, `IntensityAugment`, `SimpleAugment`.
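A hedged sketch of how such a probability argument can gate an augmentation node. The class and names here are illustrative, not the actual gunpowder/DaCapo implementation:

```python
import random


class ProbabilisticAugment:
    """Illustrative wrapper: apply an augmentation only with probability p."""

    def __init__(self, augment_fn, augmentation_probability: float = 1.0):
        self.augment_fn = augment_fn
        self.p = augmentation_probability

    def process(self, batch):
        # With the default p = 1.0, random.random() (in [0, 1)) is always
        # below p, so behaviour is unchanged from before this PR.
        if random.random() < self.p:
            return self.augment_fn(batch)
        return batch
```

Setting `augmentation_probability=0.0` disables the augmentation entirely, while intermediate values apply it stochastically per batch.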
Fix batch-dimension bugs (batch norm requires a batch dimension even in predict mode).
This also seems to fix the strange loss spike, which was likely caused by setting the model to eval mode and then not resetting it to training mode at the end.
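A sketch of both fixes in one prediction helper; this is an assumed pattern, not the actual DaCapo predict code:

```python
import torch


def predict(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    was_training = model.training
    model.eval()  # use running batch-norm stats, disable dropout
    try:
        with torch.no_grad():
            # BatchNorm layers require an explicit batch dimension,
            # so add one even for a single sample, then strip it again.
            out = model(x.unsqueeze(0)).squeeze(0)
    finally:
        if was_training:
            # Restore training mode so a subsequent training step is not
            # silently run in eval mode (the suspected loss-spike cause).
            model.train()
    return out
```

The `finally` block guarantees the mode is restored even if the forward pass raises.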
Exceptions DVID and Resampled arrays
Upgrade to funlib.persistence `0.5`.

This update makes one big improvement:
the custom `Array` class is no longer needed. We used it mostly to apply
preprocessing lazily to large arrays. The new `funlib` `Array` class uses
`dask` internally, which comes with much better support for lazy array
operations than what we built ourselves. The `ZarrArray` and `NumpyArray`
classes, which were used extensively throughout `DaCapo`, have now been
replaced with plain `funlib.persistence.Array`s.
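An illustration of the lazy evaluation that `dask` provides (this is plain `dask.array` usage, not the `funlib.persistence.Array` API itself):

```python
import dask.array as da
import numpy as np

# dask records operations lazily: preprocessing a large array builds a
# task graph but does no work until .compute() is called.
raw = da.from_array(np.arange(12).reshape(3, 4), chunks=(1, 2))
binarized = raw > 5            # lazy: no computation happens here
result = binarized.compute()   # evaluated here, chunk by chunk
```

For large on-disk volumes, only the chunks actually requested ever get preprocessed, which is exactly the behavior the old custom `Array` class implemented by hand.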

A minor incompatibility:
`funlib.persistence.Array` has a convention (for now) that all axes have
names, but non-spatial axes have a "^" in their name. This will be fixed
in the near future. For now, DaCapo convention needed to change a little
bit to adapt to this. We now have to use "c^" and "b^" for channel and
batch dimensions instead of just "c" and "b".
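The naming convention can be summarized in a few lines; `spatial_axes` is a hypothetical helper for illustration, not a funlib function:

```python
# Non-spatial axes carry a "^" suffix, so "b^" and "c^" mark the batch
# and channel dimensions while "z", "y", "x" remain spatial.
axis_names = ["b^", "c^", "z", "y", "x"]


def spatial_axes(names):
    """Illustrative helper: spatial axes are exactly those without '^'."""
    return [n for n in names if not n.endswith("^")]
```

Under this convention a 5D training batch would be named `["b^", "c^", "z", "y", "x"]` instead of the old `["b", "c", "z", "y", "x"]`.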

TODOs:
This pull request is not quite ready to merge. The tests run with
`pytest` pass, and the `minimal_tutorial` notebook executes. But there is a
lot of code that is not tested. Specifically, many of the `ArrayConfig`
subclasses are not yet tested, and some are missing implementations.

Here are the Preprocessing array configs, whether or not their
implementation is complete, and their code coverage:
- [x] BinarizeArrayConfig 96%
- [x] ConcatArrayConfig 60%
- [x] ConstantArrayConfig 57%
- [x] CropArrayConfig 69%
- [x] DummyArrayConfig 91%
- [ ] DVIDArrayConfig 90% (misleading: only a skeleton implementation, so not much to test)
- [x] IntensitiesArrayConfig 75%
- [x] LogicalOrArrayConfig 60%
- [x] MergeInstancesArrayConfig 100% (misleading: no implementation, so nothing to test)
- [x] MissingAnnotationsMaskConfig 100% (misleading)
- [x] OnesArrayConfig 100% (misleading)
- [ ] ResampledArrayConfig 100% (misleading)
- [x] SumArrayConfig 0%
- [x] ZarrArrayConfig 70%

Best practice would be to add tests before merging, but I want to put
this here so others can test it.
There appear to be some Python formatting errors in
aeb77a6. This pull request
uses the [psf/black](https://github.com/psf/black) formatter to fix
these issues.