This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Load Numpy arrays of incompatible dtypes #1507

Draft
wants to merge 17 commits into
base: master
from

Conversation

souravraha

@souravraha souravraha commented Jan 6, 2023

What does this PR do?

Improves the loading of .npy images with integer dtypes. Previously, the ndarrays were unsafely cast to 'uint8'. If the original dtype was, say, 'int64', values outside the 0 to 255 range would wrap around, so the loaded PIL Image would differ from what was intended.

Now it should emit a warning whenever an unsafe cast occurs. In addition, it attempts to load the data as a float ndarray rather than an integer one. The latter is needed because PIL can readily import float64 ndarrays but not, say, int64 ndarrays.
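For illustration, the behaviour described above could be sketched roughly as follows. This is not the code from this PR: `to_pil_compatible` is a hypothetical helper name, and the choice of `np.can_cast` with `casting="safe"` as the check is an assumption about one reasonable way to detect an unsafe cast.

```python
import warnings

import numpy as np


def to_pil_compatible(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper (not the PR's actual code).

    Returns a uint8 array when the cast is lossless by dtype, otherwise
    warns and falls back to float64 so that no values wrap around.
    """
    if arr.dtype == np.uint8:
        return arr
    # "safe" casting only allows casts that can never change a value,
    # e.g. bool -> uint8, but not int64 -> uint8.
    if np.can_cast(arr.dtype, np.uint8, casting="safe"):
        return arr.astype(np.uint8)
    warnings.warn(
        f"Casting {arr.dtype} to uint8 is unsafe; loading as float64 instead.",
        UserWarning,
    )
    return arr.astype(np.float64)


# The old behaviour: an unsafe cast silently wraps values modulo 256.
wrapped = np.array([256, 257], dtype=np.int64).astype(np.uint8)

# The sketched behaviour: values are preserved as floats (a warning is emitted).
preserved = to_pil_compatible(np.array([256, 257], dtype=np.int64))
```

The resulting array would then be handed to PIL; per the description above, a float64 array is accepted where an int64 array is not.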

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests? [not needed for typos/docs]
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@souravraha souravraha changed the title Numpy Load Numpy arrays of incompatible dtypes. Jan 6, 2023
@Borda Borda changed the title Load Numpy arrays of incompatible dtypes. Load Numpy arrays of incompatible dtypes Jan 6, 2023
@Borda Borda added the bug / fix Something isn't working label Jan 6, 2023
train_annotations.json (outdated, resolved review thread)
@souravraha
Author

The old contribution is discarded, as it was failing a required test.

Now, the pixel values are scaled linearly to lie between 0 and 255. This minimizes the loss when loading an ndarray of dtype int64 into a PIL image. The downside is that every ndarray is scaled differently, i.e. the scaling is sample-specific (rather than, say, a global scaling shared by all samples in the dataset).
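A minimal sketch of such a per-sample linear scaling is given below, to make the trade-off concrete. It is not the code in `loading.py`: the function name `rescale_to_uint8` is hypothetical, and the handling of constant arrays (mapped to all zeros) is an assumption made only so the example is well defined.

```python
import numpy as np


def rescale_to_uint8(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper: map an array's own [min, max] range onto [0, 255]."""
    data = arr.astype(np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Assumption: a constant image carries no contrast, so return zeros
        # instead of dividing by zero.
        return np.zeros(data.shape, dtype=np.uint8)
    scaled = (data - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Two samples with very different value ranges...
sample_a = np.array([0, 10], dtype=np.int64)
sample_b = np.array([0, 1000], dtype=np.int64)

# ...both end up spanning the full 0..255 range, which is the
# sample-specific behaviour mentioned as a downside above.
scaled_a = rescale_to_uint8(sample_a)
scaled_b = rescale_to_uint8(sample_b)
```

A dataset-wide scaling would instead need the global minimum and maximum to be computed once and passed in, so that the same raw value always maps to the same pixel intensity.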

flash/core/data/utilities/loading.py (outdated, resolved review thread)
@codecov

codecov bot commented May 12, 2023

Codecov Report

Merging #1507 (a093de1) into master (8ad3a96) will decrease coverage by 3%.
The diff coverage is 60%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #1507    +/-   ##
=======================================
- Coverage      85%     83%    -3%     
=======================================
  Files         291     291            
  Lines       12852   12856     +4     
=======================================
- Hits        10982   10609   -373     
- Misses       1870    2247   +377     

@mergify mergify bot removed the has conflicts label May 29, 2023
@Borda Borda marked this pull request as draft June 20, 2023 10:07
4 participants