
Add experimental CV-CUDA resize #5637

Merged: @banasraf merged 2 commits from the add-experimental-cvcuda-resize branch into NVIDIA:main on Sep 30, 2024

Conversation

@banasraf (Collaborator) commented on Sep 18, 2024:

Category:

New feature

Description:

This PR adds a new resize operator that uses CV-CUDA HQResize as its implementation. It reuses the existing infrastructure of the resize operators to handle arguments; the CV-CUDA op more or less replaces the DALI kernel.

The testing involves extending the existing tests so that the new operator runs in the same way the CPU and GPU versions do.
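
As a rough illustration of the intended usage (a minimal sketch, not code from this PR: the dataset path and pipeline parameters are placeholders; the operator name fn.experimental.resize is taken from the review discussion below, and the argument handling is assumed to match fn.resize since the operators share the resize infrastructure):

    from nvidia.dali import fn, pipeline_def

    # Hypothetical pipeline: decode JPEGs on the GPU and resize them with
    # the CV-CUDA-backed operator. "images/" is a placeholder dataset path.
    @pipeline_def(batch_size=8, num_threads=4, device_id=0)
    def resize_pipe():
        jpegs, _ = fn.readers.file(file_root="images/")
        images = fn.decoders.image(jpegs, device="mixed")
        # Same resize arguments as fn.resize (shared argument handling).
        return fn.experimental.resize(images, resize_x=224, resize_y=224)

    pipe = resize_pipe()
    pipe.build()
    (out,) = pipe.run()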

Additional information:

Affected modules and functionalities:

New operator, small changes in ResizeBase

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@banasraf (Collaborator, Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [18508010]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [18508010]: BUILD FAILED

@banasraf force-pushed the add-experimental-cvcuda-resize branch from 20da032 to be71021 on September 18, 2024, 15:24
@banasraf (Collaborator, Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [18511395]: BUILD STARTED

@banasraf force-pushed the add-experimental-cvcuda-resize branch from be71021 to 571b143 on September 18, 2024, 15:32
@banasraf (Collaborator, Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [18511556]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [18511556]: BUILD FAILED

@banasraf force-pushed the add-experimental-cvcuda-resize branch from 571b143 to 1a6852b on September 18, 2024, 16:05
@banasraf (Collaborator, Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [18512606]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [18512606]: BUILD FAILED

@banasraf force-pushed the add-experimental-cvcuda-resize branch from 1a6852b to 826106d on September 18, 2024, 17:35
@banasraf (Collaborator, Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [18515034]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [18515034]: BUILD FAILED

@banasraf force-pushed the add-experimental-cvcuda-resize branch from 826106d to 889c104 on September 19, 2024, 07:27
@banasraf (Collaborator, Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [18537144]: BUILD STARTED

Signed-off-by: Rafal Banas <rbanas@nvidia.com>
@banasraf force-pushed the add-experimental-cvcuda-resize branch from 889c104 to c40fd3b on September 19, 2024, 09:17
@banasraf (Collaborator, Author):

!build

@banasraf marked this pull request as ready for review on September 19, 2024, 09:17
@dali-automaton (Collaborator):

CI MESSAGE: [18539715]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [18537144]: BUILD FAILED

@dali-automaton (Collaborator):

CI MESSAGE: [18539715]: BUILD PASSED

Comment on lines 54 to 56
if (HasEmptySamples(in_shape_)) {
  curr_minibatch_size_ = 1;  // fall back to processing one sample at a time
}

@mzient (Contributor) commented on Sep 20, 2024:

Why? Can't we just skip them when assembling the input/output TensorBatch?
We could do any of the following, and perhaps more:

  1. store the indices for each minibatch explicitly instead of just storing the range - probably the easiest to implement;
  2. keep a global index array with only the non-empty samples, each entry pointing to the original sample/index - likely the most efficient option, and also not hard to implement (see the sketch after this list);
  3. assemble an effective input/output TensorList and work with that (the most expensive option).

All of these options are better than giving up batched processing.
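
A minimal sketch of option 2, in Python for illustration only (the operator code itself is C++, and all names here are hypothetical, not taken from the PR):

    def build_non_empty_index(sample_volumes):
        # i-th effective (non-empty) sample -> original sample index
        return [i for i, vol in enumerate(sample_volumes) if vol > 0]

    # Example: volumes [600, 0, 300, 0, 900] give the index [0, 2, 4].
    # Minibatch k then covers index[k * mb_size : (k + 1) * mb_size], so
    # empty samples never reach the CV-CUDA call and batching is preserved.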

@banasraf (Collaborator, Author) replied:

Empty samples in the input are very much an edge case, I assume, so handling them should not affect regular execution. Nevertheless, I implemented the solution with the global index array.

Signed-off-by: Rafal Banas <rbanas@nvidia.com>
@banasraf
Copy link
Collaborator Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [18651120]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [18651120]: BUILD PASSED

@awolant (Contributor) left a review:

Looks like it works and is tested.

@szalpal (Member) left a review:

I believe it would be good to add explicit (and extensive) information on how fn.experimental.resize differs from fn.resize. I know the user can read the CV-CUDA docs and then the fn.resize docs, but putting the difference directly in the fn.experimental.resize documentation would, I believe, be very valuable. Plus, I don't think there's a good explanation anywhere of how DALI's resize differs from HQResize.

@szalpal self-assigned this on Sep 29, 2024
@banasraf merged commit ebb0974 into NVIDIA:main on Sep 30, 2024
7 checks passed