Fix inverted mmap inside webdataset reader #5683
- Fixes the wrong usage of mmap inside webdataset when the user is not asking for it
- Cleans up usage of copy_read_data_
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
!build
CI MESSAGE: [19486724]: BUILD STARTED
CI MESSAGE: [19486724]: BUILD PASSED
@@ -69,7 +69,7 @@ void FileLabelLoaderBase<checkpointing_supported>::ReadSample(ImageLabelWrapper
});
Index file_size = current_file->Size();

if (copy_read_data_ || !current_file->CanMemoryMap()) {
Can't CanMemoryMap return false even if copy_read_data_ is true? What about s3:// storage? A quick scan of the code didn't reveal any tests for it, so we might silently break it, can't we?
copy_read_data_ = dont_use_mmap_ || !mmap_reserver_.CanShareMappedData() - https://github.com/NVIDIA/DALI/blob/main/dali/operators/reader/loader/file_label_loader.h#L137
The idea is that copy_read_data_ should reflect the CanShareMappedData value. I don't think we can break s3, as it doesn't use (and shouldn't use) mmap. @jantonguirao?
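To make the quoted expression easier to reason about, here is a minimal sketch of it in isolation. It is not DALI code: LoaderFlags and ShouldCopyReadData are hypothetical names introduced for illustration, and can_share_mapped_data only stands in for whatever mmap_reserver_.CanShareMappedData() reports. Only the boolean expression itself is taken from the comment above.

```cpp
#include <cassert>

// Hypothetical stand-in for the loader state; not a DALI type.
struct LoaderFlags {
  bool dont_use_mmap;          // the user asked the reader not to use mmap
  bool can_share_mapped_data;  // assumed result of mmap_reserver_.CanShareMappedData()
};

// Mirrors the expression quoted in the comment:
//   copy_read_data_ = dont_use_mmap_ || !mmap_reserver_.CanShareMappedData()
constexpr bool ShouldCopyReadData(const LoaderFlags &flags) {
  return flags.dont_use_mmap || !flags.can_share_mapped_data;
}

// Data is shared (not copied) only when mmap is allowed AND sharing is possible.
static_assert(!ShouldCopyReadData(LoaderFlags{false, true}),
              "mmap allowed and shareable: no copy");
static_assert(ShouldCopyReadData(LoaderFlags{true, true}),
              "user opted out of mmap: copy");
static_assert(ShouldCopyReadData(LoaderFlags{false, false}),
              "mapped data cannot be shared: copy");
```

Note that nothing in this flag depends on which file is currently open, which is the gap the following comments point at.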
I think that obtaining CanMemoryMap on a per-file basis is safer than inferring it somehow from CanShareMappedData.
I think CanShareMappedData is a stronger check, as it verifies whether we can actually mmap all the files we want at the same time. This is important for single files, as we can have many of them open at a time. CanMemoryMap tells whether a given object class can do it, but not necessarily whether the OS has enough free FDs.
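The distinction drawn here, a reader-wide resource check versus a per-class capability, can be sketched as two independent conditions. Everything below is a hypothetical model: MappingBudget, LocalFile, RemoteObject and MayMap are illustrative names, and the counting scheme is an assumption, not how DALI's mmap_reserver_ is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reader-wide reservation: how many files may be mapped at once.
// This only models the idea of a shared limit (for example free FDs).
class MappingBudget {
 public:
  explicit MappingBudget(std::size_t max_mappings) : remaining_(max_mappings) {}

  // Reserves `count` mappings if the budget allows it; otherwise changes nothing.
  bool TryReserve(std::size_t count) {
    if (count > remaining_) return false;
    remaining_ -= count;
    return true;
  }

  std::size_t remaining() const { return remaining_; }

 private:
  std::size_t remaining_;
};

// Hypothetical per-class capability, analogous in spirit to CanMemoryMap().
struct LocalFile {
  static constexpr bool CanMemoryMap() { return true; }
};
struct RemoteObject {
  static constexpr bool CanMemoryMap() { return false; }
};

// Mapping is allowed only when the class supports it AND the budget has room.
// The capability is tested first, so an unsupported class never consumes budget.
template <typename File>
bool MayMap(MappingBudget &budget, std::size_t count) {
  return File::CanMemoryMap() && budget.TryReserve(count);
}
```

In this model neither check implies the other: a local file can be refused because the budget is exhausted, and a remote object is refused no matter how much budget is left.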
I agree with Michal: the check is required on a per-file basis, because even if your system allows mmapping, if you get an S3 location you can't call Get on it (it throws).
Ok. If we can mix plain files and S3 objects inside one reader this makes a lot of sense. Removed.
@@ -112,7 +112,7 @@ class IndexedFileLoader : public Loader<CPUBackend, IndexedFileLoaderSample, tru
}
next_seek_pos_ = seek_pos + size;

if (opts.use_mmap && current_file_->CanMemoryMap()) {
if (!copy_read_data_) {
auto p = current_file_->Get(size);
If the current_file is an S3 location, then you will end up trying to call Get, which will throw (it can't be done for S3). I believe you should keep the current_file_->CanMemoryMap() check.
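The failure mode described here can be reproduced with a small self-contained model of a reader that mixes local and remote streams. This is a sketch under stated assumptions: Stream, LocalStream, RemoteStream, ReadPath and ReadSample are hypothetical and much simpler than DALI's classes (no seeking, data held in a string). Only the guard condition follows the one shown in the first hunk of this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical minimal stream interface; not DALI's file stream API.
class Stream {
 public:
  virtual ~Stream() = default;
  virtual bool CanMemoryMap() const = 0;
  // Zero-copy view of the first `size` bytes; throws when unsupported.
  virtual const char *Get(std::size_t size) = 0;
  // Copies up to `size` bytes into `dst`; returns the number of bytes copied.
  virtual std::size_t Read(char *dst, std::size_t size) = 0;
};

class LocalStream : public Stream {
 public:
  explicit LocalStream(std::string data) : data_(std::move(data)) {}
  bool CanMemoryMap() const override { return true; }
  const char *Get(std::size_t size) override {
    if (size > data_.size()) throw std::out_of_range("Get past end of data");
    return data_.data();
  }
  std::size_t Read(char *dst, std::size_t size) override {
    const std::size_t n = std::min(size, data_.size());
    std::copy_n(data_.data(), n, dst);
    return n;
  }

 private:
  std::string data_;
};

// Models an S3-like object: readable, but Get always throws.
class RemoteStream : public Stream {
 public:
  explicit RemoteStream(std::string data) : data_(std::move(data)) {}
  bool CanMemoryMap() const override { return false; }
  const char *Get(std::size_t) override {
    throw std::logic_error("Get is not supported for remote objects");
  }
  std::size_t Read(char *dst, std::size_t size) override {
    const std::size_t n = std::min(size, data_.size());
    std::copy_n(data_.data(), n, dst);
    return n;
  }

 private:
  std::string data_;
};

enum class ReadPath { kMapped, kCopied };

// Same guard as in the first hunk: copy when the reader-wide flag says so
// OR when this particular file cannot be memory mapped.
ReadPath ReadSample(Stream &file, bool copy_read_data, std::size_t size,
                    std::string &out) {
  if (copy_read_data || !file.CanMemoryMap()) {
    out.resize(size);
    out.resize(file.Read(&out[0], size));
    return ReadPath::kCopied;
  }
  // The sketch copies out of the mapped view only to make the result
  // observable; a real zero-copy path would share the pointer instead.
  out.assign(file.Get(size), size);
  return ReadPath::kMapped;
}
```

With copy_read_data set to false, a LocalStream takes the mapped path and a RemoteStream falls back to copying. If the guard tested only the reader-wide flag, the RemoteStream case would reach Get and throw, which is the situation the comment warns about.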
Ok. If we can mix plain files and S3 objects inside one reader this makes a lot of sense. Removed.
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
!build
CI MESSAGE: [19618233]: BUILD STARTED
CI MESSAGE: [19618233]: BUILD PASSED
Category:
Bug fix (non-breaking change which fixes an issue)
Description:
webdataset
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A