-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-13216 dfuse: Add a pre-read feature for non-cached files. #12015
Conversation
Bug-tracker data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style warning(s) for job https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-12015/1/
Please review https://wiki.hpdd.intel.com/display/DC/Coding+Rules
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/1/execution/node/145/log |
Test stage Build RPM on Leap 15.4 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/1/execution/node/307/log |
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/1/execution/node/310/log |
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/1/execution/node/369/log |
eadb559
to
9561b04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Build RPM on Leap 15.4 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/2/execution/node/333/log |
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/2/execution/node/330/log |
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/2/execution/node/280/log |
When the kernel cache is in use but a file is not cached then pre-read the file on open. Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com> # ------------------------ >8 ------------------------ Skip-func-hw-test: true Skip-func-test: true Quick-Functional: true Test-tag: dfuse
Test-tag: dfuse Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com>
9561b04
to
b3af6ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test-tag: dfuse Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test-tag: dfuse Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com>
Test stage Functional on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/4/execution/node/1015/log |
Test-tag: dfuse Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test-tag: dfuse Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Build on Leap 15.4 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/6/execution/node/345/log |
Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/6/execution/node/371/log |
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/6/execution/node/366/log |
Test stage Build RPM on Leap 15.4 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/6/execution/node/353/log |
Test-tag: dfuse Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com>
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/37/execution/node/1258/log |
Features: dfuse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
have not dug into the code yet in this PR and have just some high level questions based on the comment of pre-read you added:
|
ls -R will not call open so no, there would be no difference to that code path.
The file size is known during open and returned to the kernel. This is then used for the read size. The open performance should be unaffected (and there's no I/O changes during open).
This was tested with the Resnet training stage and helped, it should be useful for any dataset ingest which is dominated by read of smaller files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. im not clear though what happens in the case if the file was written to after the pre-read happens. Is the pre-read buffer dropped?
Features: dfuse Required-githooks: true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/39/execution/node/1292/log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
If the pre-read buffer is used then once it's exhausted (which might take several reads depending on application I/O sizes) then it will be released. If there is other I/O, either writes or reads from elsewhere in the file then the buffer will not be used, but it won't be released until close. |
if (oh->doh_linear_read_pos != position) { | ||
DFUSE_TRA_DEBUG(oh, "disabling readahead"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So "linear read" is synonymous with "readahead"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently yes. This PR started are read-ahead after the first read before I realised that the speculative read could happen on open. This is debugging so I think it's Ok though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking from the perspective of the variable names, etc., but not asking you to change this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/daos-stack/daos/blob/master/src/client/dfuse/dfuse.h#L117-L126
linear read
means the file was opened, read from start to end without seeking and closed. We already track this to short-circuit the last zero-length read so we use that data to know if to use the pre-read buffer.
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/40/execution/node/1268/log |
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-12015/40/testReport/ |
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-12015/40/display/redirect |
Required-githooks: true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12015/41/execution/node/1267/log |
Required-githooks: true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
When the kernel cache is in use but a file is not cached then pre-read the file on open. This works for files up to the read buffer size (1Mb) and is enabled based on the I/O pattern of the last file closed in the same directory. Required-githooks: true Signed-off-by: Ashley Pittman <ashley.m.pittman@intel.com> Signed-off-by: Jeff Olivier <jeffolivier@google.com>
When the kernel cache is in use but a file is not cached
then pre-read the file on open.
Required-githooks: true
Signed-off-by: Ashley Pittman ashley.m.pittman@intel.com