DAOS-14149 client: add compatible mode for libpil4dfs #13294
Test-tag: pil4dfs avoid using fake fd to get better compatibility with degraded performance in open and opendir etc. Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
LGTM. No errors found by checkpatch.
Test-tag: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Test-tag: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Features: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
LGTM. No errors found by checkpatch.
LGTM. No errors found by checkpatch.
Features: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
LGTM. No errors found by checkpatch.
Features: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Test stage Unit Test bdev on EL 8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-13294/5/display/redirect
Test stage NLT on EL 8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-13294/5/display/redirect
LGTM. No errors found by checkpatch.
Features: pil4dfs Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
LGTM. No errors found by checkpatch.
We may need a separate ticket and PR for more tests.
    if (rc != 0) {
        rc = errno;
        if (rc != ENOTTY)
            DS_WARN(rc, "ioctl call on %d failed", fd);
Most of those sound like legitimate errors, not warnings?
Even if fetch_dfs_cont_file_obj_with_fd() fails, open() can still return the fd allocated by the kernel. If you think an error is more appropriate here, I can change it.
This was more of a question to you.
Warning messages should be used when the user expects one thing but another path was taken, and the operation is still ongoing or succeeds.
Error messages are for when we get legitimate errors from DAOS that something went wrong, like EIO, etc.
Otherwise, e.g. if the user is opening a non-existent file, this is more of a debug message that can be logged at the DAOS log level.
ok. I will change this and several other warnings in this function to debug messages. Thank you!
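The log-level guidance in this thread can be sketched as a small decision function. This is only an illustration in Python, not code from libpil4dfs: the helper name `pick_log_level`, its `operation_continues` flag, and the particular set of errno values treated as caller mistakes are all assumptions made for the example.

```python
import errno
import logging

# Assumption for illustration: errno values that usually mean the caller asked
# for something that is simply not there, rather than DAOS misbehaving.
CALLER_ERRNOS = {errno.ENOENT, errno.ENOTDIR}


def pick_log_level(err, operation_continues):
    """Return a logging level for a failure with errno `err`.

    `operation_continues` is True when a fallback path was taken and the
    intercepted call still succeeds from the application's point of view.
    """
    if err in CALLER_ERRNOS:
        # e.g. opening a non-existent file: debug message only
        return logging.DEBUG
    if operation_continues:
        # something unexpected happened, but the call still succeeds
        return logging.WARNING
    # genuine failure such as EIO
    return logging.ERROR
```

Under these assumptions, a failed handle lookup inside open() that still returns the dfuse fd would be logged as a warning, while the same errno on a path that fails the call would be an error.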
should probably print the errno (or string version of it here), no? (applies to subsequent messages).
DS_WARN does print errno and string version message.
fetch_dfs_cont_file_obj_with_fd() is called inside open. When it is called, we have already got the fd from dfuse without issue. Even if fetch_dfs_cont_file_obj_with_fd() fails, we still return the fd allocated by dfuse. I think a warning is appropriate here.
src/client/dfuse/pil4dfs/int_dfs.c
Outdated
@@ -850,6 +979,13 @@ retrieve_handles_from_fuse(int idx)
                errno_saved, strerror(errno_saved));
            goto err;
        }
        rc = dc_cont_hdl2uuid(dfs_list[idx].coh, NULL, &dfs_list[idx].uuid_cont);
ideally we don't want to use internal functions like these here. is this for the UNS support?
ok. I can remove this for now. It is not for UNS support. I just used the container uuid to double check the container uuid returned by ioctl(fd, DFUSE_IOCTL_IL, &il_reply).
src/tests/ftest/dfuse/bash.py
Outdated
@@ -212,3 +212,18 @@ def test_bashcmd_pil4dfs(self):
        :avocado: tags=Cmd,test_bashcmd_pil4dfs
        """
        self.run_bashcmd(il_lib="libpil4dfs.so")

    def test_bashcmd_pil4dfs_compatible(self):
We need to add more testing for the most common use cases that we know are not supported currently but will be supported with compatibility mode.
src/client/dfuse/pil4dfs/int_dfs.c
Outdated
    rc = dfs_get_mode(dfs_obj_local, &mode_query);
    if (rc)
        goto out_compatible;
do we need to close dfs_obj_local?
error ignored?
You are right. "dfs_obj_local" should be closed. I will add a debug message when dfs_get_mode() fails. Thank you!
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13294/7/execution/node/1511/log
I was chatting with Johann last week and he had an interesting idea.
If we have a special file, let's say .fd_alloc, where dfuse does nothing but just allow the kernel to create a new file descriptor, then we can use ioctl to map that file descriptor to our real dfs file. Then we have a real file descriptor to return to the application. If we ever encounter a situation that is difficult to implement at least in the near term (e.g. mmap), we can simply disable interception and defer back to dfuse which now knows the mapping between the fd and the dfs file.
so open handling would be something like:
    Open the dfs file
    If not successful
        return
    fd = real_open("/path/to/dfuse/.fd_alloc", ...);
    ioctl(fd, DFUSE_MAP_FILE, ....);
    save mapping from fd to dfs_obj
    return fd;
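The open flow proposed above can be sketched end to end in a few lines. This is a hedged Python illustration, not the interception library's code: the `.fd_alloc` file and the `DFUSE_MAP_FILE` ioctl are only ideas from this comment, so `map_fd_in_dfuse` is a do-nothing placeholder, `dfs_open` is a caller-supplied stand-in for opening the DFS object, and the placeholder file path is passed in explicitly.

```python
import os
import tempfile

# Mapping from kernel-allocated fd to the opened DFS object (hypothetical).
fd_to_obj = {}


def map_fd_in_dfuse(fd, dfs_obj):
    """Placeholder for ioctl(fd, DFUSE_MAP_FILE, ...); the ioctl is only proposed."""


def intercepted_open(dfs_open, path, fd_alloc_path):
    """Return a real kernel fd for `path`, or -1 if the DFS open fails."""
    dfs_obj = dfs_open(path)                  # open the dfs file
    if dfs_obj is None:                       # not successful: return
        return -1
    fd = os.open(fd_alloc_path, os.O_RDONLY)  # kernel allocates a real fd
    map_fd_in_dfuse(fd, dfs_obj)              # let dfuse learn fd -> file
    fd_to_obj[fd] = dfs_obj                   # save mapping from fd to dfs_obj
    return fd
```

Because the returned descriptor is a real kernel fd, calls that are not intercepted (the comment mentions mmap) would at least reach dfuse rather than fail on a fake fd. In a quick local experiment, a temporary file can stand in for the placeholder file.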
docs/user/filesystem.md
Outdated
@@ -974,6 +974,10 @@ libpil4dfs intercepting summary for ops on DFS:
[op_sum ] 5003
```

### Turn on compatible mode in libpil4dfs

"D_IL_COMPATIBLE=1" or "D_IL_COMPATIBLE=true" turns on compatible mode. Fake fd will not be returned to applications. This mode provides better compatibility with degraded performance in open, openat, and opendir, etc.
I would argue this should be the default setting and the performant version should be a use at your own risk option. Though I think we can probably make this better if we do something like opening the special fake file on the actual fuse mountpoint to avoid lookup calls and maybe acceptable enough that we don't even need this environment variable at all.
Anyway, my reasoning here is that people will not care about a small bump in performance if it means half of their applications don't run at all. I don't think open is as big of a bottleneck as many other metadata ops
I agree. I can set compatible mode as default in next commit. Thank you!
I'm not sure I agree, actually. I do not think the performance effect is small; it can be quite substantial in metadata-heavy cases.
Ideally these use cases should be found and eventually resolved. If we set this as the default, I'm afraid we are not going to get reports of new issues found and, more importantly, we will have performance issues.
I think if we move to fake file method, perf difference will be minimal, IMO. The issue with the current version, is it will cause a bunch of lookups for items in path whereas a fake file will minimize that churn.
The main issue I have with not being compatible by default, is that it may take significant cycles for someone to discover this and their first experience will be that DAOS doesn't work.
I mentioned this to Ashley and he had an even better idea, IMO. If you allocate a bunch of these at startup, you can just keep a cache of such kernel allocated file descriptors and assign them to real files as needed.
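The pre-allocation idea can be sketched as a small pool of kernel fds opened at startup. This is an illustrative Python sketch under stated assumptions: the class name `FdPool` and its methods are made up for the example, and `os.devnull` is used only as a stand-in for whatever placeholder file dfuse would expose.

```python
import os
from collections import deque


class FdPool:
    """Keeps kernel-allocated fds ready to be assigned to real files."""

    def __init__(self, placeholder_path, size):
        self._path = placeholder_path
        # Pay the open() cost once, up front.
        self._free = deque(
            os.open(placeholder_path, os.O_RDONLY) for _ in range(size)
        )

    def acquire(self):
        """Hand out a cached fd, or open a new one if the pool is empty."""
        if self._free:
            return self._free.popleft()
        return os.open(self._path, os.O_RDONLY)

    def release(self, fd):
        """Return an fd to the pool instead of closing it."""
        self._free.append(fd)

    def available(self):
        return len(self._free)

    def close(self):
        """Close every fd still held by the pool."""
        while self._free:
            os.close(self._free.popleft())
```

Whether a released descriptor could safely be remapped to a different file is an open design question that this sketch does not answer.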
Signed-off-by: Lei Huang <lei.huang@intel.com>
Test-provider-hw-medium: ofi+tcp;ofi_rxm Priority: 2 Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
ok. I will test it in bash then. I also need to expose an env variable to enable/disable the automatic disabling of dir caching in bash.
Test stage Functional Hardware Medium Verbs Provider completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13294/43/testReport/
All CI tests passed except one failure. It was reported in ticket, https://daosio.atlassian.net/issues/DAOS-15662.
2. add a test to create dir and file, then remove them and recreate them. Priority: 2 Test-tag: test_bashdcache_pil4dfs Skip-unit-tests: true Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
SConstruct
Outdated
@@ -363,7 +363,8 @@ MINIMAL_ENV = ('HOME', 'TERM', 'SSH_AUTH_SOCK', 'http_proxy', 'https_proxy', 'PK

 # Environment variables that are also kept when LD_PRELOAD is set.
 PRELOAD_ENV = ('LD_PRELOAD', 'D_LOG_FILE', 'DAOS_AGENT_DRPC_DIR', 'D_LOG_MASK', 'DD_MASK',
-               'DD_SUBSYS', 'D_IL_MAX_EQ', 'D_IL_ENFORCE_EXEC_ENV', 'D_IL_COMPATIBLE')
+               'DD_SUBSYS', 'D_IL_MAX_EQ', 'D_IL_ENFORCE_EXEC_ENV', 'D_IL_COMPATIBLE',
+               'D_IL_NO_DCACHE_BASH')
I'm going to object to adding more env variables like this. It's counterproductive for users and hard to keep track of. If we know this has issues, then why do we need a flag to enable it?
how about if compatible mode is on, dcache in bash is disabled.
otherwise it's enabled.
Another option would be to have a D_IL_NO_CACHE_EXE and have it be a comma-separated list of process names, with the default being sh,bash. That way the default would work and not need a hack in this file, and there would be the option to enable/disable it for other procs as needed.
This shouldn't hold up the PR from landing however, the previous code is good enough to pass the test and land the PR IMO and we should improve this in a follow up.
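The comma-separated list suggested above is straightforward to parse. The following Python sketch only illustrates the proposed semantics: `D_IL_NO_CACHE_EXE` is a proposal in this thread rather than an existing setting, the helper name `dcache_disabled_for` is hypothetical, and matching on the executable's base name is an assumption.

```python
import os

# Default suggested in the review comment above.
DEFAULT_NO_CACHE_EXE = "sh,bash"


def dcache_disabled_for(exe_path, environ=None):
    """Return True if dir caching should be off for the given executable."""
    environ = os.environ if environ is None else environ
    raw = environ.get("D_IL_NO_CACHE_EXE", DEFAULT_NO_CACHE_EXE)
    # Tolerate spaces and empty entries such as "make, gcc,".
    names = {name.strip() for name in raw.split(",") if name.strip()}
    return os.path.basename(exe_path) in names
```

With the variable unset, only `sh` and `bash` have caching disabled. Setting it replaces the default list in this sketch, so an empty value would enable caching everywhere.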
ok. I can add an env "D_IL_NO_CACHE_EXE" to support user supplied app list in a follow up PR.
I think we should disable dcache in bash in regular mode too. bash is not sensitive to performance. Disabling the cache can avoid possible consistency issues.
@ashleypittman Shall I remove "'git -C {} checkout lei/DAOS-14149'.format(build_dir)," in src/tests/ftest/dfuse/daos_build.py now?
2. fix tags in test_bashdcache_pil4dfs Priority: 2 Test-tag: test_bashdcache_pil4dfs Skip-unit-tests: true Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Features: pil4dfs Priority: 2 Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
The new ftest passed in the previous build.
src/tests/ftest/dfuse/bash_dcache.py
Outdated
class DFuseBashdcacheTest(DfuseTestBase):
    """Base Bashdcache test class.
You need to resolve or silence these warnings.
Forgot to fix this in last commit. I will add "Bashdcache" into utils/cq/words.dict after current build completes. Thank you!
Added "#pylint: disable-next=wrong-spelling-in-docstring" to suppress the warning. Thank you!
        result = run_remote(self.log, self.hostlist_clients, env_str + cmd)
        if not result.passed:
            self.fail(f'"{cmd}" failed on {result.failed_hosts}')
Can you check the output?
Updated as you suggested. Thank you!
This checks the output but both files have the same contents so it's not checking as much as it could.
ok. I can write different content for the second write. Will update it in another PR. Thank you!
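The stronger check being asked for (recreate the directory and file, then verify that the second read returns the second write) can be sketched locally as follows. This is a plain-Python illustration with made-up names (`write_then_read`, `recreate_check`, `dir1`, `file1`); the actual ftest drives bash through `run_remote` on a dfuse mount with the interception library loaded, which this sketch does not attempt to reproduce.

```python
import os
import tempfile


def write_then_read(path, content):
    """Write `content` to `path`, then read the file back."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def recreate_check(base_dir):
    """Create dir and file, remove both, recreate them with new content."""
    dir_path = os.path.join(base_dir, "dir1")
    file_path = os.path.join(dir_path, "file1")

    os.mkdir(dir_path)
    first = write_then_read(file_path, "first contents\n")

    os.remove(file_path)
    os.rmdir(dir_path)

    os.mkdir(dir_path)
    # Different content on purpose: a stale cached entry or stale data
    # would show up as a mismatch on this second read.
    second = write_then_read(file_path, "second contents, different\n")
    return first, second
```

Using distinct contents means the comparison fails if the second read is served from anything left over from the first file.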
Features: pil4dfs Priority: 2 Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-13294/46/display/redirect
A few minor lint-related changes will be needed after the current build finishes (wrong-spelling-in-docstring: wrong spelling of the word 'Bashdcache' in a docstring). @ashleypittman Is there anything else we should change? Thank you!
@ashleypittman All tests finished in CI without issue in build 49 https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-13294/49/. Any more comments? I will push another commit to fix the pylint-related issue soon. Thank you!
Priority: 2 Test-tag: test_bash_dcache_pil4dfs Skip-unit-tests: true Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Priority: 2 Test-tag: test_bash_dcache_pil4dfs Skip-unit-tests: true Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
Priority: 2 Test-tag: test_bash_dcache_pil4dfs Skip-unit-tests: true Required-githooks: true Signed-off-by: Lei Huang <lei.huang@intel.com>
@ashleypittman @mchaarawi Do you have more feedback? Thank you!
        :avocado: tags=daosio,dfuse,il,dfs,pil4dfs
        :avocado: tags=DaosBuild,test_dfuse_daos_build_wt_pil4dfs
        """
        self.run_build_test("nocache", il_lib='libpil4dfs.so', run_on_vms=True)
I assume this doesn't work with anything other than nocache, as dfuse would cache negative dentries and then lots of things would fail.
I agree. That's what I found in local test.
Regular mode of the interception library (libpil4dfs) uses fake file descriptors (fd) allocated in user space. If some libc functions are not intercepted, applications could get errors or even crash due to the fake fd. Compatibility mode is introduced to alleviate such issues and increase compatibility. open, openat, opendir, etc. rely on dfuse to get real fds and return them to applications, giving better compatibility at the cost of some degraded performance. The environment variable "D_IL_COMPATIBLE=1" turns on compatible mode. Regular mode is the default if "D_IL_COMPATIBLE" is unset. Signed-off-by: Lei Huang <lei.huang@intel.com>
Regular mode of the interception library libpil4dfs uses fake file descriptors (fd) allocated in user space. If some libc functions are not intercepted, applications could get errors or even crash due to the fake fd. Compatibility mode is introduced to alleviate such issues and increase compatibility. open, openat, opendir, etc. rely on dfuse to get real fds and return them to applications, giving better compatibility at the cost of degraded performance.
The environment variable "D_IL_COMPATIBLE=1" turns on compatible mode. Regular mode is the default if "D_IL_COMPATIBLE" is unset.
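The switch described above amounts to a small truth table on one environment variable. The Python sketch below only illustrates those semantics and is not the library's actual parsing code: the helper name `compatible_mode_enabled` is hypothetical, the accepted values "1" and "true" come from the documentation change in this PR, and treating "true" case-insensitively is an assumption.

```python
def compatible_mode_enabled(environ):
    """Return True when D_IL_COMPATIBLE requests compatible mode."""
    value = environ.get("D_IL_COMPATIBLE")
    if value is None:
        # Unset: regular mode (fake fds) stays the default.
        return False
    # Assumption: surrounding whitespace and letter case are ignored.
    return value.strip().lower() in ("1", "true")
```

Any other value, including "0", leaves regular mode in effect in this sketch.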
Required-githooks: true
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: