DAOS-14408 common: enable NDCTL for DCPM #14371
base: master
Ticket title is 'NDCTL must be enabled to provide support for RAS functionality in PMDK' |
Test stage Functional Hardware Medium UCX Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/5/execution/node/886/log |
38bd529
to
96548d9
Compare
Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/7/execution/node/329/log |
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/7/execution/node/366/log |
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/7/execution/node/363/log |
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/7/execution/node/310/log |
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/8/execution/node/1176/log |
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/9/execution/node/1176/log |
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/9/execution/node/1417/log |
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/9/execution/node/1509/log |
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/10/execution/node/1152/log |
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/10/execution/node/1463/log |
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/10/execution/node/1417/log |
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/9/execution/node/1601/log |
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/10/execution/node/1602/log |
1fb603d
to
cd0ed94
Compare
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/11/execution/node/273/log |
Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/11/execution/node/367/log |
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/11/execution/node/343/log |
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/11/execution/node/383/log |
a55f41f
to
e32501e
Compare
Test stage Test RPMs on EL 8.6 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/15/execution/node/758/log |
e32501e
to
56669f2
Compare
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/17/execution/node/920/log |
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/18/execution/node/920/log |
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/18/execution/node/904/log |
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/19/execution/node/870/log |
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/16/execution/node/968/log |
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/20/execution/node/962/log |
@grom72 Don't you also need the |
Test-tag: DaosBuild
PR-repos: pmdk@PR-38:11
Priority: 2
Cancel-prev-build: false
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true
Skip-func-hw-test: true
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
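For readers unfamiliar with these commit pragmas, the sketch below shows one way to read them as `Key: value` trailers. It assumes one pragma per line, as in a Git commit message; `parse_pragmas` is a hypothetical helper written for illustration and is not the parser used by the DAOS CI.

```python
def parse_pragmas(message):
    """Collect 'Key: value' trailer lines from a commit message into a dict."""
    pragmas = {}
    for line in message.splitlines():
        # Split on the first colon only, so values such as
        # 'pmdk@PR-38:11' keep their own colons.
        key, sep, value = line.partition(":")
        # Accept the line only if it has a colon and a key without spaces.
        if sep and key and " " not in key:
            pragmas[key] = value.strip()
    return pragmas


# Sample trailers copied from the commit message in this conversation.
message = """\
Test-tag: DaosBuild
PR-repos: pmdk@PR-38:11
Skip-func-hw-test: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
"""
pragmas = parse_pragmas(message)
```

Here `pragmas["PR-repos"]` holds the string `pmdk@PR-38:11`, i.e. the PMDK packaging PR whose build is used for this run.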
Test stage Functional on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/124/execution/node/1310/log |
Test stage Functional on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/125/execution/node/538/log |
Test-tag: DaosBuild
PR-repos: pmdk@PR-38:11
Skip-list: test_dfuse_daos_build_wt_il:DAOS-16556
Priority: 2
Cancel-prev-build: false
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true
Skip-func-hw-test: true
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
Test stage Functional on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14371/126/execution/node/1232/log |
Test-tag: DaosBuild
PR-repos: pmdk@PR-38:11
Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2
Cancel-prev-build: false
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true
Skip-func-hw-test: true
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
Test stage Functional on Leap 15.5 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14371/127/testReport/ |
As you can see in your latest test run, DaosBuild.test_dfuse_daos_build_wt_il failed on Leap15 with:
06:40:54 DEBUG| src/common.inc:336: *** Please install libndctl-dev/libndctl-devel/ndctl-devel >= 63. Stop.
This is because of the concern I raised previously.
I suspect you also need to add this requirement (the lib{nd,dax}ctl-dev packages) to the Depends: clause of Package: daos-client-tests in the debian/control file. To be sure, there are lots of others missing there that have been added to daos.spec in the past without being added to debian/control, but since we have identified this particular one as missing, let's take the opportunity to add it in this PR so as not to increase technical debt. Indeed, it would be nice if we were doing even minimal testing on Ubuntu to help identify these kinds of gaps as they happen.
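A gap like this could be caught mechanically. The sketch below is only an illustration: `missing_depends` is a hypothetical helper, the sample stanza is made up and is not the real contents of debian/control, and the package names libndctl-dev and libdaxctl-dev are taken from the lib{nd,dax}ctl-dev shorthand in the comment above.

```python
import re

# Packages named in the review comment (expanded from lib{nd,dax}ctl-dev).
REQUIRED = {"libndctl-dev", "libdaxctl-dev"}


def missing_depends(control_text, package, required=REQUIRED):
    """Return the required packages absent from a stanza's Depends: field."""
    # Stanzas in a Debian control file are separated by blank lines.
    for stanza in re.split(r"\n\s*\n", control_text.strip()):
        fields, current = {}, None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and current:
                # Continuation line: append to the previous field.
                fields[current] += " " + line.strip()
            elif ":" in line:
                current, _, value = line.partition(":")
                fields[current] = value.strip()
        if fields.get("Package") == package:
            # Keep only the package name, dropping version constraints
            # such as '(>= 63)' and alternatives after '|'.
            names = {
                re.split(r"[\s(|]", dep.strip(), maxsplit=1)[0]
                for dep in fields.get("Depends", "").split(",")
                if dep.strip()
            }
            return set(required) - names
    raise ValueError(f"no stanza for package {package!r}")


# Hypothetical stanza, for illustration only.
sample = """\
Source: daos

Package: daos-client-tests
Architecture: any
Depends: python3, libndctl-dev (>= 63),
 ${misc:Depends}
"""
missing = missing_depends(sample, "daos-client-tests")
```

With this made-up stanza the check reports libdaxctl-dev as missing, which is the kind of mismatch against daos.spec described above.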
utils/scripts/install-leap15.sh
Outdated
Just curious why daxctl-devel is not needed on Leap15.
Any answer here @grom72?
Test-tag: DaosBuild
PR-repos: pmdk@PR-38:11
Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2
Cancel-prev-build: false
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true
Skip-func-hw-test: false
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
I wonder why we have to add PMem-related dependencies to daos-client-tests. DAOS client has nothing to do with PMem directly. |
Test-tag: DaosBuild
PR-repos: pmdk@PR-38:11
Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2
Cancel-prev-build: false
Skip-build-ubuntu20-rpm: false
Skip-build-leap15-rpm: true
Skip-build-leap15-icc: true
Skip-build-el9-rpm: true
Skip-nlt: true
Skip-unit-tests: true
Skip-func-test-vm: true
Skip-test-rpms: true
Skip-unit-test-memcheck: true
Skip-func-test: true
Skip-unit-tests: true
Allow-unstable-test: true
Skip-func-hw-test: true
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
Because one of the daos (client) tests is to clone and build daos itself on daos (i.e. dfuse IIUC). It builds server+client. So the dependencies there are not actually for the client but for the building of daos. I have more than once questioned the value of building daos specifically as a test of daos. While building something with a lot of files is probably a good test of daos/dfuse, I wonder if it needs to be as complicated as daos. Maybe the Linux kernel, for example, which is entirely self-contained in its own source tree and doesn't have a brazillion dependencies, is a better project. But that's just my opinion.
Does it mean that in the case of all client tests, we install dependencies required for DAOS build ( |
Yes, Yes, it's messy.
This is a valid question and point. Fortunately the tendency is to miss |
Maybe gatekeepers will disagree but I don't think it's valid to skip all of the build and test that is required to show the validity of the PR in what you expect to be the final commit before requesting landing. |
All requested changes have been made.
PR-repos: pmdk@PR-38:11
Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2
Allow-unstable-test: true
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2
Cancel-prev-build: false
Allow-unstable-test: true
Required-githooks: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
The last builds were only to confirm that the Leap RPMs and the Ubuntu package are properly built in a test environment. The full validation has been done in the following builds:
Anyhow, I have triggered full validation so that we have a consistent picture of validation in one place: |
This PR prepares DAOS to be used with NDCTL enabled in PMDK, which means:
- NDCTL must not be used when non-DCPM (simulated PMem) storage, i.e. storage class: "ram", is used. The PMEMOBJ_CONF=sds.at_create=0 env variable disables NDCTL features in the PMDK.
- The default ULT stack size must be at least 20KiB to avoid stack overuse by PMDK with NDCTL enabled, and it must be aligned with the Linux page size. The ABT_THREAD_STACKSIZE=20480 env variable is used to increase the default ULT stack size.
This modification shall not affect md-on-ssd mode as long as storage class: "ram" is used for the first tier in the storage configuration.
This change does not require any configuration changes to existing systems.
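The two environment settings described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `ult_stack_size` and `engine_env` are hypothetical names, the sketch assumes the stack size is set for every storage class, and only the variable names and values (PMEMOBJ_CONF=sds.at_create=0, ABT_THREAD_STACKSIZE=20480, the 20KiB minimum, the "ram" class) come from the description.

```python
import mmap

# 20 KiB lower bound for the ULT stack, as stated in the PR description.
MIN_ULT_STACK = 20 * 1024


def ult_stack_size(min_bytes=MIN_ULT_STACK, page_size=mmap.PAGESIZE):
    """Round min_bytes up to a whole number of memory pages."""
    pages = -(-min_bytes // page_size)  # ceiling division
    return pages * page_size


def engine_env(storage_class, page_size=mmap.PAGESIZE):
    """Build the environment variables discussed in the description.

    Hypothetical helper: where DAOS actually sets these variables is
    not shown in this conversation.
    """
    env = {"ABT_THREAD_STACKSIZE": str(ult_stack_size(page_size=page_size))}
    if storage_class == "ram":
        # Simulated PMem: switch off the NDCTL-backed features in PMDK.
        env["PMEMOBJ_CONF"] = "sds.at_create=0"
    return env


ram_env = engine_env("ram", page_size=4096)
```

With 4 KiB pages, 20 KiB is exactly five pages, so the rounded value equals the 20480 quoted in the description; on a system with larger pages the same rounding would give one full page instead.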
The new PMDK package with NDCTL enabled (daos-stack/pmdk#38) will be delivered as soon as this PR is merged and backported to stable/2.6.
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: