DAOS-15002 test: use default pool svc for rebuild tests #13648
57c8958 to 5b35344
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13648/2/execution/node/789/log
a0c39e2 to 8299aac
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13648/5/execution/node/789/log
b376689 to 7791a24
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13648/7/execution/node/812/log
When possible, use the system default of svc_rf * 2 + 1 for rebuild tests instead of hardcoding to inappropriate values. Otherwise, update some tests with svcn: 5 to account for killing a rank.
Test-tag: RbldCascadingFailures RbldDeleteObjects RbldReadArrayTest RbldContRfTest RbldWidelyStriped RbldWithIOR EcodOfflineRebuildSingle EcodOnlineMultFail EcodOfflineRebuild EcodOfflineRebuildSingle EcodOnlineRebuild EcodDisabledRebuildSingle EcodDisabledRebuild ServerRankFailure ContRfEnforce
Test-repeat: 1
Skip-unit-tests: true
Skip-fault-injection-test: true
Required-githooks: true
Signed-off-by: Dalton Bohning <dalton.bohning@intel.com>
7791a24 to f29cc34
Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13648/8/testReport/
Got a clean run with |
Thanks! Dalton
Thanks, Dalton.
svcn: 3
control_method: dmg
size: 1G
svcn: 7 # To match number of servers
The logic should be: "This test kills 3 engines." => "It needs at least 3 * 2 + 1 = 7 PS replicas to avoid losing the PS." => "It needs at least 7 engines for 7 PS replicas."
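The sizing chain in this comment can be sketched in Python. This is only an illustration, assuming the PS needs a majority of its replicas alive and that every killed engine may host a PS replica; the function names are hypothetical and are not part of the DAOS test framework.

```python
def min_ps_replicas(engines_killed: int) -> int:
    """Smallest PS replica count that still has a majority after the kills.

    Assumption: majority quorum, worst case where every killed engine
    hosts a PS replica.
    """
    if engines_killed < 0:
        raise ValueError("engines_killed must be non-negative")
    return engines_killed * 2 + 1


def min_engines(engines_killed: int) -> int:
    """Each PS replica lives on its own engine, so engines >= replicas."""
    return min_ps_replicas(engines_killed)


if __name__ == "__main__":
    killed = 3
    print(
        f"kill {killed} engines: svcn >= {min_ps_replicas(killed)}, "
        f"engines >= {min_engines(killed)}"
    )
```

For a test that kills 3 engines, both helpers return 7, which matches the reasoning in the comment: the kill count drives the replica count, and the replica count drives the engine count.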
Right. This test has some other issues I need to fix as part of https://daosio.atlassian.net/browse/DAOS-15074.
I want to make this more flexible then.
@daltonbohning, sorry for the ambiguity. I mean the comment "to match number of servers" is not accurate; it should be something along the lines of "to allow killing 3 engines". That is, the number of PS replicas depends on the number of engines the test kills, and the number of engines depends on the number of PS replicas, not the other way around.
No worries, I understand! I also set the svcn to exactly the number of servers so all ranks will be svc ranks, making this test a little more deterministic until I fix it properly. E.g. there are some cases where I think the test needs to look at which ranks are svc ranks before killing.
Good! I think "7 engines and 7 PS replicas" is good enough, for there's no external interface for knowing the current, exact set of PS replicas.
Test-tag: RbldCascadingFailures RbldDeleteObjects RbldReadArrayTest RbldContRfTest RbldWidelyStriped RbldWithIOR EcodOfflineRebuildSingle EcodOnlineMultFail EcodOfflineRebuild EcodOfflineRebuildSingle EcodOnlineRebuild EcodDisabledRebuildSingle EcodDisabledRebuild ServerRankFailure ContRfEnforce
Test-repeat: 1
Skip-unit-tests: true
Skip-fault-injection-test: true
When possible, use the system default of svc_rf * 2 + 1 for rebuild tests instead of hardcoding to inappropriate values. Otherwise, update some tests with svcn: 5 to account for killing a rank.
Required-githooks: true
Signed-off-by: Dalton Bohning <dalton.bohning@intel.com>
When possible, use the system default of svc_rf * 2 + 1 for rebuild tests instead of hardcoding to inappropriate values.
Otherwise, update some tests with svcn: 5 to account for killing a rank.
Test-tag: RbldCascadingFailures RbldDeleteObjects RbldReadArrayTest RbldContRfTest RbldWidelyStriped RbldWithIOR
Test-repeat: 2
Skip-unit-tests: true
Skip-fault-injection-test: true
Required-githooks: true
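A rough Python sketch of the two rules in this description: deriving the replica count from svc_rf, and checking whether a hardcoded svcn still has a majority after a test kills ranks. It assumes a majority quorum and the worst case in which every killed rank is a svc rank. Both helper names are hypothetical, and no particular default value of svc_rf is implied.

```python
def svcn_from_rf(svc_rf: int) -> int:
    """Replica count implied by a redundancy factor: svc_rf * 2 + 1."""
    if svc_rf < 0:
        raise ValueError("svc_rf must be non-negative")
    return svc_rf * 2 + 1


def survives_kills(svcn: int, ranks_killed: int) -> bool:
    """True if a majority of svc replicas remains after the kills.

    Assumption: every killed rank was a svc rank (worst case).
    """
    quorum = svcn // 2 + 1
    return svcn - ranks_killed >= quorum


if __name__ == "__main__":
    for svcn, killed in [(3, 1), (3, 2), (5, 2), (7, 3)]:
        print(f"svcn={svcn}, killed={killed}: ok={survives_kills(svcn, killed)}")
```

Under these assumptions, a check like this could flag a test YAML whose hardcoded svcn is too small for the number of ranks the test kills.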
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: