-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Step Metadata Update on Index Rollover Timeout #1174
Step Metadata Update on Index Rollover Timeout #1174
Conversation
Signed-off-by: Kaushik <harycash@amazon.com>
.copy(actionMetaData = currentActionMetaData?.copy(failed = true), info = info), | ||
) | ||
|
||
val updatedMetaData = managedIndexMetaData.copy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can rename this to updatedIndexMetaData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I've made the change and updated the PR.
thanks @harshitakaushik-dev for raising this! |
@@ -56,6 +56,7 @@ abstract class Step(val name: String, val isSafeToDisableOn: Boolean = true) { | |||
CONDITION_NOT_MET("condition_not_met"), | |||
FAILED("failed"), | |||
COMPLETED("completed"), | |||
TIMED_OUT("timed_out"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a new field could cause problem when cluster having old and new version of code during upgrade. Maybe just use the "failed" state is good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @bowenlan-amzn for taking a look. Not sure if I fully understand this. What could be that scenario where adding a new step state can fail. I'm asking in case we need to add a new step state in the future.
PS: I'm okay with failed as well for this case. And in the failure message, we would have timed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my thinking: this new field of Step status is saved in the metadata document of ISM system index. When user do a explain API using the old node with old software, it cannot understand this new field value so probably fail.
So the impact is not big, only during the upgrading when cluster is mixed with old and new software.
I just found this potential problem is already safened, new software won't be used if old software exists in cluster
Lines 229 to 232 in 4d8ef69
if (skipExecFlag.flag) { | |
logger.info("Cluster still has nodes running old version ISM plugin, skip execution on new nodes until all nodes upgraded") | |
return@launch | |
} |
So no problem now 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is awesome @bowenlan-amzn
thanks for checking
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1174 +/- ##
============================================
- Coverage 75.34% 66.29% -9.05%
+ Complexity 2812 2406 -406
============================================
Files 367 367
Lines 17038 17039 +1
Branches 2370 2372 +2
============================================
- Hits 12837 11296 -1541
- Misses 2901 4596 +1695
+ Partials 1300 1147 -153 ☔ View full report in Codecov by Sentry. |
Signed-off-by: harycash <harycash@amazon.com>
@bowenlan-amzn would you please do the final reviews so we can merge it. Thank you! |
Thanks for the contribution! @harshitakaushik-dev |
d4ee795
into
opensearch-project:main
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/index-management/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/index-management/backport-2.x
# Create a new branch
git switch --create backport/backport-1174-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d4ee795e22f4490b78662f171f62d566a81c1abc
# Push it to GitHub
git push --set-upstream origin backport/backport-1174-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/index-management/backport-2.x Then, create a pull request where the |
@harshitakaushik-dev the backport failed, could you backport manually to 2.x |
Co-authored-by: Harshita Kaushik <harycash@amazon.com>
Co-authored-by: Harshita Kaushik <harycash@amazon.com> Signed-off-by: harycash <harycash@amazon.com>
Issue #1132 :
Issue Description:
This pull request addresses a bug in the Index Management plugin where the step metadata fields within IndexActionMetadata are not updated when a rollover action times out. Currently, only the info.message field is updated to indicate a timeout, which can be misleading if the previous run did not meet the rollover conditions.
Expected Behaviour:
Upon a timeout during the rollover action, the step.status and step.name fields within IndexActionMetadata should be updated to reflect the timeout event. This provides a clearer picture of the action's execution and avoids confusion regarding successful completion.
CheckList:
Proposed Changes:
Testing:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.