feat: recover from configuration failures #151
base: main
Unit Tests Coverage Report
Generated by 🐒 cobertura-action against d88c097
private CDAConfiguration cdaConfiguration;
private volatile boolean configurationErrored = false;
I'm not sure if this is necessary at all, but for future reference, you could just synchronize on configurationErrored instead, and then you also don't need volatile.
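For reference, a minimal sketch of what that suggestion could look like. It is not code from this PR: the class and method names are hypothetical. Since a primitive boolean cannot itself be used as a monitor in Java, the sketch assumes a separate lock object guards the flag. As long as every read and write happens while holding that lock, the flag does not need to be volatile.

```java
// Hypothetical holder for the error flag; names are illustrative, not from the PR.
final class ConfigErrorFlag {
    private final Object configLock = new Object();

    // Not volatile: every access below happens while holding configLock,
    // which already provides the required visibility guarantees.
    private boolean configurationErrored = false;

    /** Records that the configuration failed to load. */
    void markErrored() {
        synchronized (configLock) {
            configurationErrored = true;
        }
    }

    /** Atomically reads and clears the flag, returning the previous value. */
    boolean clearIfErrored() {
        synchronized (configLock) {
            boolean wasErrored = configurationErrored;
            configurationErrored = false;
            return wasErrored;
        }
    }
}
```

The guarantee only holds if no code path touches the field outside the lock; a single unsynchronized read would bring back the need for volatile.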
// service to error, but we don't want to check again until the nucleus has run the remediation steps (when the
// service errors, the nucleus will try to call shutdown -> install -> startup).
Can you help me understand this? My mental model is that we shouldn't need to care about what Nucleus is doing, and that we can rely entirely on configuration updates. If we are broken, and a configuration change fixes us, then request reinstall? Are there scenarios where that will not work?
Right, it should be possible to make this as simple as, on config change, if state is broken, then request reinstall.
@jbutler and I chatted about this offline - we are going to dig deeper and see if it is required
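To make the simpler model from this thread concrete, here is a rough sketch of "on config change, if state is broken, then request reinstall". It is an illustration only, not the PR's implementation: `ReinstallRequester` is a hypothetical stand-in for whatever mechanism asks Nucleus to reinstall the service, and `onConfigurationError` is an assumed hook for the failure path. Whether this is sufficient in practice is exactly the open question above.

```java
// Illustrative sketch only; all type and method names here are assumptions.
final class ReinstallOnConfigChange {

    /** Hypothetical stand-in for the call that asks Nucleus to reinstall the service. */
    interface ReinstallRequester {
        void requestReinstall();
    }

    private final Object configLock = new Object();
    private boolean configurationErrored = false; // guarded by configLock
    private final ReinstallRequester requester;

    ReinstallOnConfigChange(ReinstallRequester requester) {
        this.requester = requester;
    }

    /** Assumed hook: called when loading the configuration fails. */
    void onConfigurationError() {
        synchronized (configLock) {
            configurationErrored = true;
        }
    }

    /** Called on every configuration update. */
    void onConfigurationChanged() {
        boolean wasBroken;
        synchronized (configLock) {
            wasBroken = configurationErrored;
            configurationErrored = false;
        }
        // Only a change that arrives while we are broken triggers a reinstall.
        // The request is made outside the lock to avoid holding it during a callback.
        if (wasBroken) {
            requester.requestReinstall();
        }
    }
}
```

With this shape the component never reasons about which lifecycle step Nucleus is currently running; it reacts only to configuration updates.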
synchronized (configLock) {
    if (configurationErrored) {
        configurationErrored = false;
        onConfigurationChanged();
    }
}
I thought the issue was that if we are broken, then Nucleus doesn't call startup?
Description of changes:
Enables CDA to recover from certificateAuthority configuration failures without requiring a manual restart.
Why is this change necessary:
Without this, when a customer provides a bad config, the component errors and cannot recover without being uninstalled and reinstalled.
How was this change tested:
Integration test