Fixed determine/process reboot-cause service dependency #17327
Closed
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Why I did it
Fixes #16990 for 202205 branch
Note: This PR is a 202205 adaptation of sonic-net/sonic-host-services#88 fix for master. The fix in master removes the dependency of determine-reboot-cause on database service. To minimize the effects on the dependency tree, this PR is modified to handle the determine-reboot-cause restart ability with script-side changes.
determine-reboot-cause and process-reboot-cause service does not start If the database service fails to restart in the first attempt. Even if the Database service succeeds in the next attempt, these reboot-cause services do not start.
The process-reboot-cause service also does not restart if the docker or database service restarts, which leads to an empty reboot-cause history
deploy-mg from sonic-mgmt also triggers the docker service restart. The restart of the docker service caused the issue stated in 2 above. The docker restart also triggers determine-reboot-cause to restart which creates an additional reboot-cause file in history and modifies the last reboot-cause.
This PR fixes these issues by making both processes start again when dependency meets after dependency failure, making both processes restart when the database service restarts, and preventing duplicate processing of the last reboot reason.
Work item tracking
How I did it
How to verify it
On single asic pizza box:
On Chassis:
Let database service on LC fail the first time. determine-reboot-cause and process-reboot-cause would fail to start due to dependency failure
start database-chassis on Supervisor. Database service on LC should now start successfully.
Verify determine-reboot-cause and process-reboot-cause also starts
Verify show reboot-cause history output
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)