Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use upholds to ensure determine service startes after process service. #100

Closed
wants to merge 1 commit into from

Conversation

xincunli-sonic
Copy link
Contributor

@xincunli-sonic xincunli-sonic commented Jan 6, 2024

Signed-off-by: Xincun Li xincun.li@microsoft.com

Why I did it

To enhance system stability and optimize inter-service dependencies, I removed the determine-reboot-cause's dependency on database.service and improved process-reboot-cause's dependency on database.service. This was achieved using the upholds feature in Debian 12#249, ensuring a more reliable and efficient operational environment.

MSFT ADO: 26209780

How I did it

  1. Remove determine-reboot-cause's dependency with database.service.
  2. Improves process-reboot-cause dependency with database.service by upholds feature in Debian 12#249

https://manpages.debian.org/experimental/systemd/systemd.unit.5.en.html

Upholds=
Configures dependencies similar to Wants=, but as long as this unit is up, all units listed in Upholds= are started whenever found to be inactive or failed, and no job is queued for them. While a Wants= dependency on another unit has a one-time effect when this units started, a Upholds= dependency on it has a continuous effect, constantly restarting the unit if necessary. This is an alternative to the Restart= setting of service units, to ensure they are kept running whatever happens. The restart happens without delay, and usual per-unit rate-limit applies.

When Upholds=b.service is used on a.service, this dependency will show as UpheldBy=a.service in the property listing of b.service.

Added in version 249.

How to verify it
Steps:

  1. Without changes: Show reboot-cause history, check output.
  2. Apply the change
  3. Restart db docker, restart process reboot cause.
  4. Verify the Show reboot-cause history, check output.
  5. list dependency output.

admin@str3-msn4600c-acs-05:/etc/systemd/system$ sudo systemctl list-dependencies process-reboot-cause
process-reboot-cause.service
● ├─database.service
● ├─determine-reboot-cause.service
● ├─system.slice
● └─sysinit.target
● ├─apparmor.service
● ├─dev-hugepages.mount
● ├─dev-mqueue.mount
● ├─haveged.service
● ├─kmod-static-nodes.service
● ├─proc-sys-fs-binfmt_misc.automount
● ├─sys-fs-fuse-connections.mount
● ├─sys-kernel-config.mount
● ├─sys-kernel-debug.mount
● ├─sys-kernel-tracing.mount
● ├─systemd-ask-password-console.path
● ├─systemd-binfmt.service
○ ├─systemd-firstboot.service
● ├─systemd-journal-flush.service
● ├─systemd-journald.service
○ ├─systemd-machine-id-commit.service
● ├─systemd-modules-load.service
○ ├─systemd-pcrphase-sysinit.service
○ ├─systemd-pcrphase.service
○ ├─systemd-pstore.service
● ├─systemd-random-seed.service
○ ├─systemd-repart.service
● ├─systemd-sysctl.service
● ├─systemd-sysusers.service
● ├─systemd-tmpfiles-setup-dev.service
● ├─systemd-tmpfiles-setup.service
● ├─systemd-udev-trigger.service
● ├─systemd-udevd.service
● ├─systemd-update-utmp.service
● ├─cryptsetup.target
● ├─integritysetup.target
● ├─local-fs.target
● │ └─systemd-remount-fs.service
● ├─swap.target
● └─veritysetup.target

After=database.service determine-reboot-cause.service
Upholds=database.service determine-reboot-cause.service
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding Upholds= correctly, this would mean that if database.service fails to start up for whatever reason, then this service can still start, since there's no longer a hard dependency on the database container. However, it looks like process-reboot-cause writes to state DB, which would then fail, and then cause this service to fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is expected, If the database service did not start correctly, the process-reboot-cause will not start correctly which will failing in read db as expected, so it will not persist any cause into db until next reboot and database service started correctly.

@prgeor
Copy link
Contributor

prgeor commented Mar 29, 2024

@xincunli-sonic PR description is incorrect. The issue need to be fixed is #88. Description should be "Restart process-reboot-cause and determine-reboot-cause services automatically whenever database service restarts"

Consider a case where lets say database service fails to start for whatever reason, now determine-reboot cause and process-reboot-cause does not start because database service is not started i.e process-reboot-cause and determine-reboot case remains in dependency failed state. After sometime, database service is restarted, then process-reboot-cause and determine-reboot-cause must start automatically.

@prgeor prgeor self-requested a review March 29, 2024 14:09
Copy link
Contributor

@prgeor prgeor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xincunli-sonic can you make requested changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants