[armhf][Nokia-7215] Enable Watchdog service #16612
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
#!/usr/bin/python | ||
|
||
from sonic_platform.chassis import Chassis | ||
from sonic_py_common import logger | ||
import time | ||
import os | ||
import signal | ||
import sys | ||
|
||
|
||
TIMEOUT=170 | ||
KEEPALIVE=55 | ||
sonic_logger = logger.Logger('Watchdog') | ||
sonic_logger.set_min_log_priority_info() | ||
time.sleep(60) | ||
chassis = Chassis() | ||
watchdog = chassis.get_watchdog() | ||
|
||
def stopWdtService(signal, frame): | ||
watchdog._disablewatchdog() | ||
sonic_logger.log_notice("CPUWDT Disabled: watchdog armed=%s" % watchdog.is_armed() ) | ||
sys.exit() | ||
|
||
def main(): | ||
|
||
signal.signal(signal.SIGHUP, signal.SIG_IGN) | ||
signal.signal(signal.SIGINT, stopWdtService) | ||
signal.signal(signal.SIGTERM, stopWdtService) | ||
@Pavan-Nokia can we keep the watchdog running during reboot so that we never end up hung if the kernel hangs during reboot? See https://github.com/sonic-net/sonic-buildimage/blob/master/platform/broadcom/sonic-platform-modules-cel/haliburton/script/cpu_wdt#L50
We cannot keep the watchdog active during reboot on the 7215-IXS-T1. The watchdog circuit is reset when the system is rebooted, so we have to arm it again when we come back up after reboot.
@Pavan-Nokia then what is the point of enabling the watchdog? I understand that if the system hangs AFTER the watchdog is enabled, it works as expected. But consider a case where the system is booting up after a reboot and hangs before the watchdog is enabled: the system stays hung and the watchdog cannot bail it out.
@prgeor You are right, we cannot bail out if there is a hang before the watchdog is enabled.
|
||
|
||
watchdog.arm(TIMEOUT) | ||
|
||
sonic_logger.log_notice("CPUWDT Enabled: watchdog armed=%s" % watchdog.is_armed() ) | ||
|
||
|
||
while True: | ||
time.sleep(KEEPALIVE) | ||
watchdog._keepalive() | ||
|
||
sonic_logger.log_info("CPUWDT keepalive") | ||
done | ||
|
||
stopWdtService | ||
|
||
return | ||
|
||
|
||
if __name__ == '__main__': | ||
main() |
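
For readers following the script above, here is a minimal sketch of the same arm-then-keepalive pattern that can run without the platform package. `FakeWatchdog`, `service_loop`, `make_stop_handler`, and the `max_kicks` and `sleep` parameters are hypothetical names introduced for illustration; only `TIMEOUT`, `KEEPALIVE`, and the method names `arm`, `is_armed`, `_keepalive`, and `_disablewatchdog` come from the diff. The sketch leaves out the `done`, bare `stopWdtService`, and `return` lines that follow the `while True` loop in the diff, since that loop has no `break` and they are never reached.

```python
import signal
import sys
import time

TIMEOUT = 170    # watchdog timeout in seconds, as in the diff
KEEPALIVE = 55   # keepalive interval in seconds, as in the diff


class FakeWatchdog:
    """Hypothetical stand-in for chassis.get_watchdog(); not the Nokia driver."""

    def __init__(self):
        self.armed = False
        self.kicks = 0

    def arm(self, seconds):
        self.armed = True
        return seconds

    def is_armed(self):
        return self.armed

    def _keepalive(self):
        self.kicks += 1

    def _disablewatchdog(self):
        self.armed = False


def make_stop_handler(watchdog):
    """Build a signal handler that disables the watchdog and exits."""
    def stop(signum, frame):
        watchdog._disablewatchdog()
        sys.exit(0)
    return stop


def service_loop(watchdog, sleep=time.sleep, max_kicks=None):
    """Arm the watchdog, then kick it every KEEPALIVE seconds.

    max_kicks is a test hook (an assumption of this sketch);
    None means loop forever, as the service in the diff does.
    """
    watchdog.arm(TIMEOUT)
    kicks = 0
    while max_kicks is None or kicks < max_kicks:
        sleep(KEEPALIVE)
        watchdog._keepalive()
        kicks += 1
    return kicks
```

In a real service the handler would be registered with `signal.signal(signal.SIGINT, stop)` and `signal.signal(signal.SIGTERM, stop)` before calling `service_loop(watchdog)`. With these constants, three keepalive intervals (165 s) fit inside one 170 s timeout, so up to two consecutive kicks can be missed before the watchdog fires.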
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
[Unit] | ||
Description=CPU WDT | ||
After=nokia-7215init.service | ||
[Service] | ||
ExecStart=/usr/local/bin/cpu_wdt.py | ||
|
||
[Install] | ||
WantedBy=multi-user.target |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
nokia-7215_plt_setup.sh usr/sbin | ||
7215/scripts/nokia-7215init.sh usr/local/bin | ||
7215/scripts/cpu_wdt.py usr/local/bin | ||
7215/service/nokia-7215init.service etc/systemd/system | ||
7215/service/cpu_wdt.service etc/systemd/system | ||
7215/service/fstrim.timer/timer-override.conf /lib/systemd/system/fstrim.timer.d | ||
7215/sonic_platform-1.0-py3-none-any.whl usr/share/sonic/device/armhf-nokia_ixs7215_52x-r0 | ||
inband_mgmt.sh etc/ |
@Pavan-Nokia why is this sleep needed here?
SONiC has "watchdog-control.service", which is designed to disable the watchdog on every boot. This sleep is added so that the watchdog is enabled after that service has completed.
Using the "After" keyword in the service file does not help either, as "After" only ensures that our service is started after watchdog-control.service is started; it does not ensure that watchdog-control.service has completed before ours.
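
One possible alternative to a fixed 60-second sleep, sketched here as an illustration and not as part of this PR, is to poll systemd until the other unit has left the `activating` state. The helper names `unit_active_state` and `wait_until_finished`, the timeout, and the poll interval are all assumptions. The sketch also assumes the unit has already been started (for example via `After=` ordering), because a unit that has not started yet reports `inactive` too and would be mistaken for a finished one.

```python
import subprocess
import time


def unit_active_state(unit):
    """Return systemd's ActiveState for `unit`, e.g. 'activating' or 'inactive'."""
    result = subprocess.run(
        ["systemctl", "show", "--property=ActiveState", "--value", unit],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_until_finished(unit, timeout=60.0, poll=2.0,
                        get_state=unit_active_state, sleep=time.sleep):
    """Wait until `unit` is no longer 'activating'.

    Returns True once the unit reports any other state, or False if
    `timeout` seconds pass first. Any state other than 'activating'
    (including an empty reply) is treated as finished, which is a
    simplification made by this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_state(unit) != "activating":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll)
```

A caller could run `wait_until_finished("watchdog-control.service")` before arming the watchdog and fall back to arming anyway if it returns `False`, so the wait is bounded like the original sleep.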