Change the default number of Kernel dumps to 3 #20647
base: master
@prgeor, please review.
@abdosi, please check.
@saiarcot895: can you please review this?
This doesn't appear to be needed. On a device where kdump is disabled, running `sudo config kdump enable` modifies `/etc/default/kdump-tools` and sets `KDUMP_NUM_DUMPS` to `3`.
The above assumes that kdump is explicitly enabled via the CLI. We have enabled kdump by default in this PR for Cisco platforms.
@saiarcot895 Please check the response above.
@bmridul That's because hostcfgd is not applying the changes from the default state to the runtime configuration. Please modify hostcfgd to add proper support for this.
Ack. I will check.
Why I did it
Currently, there is no limit on the number of kernel dumps captured on the system. This leads to excessive disk space usage if the system encounters many kernel crashes (e.g., during sonic-mgmt test suite runs).
According to the HLD, the default number of kernel dumps should be 3. However, this default is not set in the code.
https://github.com/sonic-net/SONiC/blob/master/doc/kdump/SONiC-kdump.md#config-kdump-num_dumps-number
This PR provides the fix.
Work item tracking
How I did it
Set the number of kernel dumps to 3 in /etc/default/kdump-tools
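As an illustration only (the PR's actual diff is not shown here), the change amounts to making sure `/etc/default/kdump-tools` contains `KDUMP_NUM_DUMPS=3`. The sketch below works on the file's text. The helper name `set_num_dumps` is hypothetical, as is the assumption that the file may already hold a commented-out or stale assignment.

```python
import re

# Hypothetical helper, not code from this PR: force KDUMP_NUM_DUMPS to a
# given value in the text of /etc/default/kdump-tools.
def set_num_dumps(config_text: str, num_dumps: int = 3) -> str:
    new_line = f"KDUMP_NUM_DUMPS={num_dumps}"
    # Match an existing assignment on its own line, commented out or not.
    pattern = re.compile(r"^#?[ \t]*KDUMP_NUM_DUMPS=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # Rewrite only the first occurrence and leave everything else alone.
        return pattern.sub(new_line, config_text, count=1)
    # No assignment present: append one, adding a newline separator if needed.
    sep = "" if config_text.endswith("\n") or not config_text else "\n"
    return f"{config_text}{sep}{new_line}\n"
```

In a real build this would more likely be a one-line edit to the shipped defaults file; the function form is used here only so the behavior can be checked in isolation.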
How to verify it
UT Log:
root@sonic:/home/cisco# show reboot h
Name Cause Time User Comment
2024_08_26_19_35_54 Kernel Panic Mon Aug 26 07:32:18 PM UTC 2024 N/A N/A
2024_08_26_19_29_47 Kernel Panic Mon Aug 26 07:26:16 PM UTC 2024 N/A N/A
2024_08_26_19_04_03 Kernel Panic Mon Aug 26 07:00:36 PM UTC 2024 N/A N/A
2024_08_26_18_54_39 Kernel Panic Mon Aug 26 06:51:35 PM UTC 2024 N/A N/A
2024_08_26_18_43_13 reboot Mon Aug 26 06:36:53 PM UTC 2024 cisco N/A
...
root@sonic:/home/cisco# show kdump files
Kernel core dump files Kernel dmesg files
/var/crash/202408261932/kdump.202408261932 /var/crash/202408261932/dmesg.202408261932
/var/crash/202408261926/kdump.202408261926 /var/crash/202408261926/dmesg.202408261926
/var/crash/202408261900/kdump.202408261900 /var/crash/202408261900/dmesg.202408261900
root@sonic:/home/cisco# ls /var/crash/
202408261900 202408261926 202408261932 kdump_lock kexec_cmd
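The log above shows four kernel panics but only three retained dump directories, which is the expected outcome. A check of that kind could be scripted roughly as follows. This is a minimal sketch: the function names are hypothetical, and it assumes dump directories are named with a 12-digit timestamp, as in the listing above.

```python
import re

# Assumption: each dump lives in a directory named YYYYMMDDHHMM,
# e.g. 202408261932, as seen in the /var/crash listing above.
DUMP_DIR_RE = re.compile(r"\d{12}")

def dump_dirs(entries):
    """Return the timestamped dump directory names, oldest first."""
    return sorted(e for e in entries if DUMP_DIR_RE.fullmatch(e))

def within_limit(entries, limit=3):
    """True if no more than `limit` dump directories are present."""
    return len(dump_dirs(entries)) <= limit

# Entries copied from the `ls /var/crash/` output above; on a device,
# os.listdir("/var/crash") would supply the same kind of list.
entries = ["202408261900", "202408261926", "202408261932",
           "kdump_lock", "kexec_cmd"]
print(dump_dirs(entries), within_limit(entries))
```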
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Set the number of kernel dumps to 3 in /etc/default/kdump-tools
Link to config_db schema for YANG module changes
N/A
A picture of a cute animal (not mandatory but encouraged)