-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VOQ][saidump] Install rdbtools into the docker base related containers. #16466
Conversation
• Saidump for DNX-SAI sonic-net#13561 Solution and modification: To use the redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. (1) Updated sonic-buildimage/build_debian.sh, to install Python library rdbtools into the host. (2) Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. (3) Add a new script file: files/scripts/saidump.sh, to do the below steps For each ASIC0, such as ASIC0, 1. Save the Redis data. sudo sonic-db-cli -n asic$1 SAVE > /dev/null 2. Move dump files to /var/run/redisX/ docker exec database$1 sh -c "mv /var/lib/redis/dump.rdb /var/run/redis$1/" 3. Run rdb command to convert the dump files into JSON files sudo python /usr/local/bin/rdb --command json /var/run/redis$1/dump.rdb | sudo tee /var/run/redis$1/dump.json > /dev/null 4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump result in standard output. docker exec syncd$1 sh -c "saidump -r /var/run/redis$1/dump.json" 5. clear sudo rm -f /var/run/redis$1/dump.rdb sudo rm -f /var/run/redis$1/dump.json (4) Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump, replace saidump with saidump.sh
• Saidump for DNX-SAI sonic-net#13561 Solution and modification: To use the redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. (1) Updated sonic-buildimage/build_debian.sh, to install Python library rdbtools into the host. (2) Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. (3) Add a new script file: files/scripts/saidump.sh, to do the below steps For each ASIC0, such as ASIC0, 1. Save the Redis data. sudo sonic-db-cli -n asic$1 SAVE > /dev/null 2. Move dump files to /var/run/redisX/ docker exec database$1 sh -c "mv /var/lib/redis/dump.rdb /var/run/redis$1/" 3. Run rdb command to convert the dump files into JSON files sudo python /usr/local/bin/rdb --command json /var/run/redis$1/dump.rdb | sudo tee /var/run/redis$1/dump.json > /dev/null 4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump result in standard output. docker exec syncd$1 sh -c "saidump -r /var/run/redis$1/dump.json" 5. clear sudo rm -f /var/run/redis$1/dump.rdb sudo rm -f /var/run/redis$1/dump.json (4) Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump, replace saidump with saidump.sh
files/scripts/saidump.sh
Outdated
@@ -0,0 +1,39 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move this file to syncd/scripts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Lguohan, there is an issue that's hard for putting saidump.sh in the syncd docker. Because the below command is not available inside the syncd docker due to the isolation between docker containers. I.e., the dump.rdb generated by the Redis save is invisible in the syncd container. So, one reasonable solution is to be leaving it in the host. The host has access to operate the dump.rdb.
debug "saidump.sh: [2] Move dump.rdb to /var/run/redis$1/ in container database$1."
docker exec database$1 sh -c "mv /var/lib/redis/dump.rdb /var/run/redis$1/"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lguohan , please help to review it again. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the explanation. I feel we should still move this saidump.sh to syncd docker, to address the problem you listed, it is better to mount a shared folder between database docker and syncd docker, so that you can see the rdb file in syncd docker once dumped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments! I will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i should have be more clear. when i say syncd docker, i really means sairedis repo, so that saidump.sh will be part of the sairedis package.
@anamehra for viz. Please check |
files/scripts/saidump.sh
Outdated
|
||
save_saidump_by_rdb() { | ||
debug "saidump.sh: [1] sonic-db-cli -n asic$1 SAVE." | ||
sudo sonic-db-cli -n asic$1 SAVE > /dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this command sonic-db-cli -n asic$1
may not work for single asic linecard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix it.
files/scripts/saidump.sh
Outdated
NUM_ASICS=`python -c 'from sonic_py_common.multi_asic import get_num_asics; print(get_num_asics())'` | ||
|
||
|
||
if (( $# == 0 && $NUM_ASICS == 1 )); then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are sudo
statements in save_saidump_by_rdb
. When a user with only Read-only privileges executes show tech-support
these will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command "show techsupport" calls "sudo generate_dump -v -t 5".
By checking the below related commands' mode bits, we can see they are all have "other user executable bit".
And I run the command "show techsupport" in a user with only Read-only privileges, then get correct saidump results.
$ ls `which generate_dump` -l
-rwxr-xr-x 1 root root 66028 Sep 19 14:32 /usr/local/bin/generate_dump
$ ls `which saidump.sh` -l
-rwxr-xr-x 1 root root 1752 Sep 21 15:28 /usr/local/bin/saidump.sh
$ ls `which rm` -l
-rwxr-xr-x 1 root root 72704 Sep 24 2020 /usr/bin/rm
So, finally, remove all sudo for concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arlakshm , please review, thanks.
Install rdbtools into syncd docker and add saidump.sh into host. * [saidump] • Saidump for DNX-SAI sonic-net#13561 Solution and modification: To use the Redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. (1) Updated platform/broadcom/docker-syncd-brcm-dnx/Dockerfile.j2, install Python library rdbtools into the syncd containter. (2) Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. (3) Updated sonic-buildimage/build_debian.sh, to add a new script file: files/scripts/saidump.sh into the host. This shell file does the below steps: For each ASIC0, such as ASIC0, 1. Save the Redis data. sudo sonic-db-cli -n asic$1 SAVE > /dev/null 2. Move dump files to /var/run/redisX/ docker exec database$1 sh -c "mv /var/lib/redis/dump.rdb /var/run/redis$1/" 3. Run rdb command to convert the dump files into JSON files docker exec syncd$1 sh -c "rdb --command json /var/run/redis$1/dump.rdb | tee /var/run/redis$1/dump.json > /dev/null" 4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump result in standard output. docker exec syncd$1 sh -c "saidump -r /var/run/redis$1/dump.json -m 100" 5. clear sudo rm -f /var/run/redis$1/dump.rdb sudo rm -f /var/run/redis$1/dump.json (4) Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump, to check the asic db size and if it is larger than xxx entries, then do with REDIS SAVE, otherwise, to do with old method: looping through each entry of Redis DB.
…file and displays/format the right output (#1288) Why I did it Fix issue: sonic-net/sonic-buildimage#13561 The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access. This solution uses the redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table. Related PRs: sonic-net/sonic-utilities#2972 sonic-net/sonic-buildimage#16466
To add the rdbtools into base docker. Remove sudo in saidump.sh
To add the rdbtools into base docker. Move saidump.sh from host to syncd docker container.
@@ -0,0 +1,43 @@ | |||
#!/bin/bash | |||
set -e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move this file to sairedis repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But the sairedis is using C++. And there are no shell scripts in it. Based on concerns about the source code consistency and conciseness, I think I need to re-write saidump.sh from shell script to C++ language, and extra time for development and review is needed. At the same time, the alternative method is to just move saidump.sh from the main repo into repo sairedis, and no extra time is needed. What's your opinion? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are script in syncd https://github.com/sonic-net/sonic-sairedis/tree/master/syncd/scripts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I will move saidump.sh into this folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lguohan , @mlok-nokia , please review again. And a new PR was created as below.
sonic-net/sonic-sairedis#1298
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@judyjoseph , are you available to process this PR and sonic-net/sonic-sairedis#1298? Thanks.
To add the rdbtools into base docker. Move saidump.sh from host to syncd docker container.
…file and displays/format the right output (#1288) Why I did it Fix issue: sonic-net/sonic-buildimage#13561 The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access. This solution uses the redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table. Related PRs: sonic-net/sonic-utilities#2972 sonic-net/sonic-buildimage#16466
…rs. (sonic-net#16466) Fix sonic-net#13561 The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access. This solution uses the Redis SAVE command to save the snapshot of DB each time and recover later, instead of looping through each entry in the table. Related PRs: sonic-net/sonic-utilities#2972 sonic-net/sonic-sairedis#1288 sonic-net/sonic-sairedis#1298 How did I do it? To use the Redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. 1. Updated dockers/docker-base-bullseye/Dockerfile.j2, install Python library rdbtools into the all the docker-base-bullseye containers. 2. Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. 3. To add a new script file: syncd/scripts/saidump.sh into the sairedis repo. This shell script does the following steps: For each ASIC, such as ASIC0, 3.1. Config Redis consistency directory. redis-cli -h $hostname -p $port CONFIG SET dir $redis_dir > /dev/null 3.2. Save the Redis data. redis-cli -h $hostname -p $port SAVE > /dev/null 3.3. Run rdb command to convert the dump files into JSON files rdb --command json $redis_dir/dump.rdb | tee $redis_dir/dump.json > /dev/null 3.4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump's result in standard output." saidump -r $redis_dir/dump.json -m 100 3.5. Clear the temporary files. rm -f $redis_dir/dump.rdb rm -f $redis_dir/dump.json 4. Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump. To check the asic db size and if it is larger than ROUTE_TAB_LIMIT_DIRECT_ITERATION (with default value 24000) entries, then do with REDIS SAVE, otherwise, to do with old method: looping through each entry of Redis DB. How to verify it On T2 setup with more than 96K routes, execute CLI command -- generate_dump No error should be shown Download the generate_dump result and verify the saidump file after unpacking it.
…rs. (sonic-net#16466) Fix sonic-net#13561 The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access. This solution uses the Redis SAVE command to save the snapshot of DB each time and recover later, instead of looping through each entry in the table. Related PRs: sonic-net/sonic-utilities#2972 sonic-net/sonic-sairedis#1288 sonic-net/sonic-sairedis#1298 How did I do it? To use the Redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. 1. Updated dockers/docker-base-bullseye/Dockerfile.j2, install Python library rdbtools into the all the docker-base-bullseye containers. 2. Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. 3. To add a new script file: syncd/scripts/saidump.sh into the sairedis repo. This shell script does the following steps: For each ASIC, such as ASIC0, 3.1. Config Redis consistency directory. redis-cli -h $hostname -p $port CONFIG SET dir $redis_dir > /dev/null 3.2. Save the Redis data. redis-cli -h $hostname -p $port SAVE > /dev/null 3.3. Run rdb command to convert the dump files into JSON files rdb --command json $redis_dir/dump.rdb | tee $redis_dir/dump.json > /dev/null 3.4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump's result in standard output." saidump -r $redis_dir/dump.json -m 100 3.5. Clear the temporary files. rm -f $redis_dir/dump.rdb rm -f $redis_dir/dump.json 4. Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump. To check the asic db size and if it is larger than ROUTE_TAB_LIMIT_DIRECT_ITERATION (with default value 24000) entries, then do with REDIS SAVE, otherwise, to do with old method: looping through each entry of Redis DB. How to verify it On T2 setup with more than 96K routes, execute CLI command -- generate_dump No error should be shown Download the generate_dump result and verify the saidump file after unpacking it.
Cherry-pick PR to 202211: #17244 |
Cherry-pick PR to 202205: #17245 |
…rs. (#16466) Fix #13561 The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access. This solution uses the Redis SAVE command to save the snapshot of DB each time and recover later, instead of looping through each entry in the table. Related PRs: sonic-net/sonic-utilities#2972 sonic-net/sonic-sairedis#1288 sonic-net/sonic-sairedis#1298 How did I do it? To use the Redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. 1. Updated dockers/docker-base-bullseye/Dockerfile.j2, install Python library rdbtools into the all the docker-base-bullseye containers. 2. Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. 3. To add a new script file: syncd/scripts/saidump.sh into the sairedis repo. This shell script does the following steps: For each ASIC, such as ASIC0, 3.1. Config Redis consistency directory. redis-cli -h $hostname -p $port CONFIG SET dir $redis_dir > /dev/null 3.2. Save the Redis data. redis-cli -h $hostname -p $port SAVE > /dev/null 3.3. Run rdb command to convert the dump files into JSON files rdb --command json $redis_dir/dump.rdb | tee $redis_dir/dump.json > /dev/null 3.4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump's result in standard output." saidump -r $redis_dir/dump.json -m 100 3.5. Clear the temporary files. rm -f $redis_dir/dump.rdb rm -f $redis_dir/dump.json 4. Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump. To check the asic db size and if it is larger than ROUTE_TAB_LIMIT_DIRECT_ITERATION (with default value 24000) entries, then do with REDIS SAVE, otherwise, to do with old method: looping through each entry of Redis DB. How to verify it On T2 setup with more than 96K routes, execute CLI command -- generate_dump No error should be shown Download the generate_dump result and verify the saidump file after unpacking it.
…rs. (#16466) Fix #13561 The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access. This solution uses the Redis SAVE command to save the snapshot of DB each time and recover later, instead of looping through each entry in the table. Related PRs: sonic-net/sonic-utilities#2972 sonic-net/sonic-sairedis#1288 sonic-net/sonic-sairedis#1298 How did I do it? To use the Redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it. 1. Updated dockers/docker-base-bullseye/Dockerfile.j2, install Python library rdbtools into the all the docker-base-bullseye containers. 2. Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp, add a new option -r, which updates the rdbtools's output-JSON files' format. 3. To add a new script file: syncd/scripts/saidump.sh into the sairedis repo. This shell script does the following steps: For each ASIC, such as ASIC0, 3.1. Config Redis consistency directory. redis-cli -h $hostname -p $port CONFIG SET dir $redis_dir > /dev/null 3.2. Save the Redis data. redis-cli -h $hostname -p $port SAVE > /dev/null 3.3. Run rdb command to convert the dump files into JSON files rdb --command json $redis_dir/dump.rdb | tee $redis_dir/dump.json > /dev/null 3.4. Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump's result in standard output." saidump -r $redis_dir/dump.json -m 100 3.5. Clear the temporary files. rm -f $redis_dir/dump.rdb rm -f $redis_dir/dump.json 4. Update sonic-buildimage/src/sonic-utilities/scripts/generate_dump. To check the asic db size and if it is larger than ROUTE_TAB_LIMIT_DIRECT_ITERATION (with default value 24000) entries, then do with REDIS SAVE, otherwise, to do with old method: looping through each entry of Redis DB. How to verify it On T2 setup with more than 96K routes, execute CLI command -- generate_dump No error should be shown Download the generate_dump result and verify the saidump file after unpacking it.
sonic-net#2972 added two below functions into scripts/generate_dump. get_route_table_size_by_asic_id_and_ipver save_saidump_by_route_size The unittest scripts need to be added. Related PRs: sonic-net#2972 sonic-net/sonic-buildimage#16466 sonic-net/sonic-sairedis#1288 sonic-net/sonic-sairedis#1298 Microsoft ADO (25892277): Add two scripts: tests/saidump_test.py tests/saidump_test.sh To use below 6 test cases to verify the functionality of get_route_table_size_by_asic_id_and_ipver and save_saidump_by_route_size behave correctly. ``` saidump test list format: [ACIS number, ipv4 and ipv6 route table size, expected function save_cmd arguments] saidump_test_list = [ [1, 10000, "docker exec syncd saidump saidump"], [1, 12000, "docker exec syncd saidump saidump"], [1, 12001, "docker exec syncd saidump.sh saidump"], [1, 20000, "docker exec syncd saidump.sh saidump"], [2, 10000, "docker exec syncd0 saidump saidump0\ndocker exec syncd1 saidump saidump1"], [2, 12000, "docker exec syncd0 saidump saidump0\ndocker exec syncd1 saidump saidump1"], [2, 12001, "docker exec syncd0 saidump.sh saidump0\ndocker exec syncd1 saidump.sh saidump1"], [2, 20000, "docker exec syncd0 saidump.sh saidump0\ndocker exec syncd1 saidump.sh saidump1"] ] ``` During the compiling stage, run the below command to check if it's PASSED. jumao@1b1ffba5949a:/sonic/src/sonic-utilities$ time python3 setup.py test tests/saidump_test.py::test_saidump PASSED
sonic-net#2972 added two below functions into scripts/generate_dump. get_route_table_size_by_asic_id_and_ipver save_saidump_by_route_size The unittest scripts need to be added. Related PRs: sonic-net#2972 sonic-net/sonic-buildimage#16466 sonic-net/sonic-sairedis#1288 sonic-net/sonic-sairedis#1298 Microsoft ADO (25892277): Add two scripts: tests/saidump_test.py tests/saidump_test.sh To use below 6 test cases to verify the functionality of get_route_table_size_by_asic_id_and_ipver and save_saidump_by_route_size behave correctly. ``` saidump test list format: [ACIS number, ipv4 and ipv6 route table size, expected function save_cmd arguments] saidump_test_list = [ [1, 10000, "docker exec syncd saidump saidump"], [1, 12000, "docker exec syncd saidump saidump"], [1, 12001, "docker exec syncd saidump.sh saidump"], [1, 20000, "docker exec syncd saidump.sh saidump"], [2, 10000, "docker exec syncd0 saidump saidump0\ndocker exec syncd1 saidump saidump1"], [2, 12000, "docker exec syncd0 saidump saidump0\ndocker exec syncd1 saidump saidump1"], [2, 12001, "docker exec syncd0 saidump.sh saidump0\ndocker exec syncd1 saidump.sh saidump1"], [2, 20000, "docker exec syncd0 saidump.sh saidump0\ndocker exec syncd1 saidump.sh saidump1"] ] ``` During the compiling stage, run the below command to check if it's PASSED. jumao@1b1ffba5949a:/sonic/src/sonic-utilities$ time python3 setup.py test tests/saidump_test.py::test_saidump PASSED
Why did I do it?
To fix the issue: #13561
The existing saidump use https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script which loops the ASIC_DB more than 5 seconds and blocks other processes access.
This solution uses the Redis SAVE command to save the snapshot of DB each time and recover later, instead of looping through each entry in the table.
Related PRs:
sonic-net/sonic-utilities#2972
#16466
sonic-net/sonic-sairedis#1288
sonic-net/sonic-sairedis#1298
Work item tracking
Microsoft ADO (number only):
How did I do it?
To use the Redis-db SAVE option to save the snapshot of DB each time and recover later, instead of looping through each entry in the table and saving it.
How to verify it
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)