Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[action] [PR:15628] Fix dualtor t0 mock orch crash #16037

Merged
merged 1 commit into from
Dec 19, 2024

Conversation

mssonicbld
Copy link
Collaborator

Description of PR

Summary:
Fixes dualtor tests when ran with mocked setup on t0 topologies.

This PR fixes 2 issues:

  1. Fix orchagent crash when setting up dualtor mock environment on a t0 topo with Broadcom platform.
    When running dualtor tests on t0 topo, the DUT has to enter a mocked dualtor state. Part of this setup is adding a tunnel table to CONFIG_DB, which involves a Broadcom SAI attribute that is only supported when sai_tunnel_support=1 is set in syncd:/etc/sai.d/config.bcm - this attribute is not set until apply_peer_switch_table_to_dut() is run.
    Changing an unsupported Broadcom SAI attribute will cause orchagent to crash.

  2. Fix dualtor mocked tests overwriting /etc/sonic/config_db.json
    When running dualtor tests on a t0 topology, the test will overwrite /etc/sonic/config_db.json during the test, causing config_reload() at the end of the test to not restore the pre-test state of CONFIG_DB.

Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

How did you do it?

  1. Fix orchagent crash by first running the setup function apply_peer_switch_table_to_dut() that will set sai_tunnel_support=1, before adding the tunnel table with apply_tunnel_table_to_dut().

  2. Fix config mismatch by adding a fixture to backup /etc/sonic/config_db.json before the test, then restore and config reload -y it after the test. Change config reload source config file to running_golden_config instead of minigraph.

How did you verify/test it?

  1. orchagent no longer crashes.

  2. config_db.json is reverted after tests are run in each test file. However, config_check may still fail between test cases as some files only apply the fixtures once per test file instead of once per test case. This is okay as config_check fail will only cause a warning log. Configs are restored properly now.

Any platform specific information?

orchagent crash seems to only occur to Broadcom platform on Arista hwSkus.

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Copy link
Collaborator Author

/azp run

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@mssonicbld
Copy link
Collaborator Author

/azp run

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

* Fix orchagent crash when setting up mock dualtor environment for t0

When running dualtor tests on t0 topo, the DUT has to enter a mocked
dualtor state. Part of this setup is adding a tunnel table to CONFIG_DB,
which involves a Broadcom SAI attribute that is only supported when
`sai_tunnel_support=1` is set in `syncd:/etc/sai.d/config.bcm` - this
attribute is not set until `apply_peer_switch_table_to_dut()` is run.

Changing an unsupported Broadcom SAI attribute will cause orchagent to crash.

Fix this issue by first running the setup function
`apply_peer_switch_table_to_dut()` that will set
`sai_tunnel_support=1`, before adding the tunnel table with
`apply_tunnel_table_to_dut()`.

* Fix dualtor tests overwriting /etc/sonic/config_db.json

When running dualtor tests on a t0 topology, the test will overwrite
`/etc/sonic/config_db.json` during the test, causing `config_reload()`
at the end of the test to not restore the pre-test state of CONFIG_DB.

Fix by adding a fixture to backup `/etc/sonic/config_db.json` before
the test, then restore and `config reload -y` it after the test.

* Change to reuse old config restore and use running_golden_config source
@mssonicbld
Copy link
Collaborator Author

/azp run

@mssonicbld
Copy link
Collaborator Author

Original PR: #15628

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@mssonicbld mssonicbld merged commit c425869 into sonic-net:202411 Dec 19, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants