Avoid removing a VRF routing table when there are pending creation entries in gRouteBulker #3477

Open
wants to merge 5 commits into
base: master

Conversation

stephenxs
Collaborator

@stephenxs stephenxs commented Jan 20, 2025

What I did

Avoid removing a VRF routing table when there are pending creation entries in gRouteBulker

  1. When a routing entry is removed, remove the VRF routing table only if there is no pending creation entry in gRouteBulker
  2. Avoid uninitialized values in SAI IP address/prefix structures
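The second item can be illustrated with a minimal sketch. The PR text does not show the actual change, so `IpAddress`, `makeIpv4`, and `sameBytes` below are hypothetical stand-ins, not the real SAI types or orchagent code. The assumption is an address structure sized for IPv6 in which an IPv4 value fills only the first 4 bytes; clearing the whole structure first keeps the remaining bytes deterministic, for example if the structure is later compared byte by byte.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a SAI-style address: a family tag plus a union
// sized for IPv6 (16 bytes), of which an IPv4 address uses only 4 bytes.
struct IpAddress {
    int family;                 // 4 or 6
    union {
        std::uint32_t ip4;
        std::uint8_t  ip6[16];
    } addr;
};

// Builds an IPv4 address with every other byte of the structure set to 0.
IpAddress makeIpv4(std::uint32_t v4)
{
    IpAddress out;
    std::memset(&out, 0, sizeof(out));  // clear all bytes before the partial write
    out.family = 4;
    out.addr.ip4 = v4;                  // only 4 of the 16 union bytes are written
    return out;
}

// Byte-wise comparison; only meaningful if unused bytes are deterministic.
bool sameBytes(const IpAddress& a, const IpAddress& b)
{
    return std::memcmp(&a, &b, sizeof(IpAddress)) == 0;
}
```

With the `memset` in place, two structures built from the same IPv4 value compare equal byte for byte; without it, the 12 unused bytes would hold whatever was on the stack.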

Why I did it

Fix an issue where an out-of-range exception can be thrown in addRoutePost because the VRF routing table no longer exists

(gdb) bt
#0  0x00007f5791aedebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f5791a9efb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f5791a89472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f5791de0919 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f5791debe1a in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f5791debe85 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f5791dec0d8 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f5791de3240 in std::__throw_out_of_range(char const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00005594e856d956 in std::map<unsigned long, std::map<swss::IpPrefix, RouteNhg, std::less<swss::IpPrefix>, std::allocator<std::pair<swss::IpPrefix const, RouteNhg> > >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, std::map<swss::IpPrefix, RouteNhg, std::less<swss::IpPrefix>, std::allocator<std::pair<swss::IpPrefix const, RouteNhg> > > > > >::at (this=<optimized out>, __k=<optimized out>) at /usr/include/c++/12/bits/stl_map.h:551
#9  0x00005594e8564beb in RouteOrch::addRoutePost (this=this@entry=0x5594ea13e080, ctx=..., nextHops=...) at ./orchagent/routeorch.cpp:2145
#10 0x00005594e856b0b2 in RouteOrch::doTask (this=0x5594ea13e080, consumer=...) at ./orchagent/routeorch.cpp:1021
#11 0x00005594e85282d2 in Orch::doTask (this=0x5594ea13e080) at ./orchagent/orch.cpp:553
#12 0x00005594e851909a in OrchDaemon::start (this=this@entry=0x5594ea0a0950) at ./orchagent/orchdaemon.cpp:895
#13 0x00005594e8485632 in main (argc=<optimized out>, argv=<optimized out>) at ./orchagent/main.cpp:818
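Frame #8 of the backtrace shows the exception coming from `std::map::at`, which throws `std::out_of_range` when the key is missing. A minimal sketch of that behavior follows; the string-based `RouteTable` is a simplified stand-in for the real `swss::IpPrefix` / `RouteNhg` map, and `lookupThrows` / `hasVrf` are hypothetical helpers, not orchagent functions.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Simplified stand-in for the per-VRF route tables: VRF id -> (prefix -> next hop).
using RouteTable  = std::map<std::string, std::string>;
using RouteTables = std::map<std::uint64_t, RouteTable>;

// Returns true if looking up vrf_id with at() throws std::out_of_range.
bool lookupThrows(const RouteTables& tables, std::uint64_t vrf_id)
{
    try {
        (void)tables.at(vrf_id);   // at() never inserts; it throws on a missing key
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Non-throwing check: find() reports a missing VRF without an exception.
bool hasVrf(const RouteTables& tables, std::uint64_t vrf_id)
{
    return tables.find(vrf_id) != tables.end();
}
```

If the exception is not caught anywhere up the call stack, `std::terminate` is called and the process aborts, which matches frames #2 to #7 above.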

How I verified it

Unit (mock) test

Details if related

Originally, the VRF routing table was cleaned up whenever a prefix of that VRF was removed, provided that

  1. there was no routing entry left in the VRF routing table, and
  2. the prefix being removed was not pending creation in gRouteBulker.

The intent is to remove a VRF routing table only when the VRF has no routing entry and no routing entry pending creation. However, condition 2 does not guarantee that: it checks only the prefix being removed, so other prefixes of the same VRF may still be pending creation.

Ideally, condition 2 would check for pending creation entries of the specific VRF, which we cannot do.
So we use the following stricter conditions:

  1. there is no routing entry in the VRF routing table, and
  2. there is no routing entry pending creation in gRouteBulker, regardless of which VRF it belongs to.
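The stricter conditions can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `PendingBulker`, `pendingCreationCount`, and `removeRoute` are hypothetical names, and the bulker stand-in exposes only a global count of pending creations because the description says a per-VRF check is not available.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

using VrfId      = std::uint64_t;
using RouteTable = std::map<std::string, std::string>;  // prefix -> next hop (simplified)

// Hypothetical stand-in for the bulker: it knows how many entries are
// pending creation in total, but not which VRF each one belongs to.
struct PendingBulker {
    std::size_t pendingCreations = 0;
    std::size_t pendingCreationCount() const { return pendingCreations; }
};

// Removes `prefix` from the VRF's table. The table itself is erased only if
// (1) it is now empty and (2) nothing at all is pending creation.
// Returns true if the VRF table was erased.
bool removeRoute(std::map<VrfId, RouteTable>& tables,
                 const PendingBulker& bulker,
                 VrfId vrf,
                 const std::string& prefix)
{
    auto it = tables.find(vrf);
    if (it == tables.end()) {
        return false;               // unknown VRF: nothing to remove
    }
    it->second.erase(prefix);
    if (it->second.empty() && bulker.pendingCreationCount() == 0) {
        tables.erase(it);
        return true;
    }
    return false;                   // keep the (possibly empty) table for now
}
```

The trade-off of the global check is that an empty VRF table may outlive its last route while unrelated creations are pending; in exchange, a later lookup of that VRF cannot hit a table that was erased too early.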

@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs stephenxs changed the title from "Set bytes to 0 before copying IPv4 addresses/masks to avoid uninitialized bytes" to "Erase VRF routing table only if there is no pending creation entries in gRouteBulker" Jan 21, 2025
@stephenxs stephenxs changed the title from "Erase VRF routing table only if there is no pending creation entries in gRouteBulker" to "Avoid removing a VRF routing table when there are pending creation entries in gRouteBulker" Jan 22, 2025
@stephenxs stephenxs force-pushed the fix-uninitialized-ipv4-bytes branch from 5a7da1f to 727bb0e Compare January 22, 2025 05:19
@mssonicbld
Collaborator

/azp run

@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@abdosi
Contributor

abdosi commented Jan 22, 2025

thanks @stephenxs

@stephenxs stephenxs marked this pull request as ready for review January 23, 2025 07:08
@stephenxs stephenxs requested a review from prsunny as a code owner January 23, 2025 07:08
…ized bytes

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs stephenxs force-pushed the fix-uninitialized-ipv4-bytes branch from 3b84c79 to 5ffa1f3 Compare January 23, 2025 07:08
@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs
Collaborator Author

I didn't find a failure, but vstest returned non-zero.

@stephenxs
Collaborator Author

/azpw run

@mssonicbld
Collaborator

/AzurePipelines run

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs
Collaborator Author

Almost all failures in the Test vstest pipeline are also observed on other swss and sairedis PR checks, and none of them is related to the PRs themselves.
@abdosi @prsunny, would you help check the failures? Thanks.

4 participants