[21293] Fix destruction data-race on participant removal in intra-process (backport #5034) #5367
Open: mergify wants to merge 4 commits into 2.14.x from mergify/bp/2.14.x/pr-5034
* Refs #21293: Add BB test
* Refs #21293: Reinforce test to fail more frequently
* Refs #21293: Add RefCountedPointer.hpp to utils
* Refs #21293: Add unittests for RefCountedPointer
* Refs #21293: LocalReaderPointer.hpp
* Refs #21293: BaseReader aggregates LocalReaderPointer
* Refs #21293: ReaderLocator aggregates LocalReaderPointer
* Refs #21293: RTPSDomainImpl::find_local_reader returns a sared_ptr<LocalReaderPointer> and properly calls local_actions_on_reader_removed()
* Refs #21293: RTPSWriters properly using LocalReaderPointer::Instance when accessing local reader
* Refs #21293: Linter
* Refs #21293: Fix windows warnings
* Refs #21293: Address Miguel's review
* Refs #21293: Apply last comment
* Refs #21293: NIT
---------
Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>
(cherry picked from commit 456e45f)
# Conflicts:
# include/fastdds/rtps/writer/ReaderLocator.h
# include/fastdds/rtps/writer/ReaderProxy.h
# src/cpp/rtps/RTPSDomain.cpp
# src/cpp/rtps/RTPSDomainImpl.hpp
# src/cpp/rtps/participant/RTPSParticipantImpl.cpp
# src/cpp/rtps/participant/RTPSParticipantImpl.h
# src/cpp/rtps/reader/BaseReader.cpp
# src/cpp/rtps/reader/BaseReader.hpp
# src/cpp/rtps/writer/ReaderLocator.cpp
# src/cpp/rtps/writer/StatefulWriter.cpp
# src/cpp/rtps/writer/StatelessWriter.cpp
# test/blackbox/common/DDSBlackboxTestsBasic.cpp
# test/mock/rtps/ReaderLocator/fastdds/rtps/writer/ReaderLocator.h
# test/unittest/utils/CMakeLists.txt
Cherry-pick of 456e45f has failed:
To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>
…rPointer<> and add DOXYGEN_SHOULD_SKIP_THIS_PUBLIC Signed-off-by: Mario Dominguez <mariodominguez@eprosima.com>
MiguelCompany requested changes on Nov 15, 2024
Just one stupid NIT
Mario-DL requested review from MiguelCompany and removed request for MiguelCompany on November 15, 2024 10:39
MiguelCompany approved these changes on Nov 15, 2024
Description
This PR addresses a data race that occurs in stressed intra-process scenarios when an EDP writer tries to use the pointer to the remote local reader of an already removed participant. This happens because the participant has not yet received the other one's disposal (as it goes through transport).
Some flaky CI tests have already been identified as related to this issue.
The proposed solution introduces a new state in the reader's `LocalReaderViewStatus`, in which the reader notifies that it is inactive as soon as it is destroyed and no one is using it.
On the other side, the remote local writers using pointers to it now hold a `LocalReaderPointer`, which wraps the raw reader pointer plus the view. An internal counter now accounts for the number of references.
Thanks @MiguelCompany for helping with the final solution design.
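
The reference-counting idea described above can be illustrated with a minimal sketch. This is not the code from this PR: `GuardedPtr`, `FakeReader`, `deliver` and `is_active` are hypothetical names chosen for the example, and the real `RefCountedPointer` / `LocalReaderPointer` may differ in interface and detail. The sketch only assumes what the description states: an activity flag, a counter of references in use, and a reader that waits until no one is using it before going away.

```cpp
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>

// Hypothetical, simplified illustration; not the PR's implementation.
template<typename T>
class GuardedPtr
{
public:

    explicit GuardedPtr(T* ptr) : ptr_(ptr) {}

    // RAII handle: holds one reference for as long as it is alive.
    class Instance
    {
    public:

        explicit Instance(const std::shared_ptr<GuardedPtr<T>>& owner)
            : owner_(owner)
            , ptr_(owner && owner->acquire() ? owner->ptr_ : nullptr)
        {
        }

        ~Instance()
        {
            if (ptr_)
            {
                owner_->release();
            }
        }

        Instance(const Instance&) = delete;
        Instance& operator =(const Instance&) = delete;

        // False when the pointee was already deactivated.
        explicit operator bool() const { return ptr_ != nullptr; }
        T* operator ->() const { return ptr_; }

    private:

        std::shared_ptr<GuardedPtr<T>> owner_;
        T* ptr_;
    };

    // Called before destroying the pointee: refuse new references,
    // then block until all outstanding ones have been released.
    void deactivate()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        active_ = false;
        cv_.wait(lock, [this] { return refs_ == 0; });
    }

    bool is_active() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return active_;
    }

private:

    bool acquire()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!active_)
        {
            return false;
        }
        ++refs_;
        return true;
    }

    void release()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--refs_ == 0)
        {
            cv_.notify_all();
        }
    }

    T* ptr_;
    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t refs_ = 0;
    bool active_ = true;
};

// Hypothetical reader type used only for this example.
struct FakeReader
{
    int received = 0;
    void on_data() { ++received; }
};

// Writer-side access: returns false instead of touching a removed reader.
inline bool deliver(const std::shared_ptr<GuardedPtr<FakeReader>>& local_reader)
{
    GuardedPtr<FakeReader>::Instance reader(local_reader);
    if (!reader)
    {
        return false;
    }
    reader->on_data();
    return true;
}
```

In this sketch the writer never dereferences the raw pointer directly; it always goes through a short-lived `Instance`, so a reader that has called `deactivate()` can be destroyed safely once that call returns.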
Note: the test may be launched with `--retest-until-fail 20` or so in order to reproduce the issue. For a more frequent failure, reviewers can launch the `colcon test` with the `taskset -c 0,1` prefix, which stresses the test further and makes it fail more often.
@Mergifyio backport 3.1.x 3.0.x 2.14.x 2.10.x
Contributor Checklist
Commit messages follow the project guidelines.
The code follows the style guidelines of this project.
Tests that thoroughly check the new feature have been added/Regression tests checking the bug and its fix have been added; the added tests pass locally
Any new/modified methods have been properly documented using Doxygen.
N/A Any new configuration API has an equivalent XML API (with the corresponding XSD extension)
Changes are backport compatible: they do NOT break ABI nor change library core behavior.
Changes are API compatible.
N/A New feature has been added to the `versions.md` file (if applicable).
N/A New feature has been documented/Current behavior is correctly described in the documentation.
Applicable backports have been included in the description.
Reviewer Checklist
This is an automatic backport of pull request #5034 done by [Mergify](https://mergify.com).