-
-
Notifications
You must be signed in to change notification settings - Fork 514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(reaper): refactor to allow retries and fix races #2728
fix(reaper): refactor to allow retries and fix races #2728
Conversation
✅ Deploy Preview for testcontainers-go ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Depends on #2738 |
0dc52af
to
0b62224
Compare
d3c9c9a
to
0589878
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a commit on top to satisfy the usage of several missing functions (see comments).
Other than than, I merged the branch with current main and resolved conflicts.
@mdelapenya reminder this is dependent on #2728 which is why it has missing components still. I'll rebase once #2738 is merged. |
e7b948d
to
5de803c
Compare
Rebased and picking up again, hopefully wont require many fixes. Will convert from draft when ready. |
bf961f0
to
7649e0c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only added a few super minor comments, other than that, my only concern is about the implications of these changes with the reuse mode we are proposing in #2768
I'm fine merging this, although I'd like to discuss with you about reuse more in depth, as it could apply to the reaper code as well.
7649e0c
to
4182991
Compare
@stevenh is there anything I can do to move forward with this PR? |
4182991
to
7bc2a87
Compare
Blocked on #2786 |
7bc2a87
to
12c4483
Compare
a304b09
to
1f4141c
Compare
1f4141c
to
0e34b71
Compare
There are failing tests for the reaper. Could you take a look? |
Thanks for the poke I was expecting some issues here, but hadn't got round to checking yet. |
0e34b71
to
4218dcc
Compare
Any progress on this? We're having issues with flaky tests and would love for this to get merged. |
ryuk side was merged last week, just fixing a few compatibility issue with tc-java so once that's done, this is next on the list @erikmansson |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments, but LGTM.
Once addressed, specially those regarding the t.Skip when the reaper is disabled on GH actions, we can move on and merge it.
Thanks as always for your hard work with the library 🙇
4218dcc
to
6a96b24
Compare
Refactor how the reaper is created to allow for proper retries of temporary errors such as container not found issues during startup or shutdown races. This eliminates the use of sync.Once which wasn't solving the problem at hand and replaces it with a singleton spawner with locked access. Wrap reaper errors so we can determine the cause of failures more easily. Fix race condition in port wait when loading from container by always waiting for the port first. Remove unnecessary use of buffering and invalid retry logic in reaper connection handling which could never recover correctly from a partial read. Move the reaper creation just before connections are established in compose to ensure its still running when the Connect calls are made. Previously the reaper was requested in NewDockerComposeWith which means it could have already shutdown before connections are made during the later sections of Up if the startup took over 1 minute. This was causing consistent failures for: TestDockerComposeAPIWithWaitLogStrategy Ensure that resource labels are correct so that resources aren't reaped when the reaper is disabled by excluding session id when reaper is disabled. Error when creating a reaper when the config says it's disabled so that we avoid hard to debug issues because a reaper is running when it shouldn't be. Set org.testcontainers.reap label for containers which should be reaped by the reaper, to prevent containers which disable the reaper from being incorrectly reaped.
6a96b24
to
8bfe616
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
I'd do the refactor of the compose tests in a follow-up, so we get this PR in, and we can start a release.
* main: fix(reaper): refactor to allow retries and fix races (testcontainers#2728) chore: update ryuk to 0.10.2 (testcontainers#2833)
* main: fix(reaper): refactor to allow retries and fix races (testcontainers#2728) chore: update ryuk to 0.10.2 (testcontainers#2833) feat: add yugabytedb module (testcontainers#2825) fix: update module container struct name and missing imports (testcontainers#2831) chore: replace 'assert' with 'require' (testcontainers#2827) chore: replace 'assert' with 'require' for critical checks (testcontainers#2824) chore: bump ryuk to latest release (testcontainers#2818) feat: add require for critical checks (testcontainers#2812)
Refactor how the reaper is created to allow for proper retries of temporary errors such as container not found issues during startup or shutdown races.
This eliminates the use of sync.Once which wasn't solving the problem at hand and replaces it with a singleton spawner with locked access.
Wrap reaper errors so we can determine the cause of failures more easily.
Fix race condition in port wait when loading from container by always waiting for the port first.
Remove unnecessary use of buffering and invalid retry logic in reaper connection handling which could never recover correctly from a partial read.
Move the reaper creation just before connections are established in compose to ensure its still running when the Connect calls are made.
Previously the reaper was requested in NewDockerComposeWith which means it could have already shutdown before connections are made during the later sections of Up if the startup took over 1 minute.
This was causing consistent failures for:
TestDockerComposeAPIWithWaitLogStrategy
Ensure that resource labels are correct so that resources aren't reaped when the reaper is disabled by excluding session id when reaper is disabled.
Error when creating a reaper when the config says it's disabled so that we avoid hard to debug issues because a reaper is running when it shouldn't be.