kafka(ticdc): sarama do not retry if produce message failed to prevent out of order (#11870) #11962
base: release-8.1
Codecov Report
Coverage on release-8.1 for #11962: 57.4081% (Files: 854, Lines: 126111, Branches: 0, Hits: 72398, Misses: 48308, Partials: 5405).
This is an automated cherry-pick of #11870
What problem does this PR solve?
Issue Number: close #11935
What is changed and how it works?
config.Net.MaxOpenRequests is set to 1.
config.Producer.Retry.Max is set to 0, to disable the internal retry mechanism.
The root cause of the out-of-order message problem is an internal sarama bug that cannot be easily fixed. This PR is a workaround: setting retry.max to 0 disables the retry.
Check List
Tests
This is tested by an internal E2E test that injects a network partition between a random cdc node and a random Kafka server. Before this PR, the test case failed, and out-of-order messages were found in the consumer log. After this PR, the test passes and no out-of-order messages are observed.
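The producer settings exercised by this test can be sketched as follows. This is a minimal illustration, not the actual diff: producerConfig is a hypothetical stand-in that mirrors only the two sarama config fields named in this PR so the snippet runs without the sarama dependency, and applyOrderingWorkaround and the starting values are assumptions made for the example.

```go
package main

import "fmt"

// producerConfig is a hypothetical stand-in that mirrors only the two
// sarama config fields discussed in this PR. It is not the real sarama type.
type producerConfig struct {
	Net struct {
		MaxOpenRequests int // maximum in-flight requests per broker connection
	}
	Producer struct {
		Retry struct {
			Max int // number of internal retries for a failed produce
		}
	}
}

// applyOrderingWorkaround (hypothetical helper) sets the two values
// described in the PR: one in-flight request and no internal retry.
func applyOrderingWorkaround(config *producerConfig) {
	config.Net.MaxOpenRequests = 1
	config.Producer.Retry.Max = 0
}

func main() {
	var config producerConfig
	// Illustrative non-zero starting values (assumed, not taken from the PR).
	config.Net.MaxOpenRequests = 5
	config.Producer.Retry.Max = 3

	applyOrderingWorkaround(&config)

	fmt.Printf("MaxOpenRequests=%d Retry.Max=%d\n",
		config.Net.MaxOpenRequests, config.Producer.Retry.Max)
}
```

With the internal retry disabled, a failed produce is reported to the caller instead of being retried inside the library, so the caller decides how to recover.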
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note