Fix `xtrabackup` context when doing `AddFiles`
#16806
base: main
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16806      +/-   ##
==========================================
- Coverage   69.53%   69.52%   -0.02%
==========================================
  Files        1567     1567
  Lines      202388   202390       +2
==========================================
- Hits       140723   140704      -19
- Misses      61665    61686      +21
@frouioui with your PR I can no longer list backups, for example:
/vt/bin/vtctldclient --server=vtctld:15999 --logtostderr=true GetBackups unsharded/-
E0919 12:12:39.588828 27 main.go:56] rpc error: code = Unknown desc = operation error S3: HeadBucket, https response error StatusCode: 301, RequestID: 2ACET95HY572K3PX, HostID: X2nVlauIKbd8pM8PqINAZCTtrmJ7LmHkCHVrCZq9tTbR3cLb8s4AnS/SnrELiZzroPbO69weCZ4=, api error MovedPermanently: Moved Permanently
Interesting, may I know what flags (redacted) you are using? We recently changed the SDK version, which might be the cause.
Description
This PR fixes a long-standing issue in the xtrabackup engine: the context that the backup storage used in `AddFiles` was canceled before `AddFiles` had the chance to complete. The problem shows up with the S3 backup storage, which uploads the file in a goroutine. Ceph does the same and has been affected in the past too. That goroutine would not have time to complete before the `XtrabackupEngine.backupFiles` function returned and canceled the context through a `defer` statement, so `AddFiles` failed because the context we passed had been canceled.

To avoid this issue, we never really passed the context down to `AddFiles`. We attempted to pass the context in #12500 but reverted it in #14311 since #14188 was raised.

This PR fixes that by not canceling the context automatically, but only when `backupFiles` returns an error. If an error occurs anywhere while executing `backupFiles`, we should attempt to cancel the addition of the files; otherwise we can assume there is nothing to cancel. A different goroutine already makes sure to cancel the context if we reach a certain timeout when closing the files.

cc @L3o-pold @vczyh as the original authors of the two issues. I have tested this on my end with manual tests (only on S3), but do you mind making sure it fixes your issues?