Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change MB expire strategy #39

Merged
merged 11 commits into from
Oct 10, 2024
Merged

Change MB expire strategy #39

merged 11 commits into from
Oct 10, 2024

Conversation

toshipp
Copy link
Contributor

@toshipp toshipp commented Sep 18, 2024

We designed MBC to handle the expiration of backups, but it is not good because MBC does not delete secondary MB if the network connection is disconnected. To fix this, we add an expire field to MB.

@toshipp toshipp changed the title Change expire strategy Change MB expire strategy Sep 18, 2024
@toshipp toshipp force-pushed the mb-expire branch 3 times, most recently from e50469b to 37b8a40 Compare September 18, 2024 10:38
@toshipp toshipp force-pushed the mb-expire branch 4 times, most recently from 8a470fa to 39e6efe Compare October 4, 2024 06:30
@toshipp toshipp marked this pull request as ready for review October 4, 2024 08:14
Expect(err).NotTo(HaveOccurred())

backup, err := resMgr.CreateUniqueBackupFor(ctx, pvc, func(backup *mantlev1.MantleBackup) {
backup.Spec.Expire = "1d"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessary to set the last arg because this function set the default value of Expire field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered explicitly specifying the value makes the test more stable. If someone changes the default to 2d in testutil, this test will fail. But adding some duration to the previous value is more readable, IMO, I'll update this.

internal/ceph/ceph.go Show resolved Hide resolved

var ErrSnapshotNotFound = errors.New("snapshot not found")

func IsRBDSnapshotNamed(snap string) func(RBDSnapshot) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming snap to name? For instance, I believe s.Name == name is easier to understand than s.Name == snap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

internal/ceph/util.go Show resolved Hide resolved
return fmt.Errorf("snap already exists: %s@%s", key, snap)
}

f.lastSnapId++
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code implies we can't have a snapshot that id is 0 although the id of the first snapshot is 0 in Ceph. How about renaming lastSnapId to nextSnapId and moving this code after L57?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do


By("checking removing the same snapshot fails")
err = f.RBDSnapRm("pool", "image", "snap1")
Expect(err).To(HaveOccurred())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also check RBDSnapLs returns only snap2 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

@satoru-takeuchi
Copy link
Contributor

@toshipp Although I'm in the middle of review, I left some comments about the first commit to "CephCmd in mantlebackup_controller".

@satoru-takeuchi
Copy link
Contributor

@toshipp Please resolve conflicts caused by merging the last PR.

@toshipp
Copy link
Contributor Author

toshipp commented Oct 10, 2024

@satoru-takeuchi
I'm thinking of rebasing it after the review is done. There are many conflicts, I think it is hard to review this if rebasing is done during the review.

Copy link
Contributor

@satoru-takeuchi satoru-takeuchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please rebase it to main.

We designed MantleBackupConfig expires MantleBackup, but it was not good
because replicated MantleBackups on the secondary will not be able to
expired if the primary and the secondary are disconnected.  To fix this,
we introduce Expire on the MB itself.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
The expiration duration can now be given in the MB spec, set it
according to MBC spec.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
MantleBackup can be created manually, so it's not necessary to inherit
validation from MantleBackupConfig. However, there's no reason to change
it, so we'll use the same validation.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
It intends to replace executeCommand in mantlebackup_controller.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
The previous code simulates the executeCommand function and sets static
timestamps for RBD snapshots. This approach poses a problem for
implementing expiration, as all MantleBackup resources would expire
during testing if the timestamps become outdated. To address this, we
need to create a more realistic fake CephCmd and use it.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
The expiration is done by reconciler if already expired. Otherwise
schedule expiration by requesting GenericEvents instead of using requeue
provided by controller-runtime. It is because we don't want to exit the
reconciler to do other tasks.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
It should be also implemented in MB reconciler as well as expiration.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
The expiration is done by MB reconciler, delete the function from the
sub command. Now the command name becomes backup.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
The struct is only used on the test and utilities using it can be
replaced with the ceph package.

Signed-off-by: Toshikuni Fukaya <toshikuni-fukaya@cybozu.co.jp>
@toshipp
Copy link
Contributor Author

toshipp commented Oct 10, 2024

@satoru-takeuchi
Rebased. Please check if I did something wrong.

@satoru-takeuchi satoru-takeuchi merged commit c29344c into main Oct 10, 2024
3 checks passed
@satoru-takeuchi satoru-takeuchi deleted the mb-expire branch October 10, 2024 23:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants