Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(docdb): support snapshot removal policy #28798

Merged
merged 6 commits into from
Feb 29, 2024
Merged

Conversation

lpizzinidev
Copy link
Contributor

@lpizzinidev lpizzinidev commented Jan 21, 2024

Adds support for removalPolicy: RemovalPolicy.SNAPSHOT for DocumentDB clusters as specified in the documentation.

To allow users to specify custom policies for the cluster's instances and security group the following properties have been added:

  • instanceRemovalPolicy
  • securityGroupRemovalPolicy

Closes #28773.
Closes #28861


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@aws-cdk-automation aws-cdk-automation requested a review from a team January 21, 2024 11:35
@github-actions github-actions bot added bug This issue is a bug. effort/medium Medium work item – several days of effort p2 distinguished-contributor [Pilot] contributed 50+ PRs to the CDK labels Jan 21, 2024
Copy link
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.

A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.

@lpizzinidev
Copy link
Contributor Author

Exemption Request.
The feature is currently not documented.
Let me know if we should add some information about it.

@aws-cdk-automation aws-cdk-automation added pr-linter/exemption-requested The contributor has requested an exemption to the PR Linter feedback. pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. labels Jan 21, 2024
@kornicameister
Copy link
Contributor

@lpizzinidev but is it a good idea to silently fallback to another removal policy here? Personally I find it better to know that something is not supported instead of learning sometime later in production that there is no snapshot although my infrastructure code says there should be.

@lpizzinidev
Copy link
Contributor Author

@kornicameister
From my understanding, the snapshot removal policy is supported for AWS::DocDB::DBCluster.
The fallback to delete is used on AWS::DocDB::DBInstance and AWS::EC2::SecurityGroup resources since they don't support a snapshot removal policy.
The cluster's snapshot is created.
I think this is a valid approach for handling this feature, please let me know if you have suggestions for changing the current implementation.

@kornicameister
Copy link
Contributor

@lpizzinidev suggestion remains.
I would maintain creating security group with defaults which excludes setting up any retention policies inside of a document DB construct.

If user wants to retain w security group it may create one before creating a cluster or an instance. That way, at least, caller is aware of a setting instead of it being magically transformed between incompatible resources.

What it user wishes to remove security group but retain cluster. What is there is a need to remove cluster but retain a group. All in all there's more than one combination and I find it better to let caller decide.

Finally... isn't what you are proposing, potentially dangerous, precedence? I know CDK comes with reasonable defaults but, at least to my knowledge, dealing with security group is always exposed for caller to customize. It is like that here but there is this one extra bit that requires extra documentation.

Are there more places hacking their way through escape hatches like that? Or is it really a precedence?

@kornicameister
Copy link
Contributor

kornicameister commented Jan 23, 2024

@lpizzinidev your link leads to RDS.

Also I tried setting up snapshot policy for cluster but cdk synth failed for me with unsupported policy error.

Error: AWS::DocDB::DBCluster does not support snapshot removal policy

@lpizzinidev
Copy link
Contributor Author

lpizzinidev commented Jan 24, 2024

@kornicameister
If you scroll down you can see the full list of resources supporting a snapshot removal policy.
If you synth using the current CDK version it won't work as the list of supported resources is outdated.
We can discuss what's the best fallback option for the resources not supporting a snapshot removal policy.
My approach is based on the idea that the user probably doesn't want to retain resources when specifying a snapshot policy, so I simply fall back to delete when a snapshot is not possible.

@kornicameister
Copy link
Contributor

kornicameister commented Jan 24, 2024

@lpizzinidev I rest my case about it being unsupported...my bad. However that begs a question where is this list and how we can update it. That's a bit of a separate issue I am happy to report.

Nonetheless I still do not agree with this fallback. It still sounds a bit like hidden hack here.

Perhaps it's time to get other people here to express opinion because I am just one guy and it feels wrong just for me which is hardly an argument if general attitude in cdk is to let those hacks happen. In that case I have to live with it 😝.

Long story short such discussion is wonderful idea but I would argue that it should start with general RFC here and handle things globally in sort of unified manner instead of dealing with it here and leaving rest of CDK untouched. Given that is what is happening, because I am not sure of it. Once again a greater number of contributors must express what they think of it.

We can play ping pong all day long...but where does it take us?

@aws-cdk-automation aws-cdk-automation dismissed their stale review January 25, 2024 06:31

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@lpizzinidev
Copy link
Contributor Author

@kornicameister
Yep, it's up to the core team to discuss the intended behavior, mine is just a proposal.
Anyway, I've updated the documentation as this might be confusing for users as well.
Thanks for following up with your suggestions.

Copy link
Contributor

@GavinZZ GavinZZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am partly agreeing with this change.

To be fair, I agree with the security group change for the following reason:

  1. The default removal policy for security group isDELETE. It makes sense to me to fall back to the default policy.
  2. Users wasn't able to use SNAPSHOT policy on DocDB cluster resource prior to this fix, so it won't cause breaking change.
  3. Users can always provide their own security group if they want other removal policy (and we should add an example to readme to set a custom deletion policy on security group with SNAPSHOT policy on docdb cluster).

However, I'm a bit hesitant on the DocDB instance part. I appreciate the fact that this PR now adds support for SNAPSHOT deletion policy for cluster. Correct me if wrong, but there's no way to specify RETAIN instance while using SNAPSHOT on cluster.

Edit

but there's no way to specify RETAIN instance while using SNAPSHOT on cluster.

I guess this is not entirely true, as there's a pretty ugly workaround like the following

cluster.node.children.forEach(child => {
  if (child instanceof docdb.CfnDBInstance) {
    child.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN, { applyToUpdateReplacePolicy: true } as cdk.RemovalPolicyOptions);
  }
});

I personally don't like this workaround but I also can't think of a better way other than creating another property (which I hate even more). I'll bring it up for team discussion for a second thought.

@lpizzinidev
Copy link
Contributor Author

@GavinZZ

Correct me if wrong, but there's no way to specify RETAIN instance while using SNAPSHOT on cluster.

That's correct.
However, I think that if we want to allow users to specify a different policy when the cluster is the to SNAPSHOT, then I think the clearer approach is just providing a separate prop for that.
What if there are users who need to apply the DELETE policy in that case?

@kornicameister
Copy link
Contributor

On there other hand.
I haven't checked that but assuming one can restore from snapshot. Should we either handle this here or create separate issue?

@kornicameister
Copy link
Contributor

kornicameister commented Feb 15, 2024

@lpizzinidev @GavinZZ ; friendly ping here.
I am quite interested in being able to do restoration from snapshot, so bumping up a question.

Based on CloudFormation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-docdb-dbcluster.html#cfn-docdb-dbcluster-restoretotime we do have quite a few options to consider.

@GavinZZ
Copy link
Contributor

GavinZZ commented Feb 22, 2024

@kornicameister I would think that as a separate issue and PR.

@lpizzinidev I do agree with you that adding a separate props for the removal policy is the best way to go here. Would you be available to work on it? If not I can help pushing commits to this PR.

@kornicameister
Copy link
Contributor

@GavinZZ later in the evening I will submit another ticket for passing sourceSnapshot into cluster to make restoration possible.

@lpizzinidev
Copy link
Contributor Author

@GavinZZ
Sure, I'll work on it during the weekend 👍

@aws-cdk-automation aws-cdk-automation added pr/needs-maintainer-review This PR needs a review from a Core Team Member and removed pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. labels Feb 24, 2024
@kornicameister
Copy link
Contributor

@lpizzinidev @GavinZZ issue is already created by someone else: #27188.

Copy link
Contributor

@GavinZZ GavinZZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it, thanks for making the changes. Just two minor comments on test cases.

GavinZZ
GavinZZ previously approved these changes Feb 29, 2024
Copy link
Contributor

@GavinZZ GavinZZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for contributing!

Copy link
Contributor

mergify bot commented Feb 29, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation aws-cdk-automation removed the pr/needs-maintainer-review This PR needs a review from a Core Team Member label Feb 29, 2024
Copy link
Contributor

mergify bot commented Feb 29, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@GavinZZ
Copy link
Contributor

GavinZZ commented Feb 29, 2024

@mergify update

Copy link
Contributor

mergify bot commented Feb 29, 2024

update

❌ Mergify doesn't have permission to update

For security reasons, Mergify can't update this pull request. Try updating locally.
GitHub response: refusing to allow a GitHub App to create or update workflow .github/workflows/close-stale-prs.yml without workflows permission

@mergify mergify bot dismissed GavinZZ’s stale review February 29, 2024 14:51

Pull request has been modified.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: ec36779
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@GavinZZ GavinZZ merged commit 05b1bb0 into aws:main Feb 29, 2024
9 checks passed
Copy link
Contributor

mergify bot commented Feb 29, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. distinguished-contributor [Pilot] contributed 50+ PRs to the CDK effort/medium Medium work item – several days of effort p1 pr-linter/exemption-requested The contributor has requested an exemption to the PR Linter feedback.
Projects
None yet
4 participants