Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check if the csi driver is supported the volume group snapshot cap #864

Merged
merged 1 commit into from
Sep 6, 2023

Conversation

carlory
Copy link
Member

@carlory carlory commented Jun 20, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

check if the csi driver is supported the volume group snapshot cap when the --enable-volume-group-snapshots arg is true

Special notes for your reviewer:

waiting for csi-lib-utils 0.14.0 to be released

/hold

Does this PR introduce a user-facing change?:

print a warning message at the startup phase if the csi-driver doesn't support GroupControllerCreateVolumeGroupSnapshot when the --enable-volume-group-snapshots flag is true.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note-none Denotes a PR that doesn't merit a release note. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 20, 2023
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 20, 2023
klog.Errorf("error determining if driver supports create/delete group snapshot operations: %v", err)
os.Exit(1)
}
if !supportsCreateVolumeGroupSnapshot {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the support for this is optional, it probably is not correct to exit in case there is no support for VolumeGroupSnapshots.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, storage vendors should implement the group controller plugin if they want to support the volume group snapshot feature. And the startup parameters of csi-snapshotter is specified by a storage vendor. So the storage vendor should know how to configure the csi-snapshotter. If they set --enable-volume-group-snapshots to true but they don't implement the group controller plugin, I think it is their fault and they should fix it.

By the way, As @xing-yang said, the controller plugin is also optional for storage vendors. According to the existing code, If the storage vendor doesn't implement the controller plugin or doesn't support the csi.ControllerServiceCapability_RPC_CREATE_DELETE_SNAPSHOT capability, the csi-snapshotter will exit.

If group controller plugin are GAed, another possible solution is that we deprecate the --enable-volume-group-snapshots parameter and the csi-snapshotter will auto-detect the csi plugin whether supports the volume group snapshot feature at startup phase.

What do you think? @nixpanic @xing-yang

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this PR improves the situation, and also with what you write. It is unfortunate that the --enable-volume-group-snapshots parameter is used, and there is no approach for dynamically detecting the support of the functionality.

It is not only storage vendors that deploy CSI-drivers, but also community projects. A mistake in the parameters for a sidecar can cause unclear problems for users. Exiting and causing a failure for a CSI-driver that otherwise works fine is not an optimal user experience in my opinion.

This is a little different from the generic snapshot support. If a CSI-driver does not implement snapshot functionality, it is rather obvious that the external-snapshotter sidecar does not need to be added to a Pod. The difference is that a CSI-driver may choose to implement GroupSnapshots with a new version, and adding --enable-volume-group-snapshots is forgotten.

So, to summarize my opinion:

Adding detection for the capability is great! This also allows you to remove/deprecate the --enable-volume-group-snapshots parameter and simplify using the sidecar with a CSI-driver.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this PR improves the situation, and also with what you write. It is unfortunate that the --enable-volume-group-snapshots parameter is used, and there is no approach for dynamically detecting the support of the functionality.

According to csi spec, all the plugins must be implement the indentity service RPCs.

The GetPluginCapabilities RPC is one of them, so the external-snapshotter controller can query the GetPluginCapabilities RPC to check if the plugin supports the controller service and group controller service.

If the plugin supports the controller service, the external-snapshotter controller should check if the plugin supports the CREATE_DELETE_SNAPSHOT controller
capability via csirpc.GetControllerCapabilities. If not, the external-snapshotter controller should fail to start.

If the plugin supports the group controller service, the external-snapshotter controller should check if the plugin supports
the CREATE_DELETE_GET_VOLUME_GROUP_SNAPSHOT group controller capability via csirpc.GetGroupControllerCapabilities. If not,
the external-snapshotter controller won't enable volume group snapshot feature and set the enableVolumeGroupSnapshots to false.

If the --enable-volume-group-snapshots flag is specified to true and the result of auto-detection is unsupported, the external-snapshotter controller should fail to start. Considering the --enable-volume-group-snapshots flag is used, we can log the warning message to remind the user that the flag
will be deprecated in a future release. After 2 releases, we can remove the flag.

cc @nixpanic @xing-yang

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unlikely that we can deprecate the enable-volume-group-snapshots flag. While volume snapshot is a feature that nearly every CSI driver can support, volume group snapshot is an advanced functionality.

@carlory
Copy link
Member Author

carlory commented Jun 25, 2023

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 25, 2023
}
if !supportsCreateVolumeGroupSnapshot {
klog.Errorf("CSI driver %s does not support GroupControllerCreateVolumeGroupSnapshot", driverName)
os.Exit(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not exit. It is very likely that a CSI driver may combine the controller plugin and group controller plugin as one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xing-yang

If this parameter enable-volume-group-snapshots is specified as true, but during the verification process, it is found that the group snapshot is not supported by the csi-plugins , then how should we deal with it? Print a warning log or exit directly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Print a warning message but don't exit.

…en the --enable-volume-group-snapshots arg is true
@carlory
Copy link
Member Author

carlory commented Jul 26, 2023

@nixpanic @xing-yang please review again, thanks!

@xing-yang
Copy link
Collaborator

Please add a release note.

@xing-yang
Copy link
Collaborator

xing-yang commented Sep 4, 2023

/lgtm
/approve

/hold
Will cancel hold after release note is added.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 4, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 4, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carlory, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Sep 4, 2023
@carlory
Copy link
Member Author

carlory commented Sep 5, 2023

@xing-yang release note is added.

@xing-yang
Copy link
Collaborator

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit 737d5fc into kubernetes-csi:master Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants