
Migrate Trident volumes across the K8s cluster using Velero, PVC remains in pending status #957

JacieChao opened this issue Dec 16, 2024 · 2 comments

@JacieChao

Describe the bug
I want to migrate my Stateful applications from one cluster to another, and I used Velero with the CSI plugin to help migrate with Trident CSI volumes. After restoring the application on another cluster, the PVC remains in pending status.

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: v24.10.0
  • Trident installation flags used: helm install trident helm/trident-operator-100.2410.0.tgz -n trident
  • Container runtime: containerd v1.7.11-k3s2
  • Kubernetes version: v1.26.15
  • Kubernetes orchestrator: Rancher v2.8.7
  • OS: Ubuntu 22.04
  • NetApp backend types: ONTAP with AWS FSx

To Reproduce
Steps to reproduce the behavior:

  • Create a Stateful application on cluster A with Trident CSI.
  • Install Velero with CSI plugin on cluster A
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.10.0 \
--bucket ${s3-bucket-name} \
--backup-location-config region=${bucket-region} \
--snapshot-location-config region=${bucket-region} \
--features=EnableCSI \
--secret-file ./aws-credentials \
--kubeconfig cluster1.yaml
  • Backup this Stateful application under specified namespace by Velero.
  • Install Velero with CSI plugin on Cluster B
  • Restore with the backup on Cluster B by Velero
  • The PVC on cluster B remains in Pending status
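The backup and restore steps above can be sketched as follows. The backup name, restore name, and application namespace here are hypothetical placeholders, not values from the original report:

```shell
# Hypothetical names for illustration: backup "pg-backup",
# application namespace "postgresql".

# On cluster A: back up the namespace, including CSI volume snapshots
velero backup create pg-backup \
  --include-namespaces postgresql \
  --snapshot-volumes \
  --kubeconfig cluster1.yaml

# On cluster B (installed against the same S3 bucket): restore it
velero restore create pg-restore \
  --from-backup pg-backup \
  --kubeconfig cluster2.yaml

# Observe the restored PVC stuck in Pending
kubectl --kubeconfig cluster2.yaml -n postgresql get pvc
```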

Expected behavior
The PVC can be migrated across the clusters.

Additional context
After migration, I checked the VolumeSnapshot YAML that is the dataSource of the migrated PVC and found the error below:

failed to list snapshot for content velero-test-postgresql-1-jmc97-tcb7d: "rpc error: code = NotFound desc = snapshot content snapshot-53b6a703-68b5-4999-87e3-c2307f4ea932 not found;"

This Trident Snapshot CRD is not created automatically, so I migrated the corresponding Trident Snapshot CRD to Cluster B manually. Even after migrating it, tridentctl could not list the snapshot until I restarted the Trident controller pod.
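The manual snapshot-CRD migration looks roughly like this. The snapshot content name is taken from the error message above; the Trident namespace and controller deployment name are assumptions (the deployment may be named `trident-controller` or `trident-csi` depending on the install):

```shell
# Export the TridentSnapshot custom resource from cluster A
kubectl --kubeconfig cluster1.yaml -n trident \
  get tridentsnapshots.trident.netapp.io -o yaml > tsnap.yaml

# Apply it on cluster B
kubectl --kubeconfig cluster2.yaml apply -f tsnap.yaml

# The controller only sees the new object after a restart
kubectl --kubeconfig cluster2.yaml -n trident \
  rollout restart deploy/trident-controller
tridentctl get snapshot -n trident
```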

Then the PVC got the new error message with:

failed to list snapshot for content velero-test-postgresql-1-jmc97-tcb7d: "rpc error: code = Internal desc = volume pvc-ea2b8b61-c3a9-4c0b-bb62-e676f4e53c4d was not found"

The Trident Volume CRD also needs to be migrated to the new cluster, so I migrated the corresponding Trident Volume CRD to Cluster B manually. As with the snapshot, tridentctl could not list the volume until I restarted the Trident controller pod.
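The same pattern applies to the volume CRD; the volume name here is taken from the error message above, while the namespace and deployment name are the same assumptions as before:

```shell
# Export the TridentVolume custom resource from cluster A
kubectl --kubeconfig cluster1.yaml -n trident \
  get tridentvolumes.trident.netapp.io \
  pvc-ea2b8b61-c3a9-4c0b-bb62-e676f4e53c4d -o yaml > tvol.yaml

# Apply it on cluster B
kubectl --kubeconfig cluster2.yaml apply -f tvol.yaml

# Again, only visible to tridentctl after a controller restart
kubectl --kubeconfig cluster2.yaml -n trident \
  rollout restart deploy/trident-controller
tridentctl get volume -n trident
```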

Even with the same Trident backend configured on both clusters, I cannot migrate a stateful application with Trident volumes automatically. Moreover, tridentctl and the controllers only see the manually migrated Trident CRDs after a restart of the Trident controller; they do not pick up the CRD changes on their own.

I have used Velero to migrate stateful applications with other CSI drivers, and those CSI volumes were handled properly without manually migrating any CRDs.

So can Trident support migrating volumes across clusters? Or are there other guidelines I need to follow to make this work?

@JacieChao JacieChao added the bug label Dec 16, 2024
@wonderland

In case you only change K8s clusters but stay in the same region and on the same FSxN system, Trident's volume import capability might be the better choice. It doesn't require any data movement and brings the existing volume into another cluster as a PVC. Roughly like this:

  • Make sure PV has a retain policy so it stays around when removing the PVC
  • Delete the PVC on the source cluster to safely detach it and make sure it is no longer in use
  • Note the internal name of the volume on FSxN (csi.volumeAttributes.internalName of the PV)
  • Use tridentctl import on the destination cluster to import the existing volume as a PVC. Note that this will rename the volume in FSxN, which is another safeguard so the source cluster can no longer access it
  • Clean up the stale PV object on the source cluster by patching it to a Delete policy
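The steps above can be sketched as shell commands. The PV/PVC names and the backend name are placeholders, and the PVC manifest passed to `tridentctl import` is assumed to exist:

```shell
# 1. Retain the PV so deleting the PVC does not delete the data
kubectl --kubeconfig cluster1.yaml patch pv <pv-name> \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# 2. Delete the PVC on the source cluster to detach the volume
kubectl --kubeconfig cluster1.yaml delete pvc <pvc-name> -n <namespace>

# 3. Note the internal volume name on FSxN
kubectl --kubeconfig cluster1.yaml get pv <pv-name> \
  -o jsonpath='{.spec.csi.volumeAttributes.internalName}'

# 4. On the destination cluster, import the existing volume as a PVC
#    (pvc.yaml is a hypothetical PVC manifest for the imported volume)
tridentctl import volume <backend-name> <internal-volume-name> \
  -f pvc.yaml -n trident

# 5. Clean up the stale PV object on the source cluster
kubectl --kubeconfig cluster1.yaml patch pv <pv-name> \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
kubectl --kubeconfig cluster1.yaml delete pv <pv-name>
```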

See https://docs.netapp.com/us-en/trident/trident-use/vol-import.html for details.

You can also take a look at Trident protect, which can be used for migration use cases, see https://docs.netapp.com/us-en/trident/trident-protect/trident-protect-migrate-apps.html

In case you need to migrate between regions/FSxN systems, the Trident protect replication feature might be a good match as well: https://docs.netapp.com/us-en/trident/trident-protect/trident-protect-use-snapmirror-replication.html

@JacieChao
Author

@wonderland Thanks a lot.
I validated migrating the PVC with tridentctl import in my test environment, and it works well. I will try the other scenarios later and provide feedback.
