Releases: libopenstorage/stork
Release Stork v2.8.0
New Features
- A new driver, KDMP, has been added to stork for taking generic backups and restores of PVCs from any underlying storage provider. Currently this driver is only supported via Portworx PX-Backup.
Note: Generic backup/restore is not supported on k8s 1.22+. To deploy stork 2.8.0 on k8s 1.22+, make sure to disable the kdmp driver in the stork specs by adding the following argument: `kdmp-controller: false`.
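For reference, a minimal sketch of where this argument might sit in the stork Deployment spec. Only the `kdmp-controller: false` setting comes from the note above; the container layout, image tag, `--driver` flag, and the `--kdmp-controller=false` flag spelling are assumptions about how stork parses its arguments.

```yaml
# Hypothetical fragment of a stork Deployment spec; only the kdmp-controller
# argument is taken from the release note, the rest is illustrative.
spec:
  containers:
  - name: stork
    image: openstorage/stork:2.8.0
    command:
    - /stork
    - --driver=pxd                # existing driver flag, illustrative
    - --kdmp-controller=false     # disable the KDMP driver on k8s 1.22+
```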
Release Stork v2.7.0
New Features
- Stork now supports Kubernetes v1.22 and above.
- [Portworx Driver]: Add support for backups and restores of volumes from PX-Security enabled clusters. Stork will use the standard auth storage class parameters and annotations to determine which token to use for backing up and restoring Portworx volumes. (#895)
Improvements
- If a storage provider does not have the concept of replica nodes for a volume, do not fail the filter request while scheduling pods that use that provider's volumes. (#897)
- `storkctl clusterdomain` commands would silently fail if the provided input domain was invalid. storkctl now fails the operation if the input cluster domain is not one of the stork-detected cluster domains. (#898)
- [Portworx Driver] Carry over the auth-related annotations from MigrationSchedules to the actual Migration objects. (#899)
- Add short name support for fetching stork VolumeSnapshot and VolumeSnapshotData CRs. The following shorthand notations can be used to fetch snapshot and snapshot data objects (#907):
```
kubectl get stork-volumesnapshot
kubectl get svs
kubectl get stork-volumesnapshotdata
kubectl get svsd
```
Bug Fixes
- Issue: In-place VolumeSnapshotRestore issued a "force" delete of the pods using the PVC that needs to be restored. The "force" delete of the pods gave a false indication that the PVC was not being used by any apps, and the subsequent restore would then fail since the volume was still in use.
  User Impact: In-place VolumeSnapshotRestore would fail intermittently.
  Resolution: Perform a regular delete of the pods and wait for pod deletion while performing an in-place VolumeSnapshotRestore. (#878)
- Issue: Backups and/or restores of namespaces with both CSI and non-CSI PVCs would fail: the non-CSI PVCs would finish their backup first, while the CSI PVCs took longer. The successful non-CSI PVC backups would update the Backup CR status to successful, causing the CSI drivers to prematurely fail their backup.
  User Impact: Backups of namespaces that had both CSI and non-CSI PVCs would fail.
  Resolution: When a backup involves two different drivers, make sure each driver handles only its own PVCs. (#885) (#892) (#894) (#901)
- Issue: Stork was not updating the finish timestamp on an ApplicationBackup CR in certain failed backup scenarios.
  User Impact: Prometheus metrics were not reported for failed backups.
  Resolution: Set the finish timestamp even when an application backup fails. (#896)
- Issue: Stork would ignore the replace policy and always update the namespace metadata on ApplicationRestore.
  User Impact: Even if the replace policy on an ApplicationRestore was set to retain, Stork would override the annotations on a namespace.
  Resolution: On ApplicationRestore, replace the namespace metadata only if the replace policy is not retain. (#896)
- Issue: The `storkctl activate/update` command would show an incorrect message on a successful operation.
  User Impact: The `storkctl activate/update` command would show an incorrect message saying it had set the MongoDB CR's spec.Replicas field to `false` when it had actually set the value to `0`.
  Resolution: The `storkctl activate` command now shows a proper message based on the update it performed. (#896)
- Issue: CSI Portworx PVs have a volumeHandle field set to the volumeID, which could change if a failback operation is executed on the source cluster.
  User Impact: Applications using CSI PVCs on the source cluster could not start after a failback operation.
  Resolution: Fix the volumeHandle field of a CSI Portworx PV during failback migration. (#943)
- Issue: The Portworx driver in stork did not handle pod specifications that use Portworx PVs directly instead of PVCs.
  User Impact: Stork pods could hit a nil panic and restart when Portworx PVs were used directly in a pod specification.
  Resolution: Fix a nil panic in the Portworx driver's GetPodVolumes implementation. (#926)
- Issue: Backups would fail with older versions of stork on GKE 1.21 clusters, which use a new label for the zone failure domain on GCE PDs.
  User Impact: Backups would fail with older versions of stork on GKE 1.21 clusters.
  Resolution: Use the new label for the zone failure domain while handling GCE PDs. (#930)
- Issue: PX-Security annotations on MigrationSchedule objects were not propagated to the respective Migration objects.
  User Impact: Migrations triggered as part of a MigrationSchedule would fail on a PX-Security enabled Portworx cluster.
  Resolution: Add the annotations from the MigrationSchedule to the respective Migration object. (#899)
- Issue: The PVC UID mappings were not updated after a Migration, causing CSI PVCs to stay in Unbound state.
  User Impact: Portworx CSI PVCs would stay in Unbound state after a Migration.
  Resolution: Update the PVC UID mapping while migrating PV objects. (#919)
- Issue: Stork would hit a nil panic while handling an ApplicationRestore object if the source namespace had no labels.
  User Impact: ApplicationRestores would time out if namespaces had no labels.
  Resolution: Stork now handles empty labels on a namespace when performing ApplicationRestores. (#918)
- Issue: Stork was not able to handle in-place volume snapshot restore when the Portworx driver was initializing or unable to handle restore requests.
  User Impact: VolumeSnapshotRestore would fail if the Portworx driver was temporarily unhealthy or initializing.
  Resolution: Stork now waits for the Portworx driver to be up and healthy, handles `resource temporarily unavailable` errors from Portworx, and does not fail the restore request. (#875)
Docker Hub Image: openstorage/stork:2.7.0
Release Stork v2.6.5
Bug Fixes
- Issue: In-place VolumeSnapshotRestore issued a "force" delete of the pods using the PVC that needs to be restored. The "force" delete of the pods gave a false indication that the PVC was not being used by any apps, and the subsequent restore would then fail since the volume was still in use.
  User Impact: In-place VolumeSnapshotRestore would fail intermittently.
  Resolution: Perform a regular delete of the pods and wait for pod deletion while performing an in-place VolumeSnapshotRestore. Perform a force delete only if the regular pod deletion fails. (#878)
- Issue: On certain cloud providers, the ports and IPs associated with a Kubernetes service are not released immediately after the service is deleted. This caused migrations of service objects to fail, since service objects are deleted and recreated as part of a migration.
  User Impact: Migrations would fail intermittently.
  Resolution: Stork will not recreate a service on the destination/DR cluster if it has not changed on the primary cluster during a migration. (#874)
Release Stork v2.6.4
New Features
- Added a storkctl command to create a bidirectional cluster-pair on the source and destination clusters. (#787)
```
storkctl create clusterpair testpair -n kube-system --src-kube-file <src-kubeconfig> --dest-kube-file <dest-kubeconfig> --src-ip <src_ip> --dest-ip <dest_ip> --src-token <src_token> --dest-token <dest_token>
ClusterPair testpair created successfully on source cluster
ClusterPair testpair created successfully on destination cluster
```
- Added NamespacedSchedulePolicy, which can be used in all the Schedule objects. All the schedules will first try to find a policy in their namespace; if it doesn't exist, they fall back to the cluster-scoped policy (#832). See the sketch after this list.
- Added support for doing custom resource selection per namespace when backing up multiple namespaces as part of a single ApplicationBackup. (#848)
- Added support for backup and migration of MongoDB Community CRs. (#856)
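As a reference for the NamespacedSchedulePolicy feature above, here is a minimal sketch of what such a policy might look like, assuming the stork `v1alpha1` API group and a SchedulePolicy-style `policy` block; the names and the daily time are illustrative.

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: NamespacedSchedulePolicy
metadata:
  name: daily-policy      # illustrative name
  namespace: demo         # the policy is scoped to this namespace
policy:
  daily:
    time: "10:14PM"       # illustrative schedule
```

A Schedule object in the `demo` namespace that references `daily-policy` would pick up this policy first, before looking for a cluster-scoped SchedulePolicy of the same name.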
Improvements
- Do not fail application backups when the storage provider returns a busy error code; instead, back off and retry the operation after some time. (#847)
- Added support for backing up NetworkPolicy and PodDisruptionBudget objects. (#841)
- While applying ServiceAccount resources as part of an ApplicationRestore or Migration, merge the annotations from the source and destination. (#844) (#858)
- Use the annotation `stork.libopenstorage.org/skipSchedulerScoring` on PVCs whose replicas should not be considered while scoring nodes in the stork scheduler (#846). See the sketch after this list.
- ApplicationClone now sets the `ports` field of a service to nil/empty before cloning it to the destination namespace. (#870)
- Create a ConfigMap with details about the currently running stork version. (#855)
- Add an option `skipServiceUpdate` to skip updating service objects during ApplicationBackups. (#844)
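As a reference for the `skipSchedulerScoring` annotation above, a minimal PVC sketch; the annotation key comes from the release note, while the `"true"` value, PVC name, and sizing are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-pvc       # illustrative name
  annotations:
    # Tell the stork scheduler not to consider this PVC's replicas when scoring nodes.
    stork.libopenstorage.org/skipSchedulerScoring: "true"   # value assumed
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```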
Bug Fixes
- Issue: During migrations, if the size of a PVC on the source cluster changed, the change was not reflected on the corresponding PVC on the target cluster.
  User Impact: Target PVCs showed an incorrect size even if the backing volume had the updated size.
  Resolution: In every migration cycle, the PVC on the target cluster now gets updated. (#835)
- Issue: When static IPs were used in Kubernetes service objects, stork would clear out the IP field from the service object after migration.
  User Impact: On ApplicationBackup/Restore of service objects that use static IPs, the service objects would lose the IP on restore.
  Resolution: The ApplicationBackup CR now has a new boolean field `skipServiceUpdate` that can be used to instruct stork to skip any Kubernetes service object processing. If set to `true`, stork will migrate the service resource as-is to the target cluster (#844). See the sketch after this list.
- Issue: ApplicationClones were failing if the target namespace already exists.
  User Impact: ApplicationClones would fail when the target namespace where the applications need to be cloned already exists.
  Resolution: Ignore the AlreadyExists error when checking target namespaces instead of marking the ApplicationClone failed. (#834)
- Issue: The stork webhook controller would not work since the associated certificate did not have a valid SAN set.
  User Impact: With the stork webhook enabled, stork would not automatically set the `schedulerName` on pods using PVCs of stork-supported storage drivers.
  Resolution: Stork now creates a valid cert and also updates certs that were already created. The Kubernetes API server will recognize the stork webhook controller's cert as valid and forward the webhook requests. (#859)
- Issue: The `storkctl activate` command would panic with the error `cannot deep copy int32`.
  User Impact: `storkctl activate` would fail to failover the applications on a target cluster.
  Resolution: The `storkctl activate` command now uses the right replica field value while activating migrations. (#862)
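As a reference for the `skipServiceUpdate` fix above, a minimal ApplicationBackup sketch; the field name comes from the release note, while the object name, namespace, and backup location are illustrative.

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: ApplicationBackup
metadata:
  name: backup-static-ips              # illustrative name
  namespace: demo
spec:
  backupLocation: my-backup-location   # illustrative BackupLocation name
  namespaces:
  - demo
  skipServiceUpdate: true              # migrate service objects as-is, keeping static IPs
```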
Docker Hub Image: openstorage/stork:2.6.4
Release Stork v2.6.3
New Features
- Added support to watch newly registered CRDs and auto-create an `ApplicationRegistration` resource for them. (#792)
- Add suspend option support for the following new apps (#811):
  - perconadb
  - prometheus
  - rabbitmq
  - kafka (strimzi)
  - postgres (acid.zalan.do)
Improvements
- Added support for specifying ResourceType in ApplicationBackup that allows selecting specific resources while backing up a namespace. (#799)
- Support migrating PVC objects when the migration spec has application resource migration disabled. (#783)
- All stork Prometheus metrics now have a `stork_` prefix, e.g. `stork_application_backup_status`, `stork_migration_status`, `stork_hyperconverged_pods_total`. All existing metrics move over to the `stork_` prefix convention. (#816)
Bug Fixes
- Configure the side-effect parameter for the stork webhook. This allows running `kubectl --dry-run` commands when the stork webhook is enabled. (#802)
- Restores of certain applications would partially succeed in k8s 1.20 due to the addition of a new field ClusterIPs to Services. This new field is now handled in restores. (#800)
- Allow excluding resources from the GetResources API. This allows handling of ApplicationRestores on k8s 1.20 and above, where certain resources like the kube-root-ca.crt ConfigMap are always created when a new namespace is created. (#798)
- Fixed an issue with failed snapshots being retried. Snapshots now print the correct error state. (#786)
Docker Hub Image: openstorage/stork:2.6.3
Release Stork v2.6.0
New Features
- Added the ability to backup and restore CSI volumes (#697).
- Added Prometheus metrics and Grafana dashboards for the scheduler extender and health monitor (#710).
Improvements
- Added a mechanism to change the log level during runtime. You can do this by writing the log level to `/tmp/loglevel` and then sending SIGUSR1 to the process (#735). See the sketch after this list.
- Pruned migrations that are in `PartialSuccess` state. Previously, Stork retained all migrations in this state, causing the `MigrationSchedule` object to become very large (#742).
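As a reference for the runtime log-level change above, a minimal shell sketch run inside the stork container; the `debug` level string and the assumption that the stork process runs as PID 1 are illustrative.

```
# Inside the stork container: write the desired level, then signal the process.
echo debug > /tmp/loglevel    # level name assumed to follow logrus conventions
kill -USR1 1                  # assumes the stork process is PID 1
```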
Bug Fixes
- Stork now triggers volume backups in batches of 10 and retries update failures due to conflicts. This behavior prevents issues where a backup is triggered, but Stork fails to update the CR (#707).
- Added support for disabling the cronjob object upon migration to a remote cluster. You can now also activate/deactivate a cronjob object using the `storkctl activate/deactivate migration <migration_namespace>` command (#731).
- Stork now fetches ApplicationRegistration objects only once when preparing resources. Previously, Stork made multiple calls to the kube-apiserver, causing delays during migration and backup (#732).
- Fixed issue where cluster-scoped CRs were being collected for migration and backup (#732).
- Fixed the volume size for EBS backups, which were previously incorrectly reported in GiB (#733).
Docker Hub Image: openstorage/stork:2.6.0
Release Stork v2.5.0
New Features
- Added support for specifying `*` as a wildcard to select all namespaces in the `ApplicationBackup` and `ApplicationBackupSchedule` specs (#693). See the sketch after this list.
- Added support for specifying which resources to restore using `ApplicationRestore`. By default, all resources in the namespace will be restored (#652).
- Added support for specifying which resources to back up using `ApplicationBackup`. By default, all resources in the namespace will be backed up (#706).
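As a reference for the wildcard selection above, a minimal ApplicationBackup sketch; the `*` wildcard in `namespaces` comes from the release note, while the object name and backup location are illustrative.

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: ApplicationBackup
metadata:
  name: backup-everything              # illustrative name
  namespace: kube-system
spec:
  backupLocation: my-backup-location   # illustrative BackupLocation name
  namespaces:
  - "*"                                # select all namespaces
```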
Improvements
- Added `Actual Size` and `Total Size` to `ApplicationBackups` and `ApplicationRestores` (#645, #655, #702).
- Added a `--watch/-w` option to `storkctl` to watch for resource updates (#669).
- Added a `validate` subcommand to `storkctl` to validate `Migration` specs (#669).
- Added `--storageoptions/-s` to `storkctl generate clusterpair`, which can be used to pass in `key=value` pairs for the storage options in the `ClusterPair` spec (#669).
- Retain namespace annotations during `ApplicationRestore`, `ApplicationClone` and `Migration` (#683).
- Added the container name field in the `Rule` spec. You must use this field if you have multiple containers in a pod and want to select the container to run the command in (#693).
- Added support to collect `LimitRange` for `Migrations`, `ApplicationBackups`, and `ApplicationClones` (#694).
- For AWS S3, if `s3.amazonaws.com` is passed in as the endpoint in `BackupLocation`, it'll be updated to the region-specific endpoint automatically (#699).
- `BackupLocation` can now be passed in as an option when creating a `ClusterPair`. You can use this to specify an objectstore for Stork to use during `Migration`. Stork will create the `BackupLocation` on the destination cluster if not present (#704).
- Stork now detects all registered CRDs and creates an `ApplicationRegistration` for them (#705).
- Portworx: You can now delete local snapshots by passing the `portworx.io/cloudsnap-delete-local-snap: true` annotation to `Snapshot` and `ApplicationBackup` objects (#711); see the sketch below.
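As a reference for the local-snapshot deletion annotation above, a minimal VolumeSnapshot sketch; the annotation key and value come from the release note, while the snapshot name, namespace, PVC name, the cloud snapshot-type annotation, and the external-storage snapshot API group are assumptions based on stork's snapshot CRDs.

```yaml
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-cloudsnap                     # illustrative name
  namespace: demo
  annotations:
    portworx.io/snapshot-type: cloud                  # assumed: take a cloud snapshot
    portworx.io/cloudsnap-delete-local-snap: "true"   # delete the local snapshot afterward
spec:
  persistentVolumeClaimName: mysql-data     # illustrative PVC name
```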
Bug Fixes
- Added a check for `NotFound` when restoring AWS, Azure, and GCP disks (#696).
- When running an `ApplicationRestore`, Stork now only creates namespaces that are being restored (#700).
- Stork now skips the namespace check for an `ApplicationBackup` if it is in the `Final` stage (#700).
- Fixed an issue in `storkctl` with suspending/resuming multiple `MigrationSchedules` (#721).
- During `Migration`, Stork now waits for objects to be deleted before creating them again (#721).
Docker Hub Image: openstorage/stork:2.5.0
Release Stork v2.4.5
Bug Fixes
- Portworx: Cluster management APIs have been updated to use auth tokens when PX-Security is enabled. This is required when using PX-Enterprise v2.6.0 with auth enabled (#647)
Docker Hub Image: openstorage/stork:2.4.5
Release Stork v2.4.4
Improvements
- Added support to specify string options in ApplicationRegistration for scaling applications up/down (#692)
Docker Hub Image: openstorage/stork:2.4.4
Release Stork v2.4.3
Improvements
- Added support to specify CRDs that should be collected for Migration/ApplicationClone/ApplicationBackup. These can be registered using ApplicationRegistration (#653, #654)
Bug Fixes
- Fixed an issue in the `ApplicationBackup` controller where updates to objects would cause conflicts (#650)
- PVCs being deleted or in `Pending` state will now be ignored during `ApplicationBackup` (#665)
- ResourceQuota objects are now collected during `Migration`, `ApplicationClone` and `ApplicationBackup` (#671)
- Fixed an issue when restoring or cloning `RoleBinding` where the namespace would not get mapped to the destination namespace (#671)
- Fixed an issue where the Pre/Post Exec Rules were not being passed in from `VolumeSnapshotSchedule` to `VolumeSnapshot` (#675)
- Fixed an issue during `ApplicationRestore` where resources to be replaced were being auto-created by another controller. We now ignore such objects if they were created after we deleted them for the restore (#674)